This thesis presents a collection of statistical models that attempt to take advantage of every piece of prior knowledge available to provide the models with as much structure as possible. The main motivation for introducing these models is interpretability since in practice we want to be able to use them as hypothesis generating tools. All of our models start from a family of structures, for instance factor models, directed acyclic graphs, classifiers, etc. Then we let them be selectively sparse as a way to provide them with structural fl exibility and interpretability. Finally, we complement them with different prior assumptions in order to make them appropriate at handling different domain specific situations as time series, non-linearities, batch effects, missing values, etc. In particular, we present a framework for linear Bayesian networks we call sparse identifiable multivariate modeling, a model for peptide-protein/protein-protein interactions called latent protein tree, a framework for sparse Gaussian process classification based on active set selection and a linear multi-category sparse classifier specially targeted to gene expression data. The thesis is organized to provide a general yet self-contained description of every model in terms of generative assumptions, interpretability goals, probabilistic formulation and target applications. Case studies, benchmark results and practical details are also provided as appendices published elsewhere, containing reprints of peer reviewed material.