
Advances in Bayesian Latent Factor Modelling: The Incredible Shrinking Model


If you have a question about this talk, please contact R.B.Gramacy.

Recent advances in the methodology and application of Bayesian latent factor models are discussed. Sparse latent factor modelling, in which the now-standard latent factor modelling framework is coupled with so-called sparsity, or variable selection, prior distributions on regression parameters, has found significant application in the analysis of data falling under the "large p, small n" heading. This includes applications in genomics and finance, for which a primary goal is to characterize variation in a very high-dimensional response in terms of a concise set of factors with sparse loadings. A useful consequence of sparsity in factor loadings is the increased opportunity for ascribing specific scientific interpretation to the latent factors. For example, analysis of gene expression microarrays in cancer studies has uncovered latent factors that have been demonstrated to be useful for predicting certain clinical outcomes. The loadings associated with such factors therefore represent signatures, in response-space, of those outcomes.
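To make the sparse-loadings idea concrete, here is a minimal, self-contained sketch (not the speaker's implementation) that simulates "large p, small n" data from a factor model whose loadings are drawn from a spike-and-slab prior. The function names and the inclusion probability `pi` are illustrative assumptions, not details from the talk.

```python
import random

random.seed(0)

def sample_sparse_loadings(p, k, pi=0.2, slab_sd=1.0):
    """Spike-and-slab draw: each loading is exactly 0 with probability
    1 - pi (the 'spike'), otherwise Normal(0, slab_sd^2) (the 'slab')."""
    return [[random.gauss(0.0, slab_sd) if random.random() < pi else 0.0
             for _ in range(k)]
            for _ in range(p)]

def simulate(n, p, k, noise_sd=0.5):
    """Generate x_i = B f_i + e_i, with a sparse p-by-k loadings matrix B,
    standard-normal factors f_i, and Gaussian noise e_i."""
    B = sample_sparse_loadings(p, k)
    X = []
    for _ in range(n):
        f = [random.gauss(0.0, 1.0) for _ in range(k)]
        x = [sum(B[j][l] * f[l] for l in range(k)) + random.gauss(0.0, noise_sd)
             for j in range(p)]
        X.append(x)
    return B, X

# "Large p, small n": 200 response dimensions, 30 samples, 3 factors.
B, X = simulate(n=30, p=200, k=3)
sparsity = sum(b == 0.0 for row in B for b in row) / (200 * 3)
```

Because each loading is zero with high prior probability, most entries of `B` vanish exactly, which is what lets individual factors be read off against a small set of responses.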

To the extent that signatures with meaningful interpretations are thought to represent underlying structural features of the system under study, such as groups of genes whose patterns of co-expression bear causal relationships with known phenotypes, prior beliefs about latent factor structure in new data should in many cases be informed by posterior inferences drawn previously from similar data. We extend the current class of sparse latent factor models to include informative variable selection priors on both latent factor loadings and latent factors. Through this approach, previously inferred signatures are projected onto, and refined by, new data. Examples from cancer genomics demonstrate how this targeted latent factor search improves the scientific interpretability of the model and allows a direct query of the contribution of a known signature towards explaining variation in new data.
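One simple way to picture the informative prior described above is to raise the prior inclusion probability of variables that belonged to a previously inferred signature, so the targeted search starts from that signature and lets the new data refine it. The sketch below is hypothetical (the function and the `base`/`boost` parameters are illustrative, not from the talk), but it conveys the mechanism:

```python
def informative_inclusion_probs(p, signature_idx, base=0.05, boost=0.8):
    """Element-wise prior inclusion probabilities for one factor's loadings:
    variables in a previously inferred signature get a high prior probability
    of a nonzero loading; all others keep a small baseline probability."""
    sig = set(signature_idx)
    return [boost if j in sig else base for j in range(p)]

# Suppose a previous study flagged variables 1, 4, and 7 as the signature.
probs = informative_inclusion_probs(10, signature_idx=[1, 4, 7])
```

Under such a prior, the signature variables are strongly favoured for inclusion but can still be dropped, and non-signature variables can still enter, so the signature is refined rather than fixed.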

More generally, this methodology defines a flexible class of factor models that, through the model-fitting process, "shrinks" to the minimum number of covariates and latent factors supported by the data. In order to achieve total shrinkage, we describe a new type of nonparametric variable selection prior for the variance components of the model. The prior incorporates a hierarchical Dirichlet process to induce clustering of variance parameters within sample groups, while borrowing information across groups to estimate mixture components. When a control group is present, the prior for non-control group variances places additional probability on a point mass located at the control variance. In this way, variances are clustered both within and across sample groups, yielding a parsimonious representation of the degree of heteroscedasticity in the data. Hence, the incredible shrinking model.
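The variance prior can be pictured with a prior-predictive simulation. The following is a minimal sketch for a single non-control group, assuming a truncated stick-breaking construction for the Dirichlet process and a Bernoulli point mass at the control variance; the atom distribution and all parameter names (`rho`, `alpha`, the truncation level) are illustrative assumptions, and the hierarchical sharing of atoms across groups is omitted for brevity.

```python
import random

random.seed(1)

def stick_breaking(alpha, truncation=20):
    """Truncated stick-breaking weights for a Dirichlet process prior."""
    weights, remaining = [], 1.0
    for _ in range(truncation):
        v = random.betavariate(1.0, alpha)
        weights.append(remaining * v)
        remaining *= (1.0 - v)
    weights.append(remaining)  # leftover mass on the final stick
    return weights

def draw_group_variances(p, control_var, rho=0.5, alpha=1.0):
    """Each variable's variance: with probability rho it equals the control
    variance exactly (the point mass); otherwise it is drawn from a DP
    mixture over a shared set of atoms, which induces ties (clustering)
    among the remaining variances."""
    weights = stick_breaking(alpha)
    atoms = [random.lognormvariate(0.0, 1.0) for _ in range(len(weights))]
    draws = []
    for _ in range(p):
        if random.random() < rho:
            draws.append(control_var)
        else:
            draws.append(random.choices(atoms, weights=weights)[0])
    return draws

vs = draw_group_variances(p=100, control_var=1.0)
n_unique = len(set(vs))
```

Because many variances coincide exactly, either with the control variance or with a shared atom, the effective number of distinct variance parameters is far smaller than the number of variables, which is the parsimonious representation of heteroscedasticity described above.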

This talk is part of the Statistics series.



© 2006-2017, University of Cambridge.