Bayesian nonparametric latent feature models
- Speaker: Zoubin Ghahramani
- Date & Time: Wednesday 02 May 2007, 14:00 - 15:00
- Venue: TCM Seminar Room, Cavendish Laboratory, Department of Physics
Abstract
Latent variables are an important component of many statistical models. Most latent variable models, such as mixture models, factor analysis, and independent components analysis (ICA), assume a finite, usually small number of latent variables. However, it may be statistically undesirable to constrain the number of latent variables a priori. Here we show how a more flexible nonparametric approach is possible in which the number of latent variables is unbounded. To do this, we describe a probability distribution over equivalence classes of binary matrices with a finite number of rows, corresponding to the data points, and an unbounded number of columns, corresponding to the latent variables. Each data point can be associated with a subset of the possible latent variables, which we refer to as the latent features of that data point. The binary variables in the matrix indicate which latent feature is possessed by which data point, and there is a potentially infinite array of features. We derive the distribution over unbounded binary matrices by taking the limit of a distribution over $N \times K$ binary matrices as $K \rightarrow \infty$, a strategy inspired by the derivation of the Chinese restaurant process (Aldous, 1985; Pitman, 2002) which preserves exchangeability of the rows. We define a simple generative process for this distribution which we call the Indian buffet process (IBP; Griffiths and Ghahramani, 2005). We describe recent extensions of this model, Markov chain Monte Carlo algorithms for inference, and a number of applications to collaborative filtering, independent components analysis, bioinformatics, cognitive modelling, and causal discovery.
Joint work with Thomas L. Griffiths (UC Berkeley).
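The abstract does not spell out the generative process, so the following is only a minimal sketch of the Indian buffet process as it is usually described: data point $i$ takes each existing feature $k$ with probability $m_k / i$, where $m_k$ is the number of earlier points that have the feature, and then adds a Poisson($\alpha / i$) number of new features. The concentration parameter `alpha` and the function names `sample_poisson` and `sample_ibp` are assumptions made for illustration and do not appear in the talk description.

```python
import math
import random


def sample_poisson(rate, rng):
    """Draw from Poisson(rate) with Knuth's multiplication method (fine for small rates)."""
    threshold = math.exp(-rate)
    count, product = 0, rng.random()
    while product > threshold:
        count += 1
        product *= rng.random()
    return count


def sample_ibp(num_points, alpha, seed=None):
    """Return a binary matrix (list of equal-length rows) drawn from the IBP.

    Row i is a data point; column k is a latent feature.
    `alpha` (hypothetical name) controls how many features appear.
    """
    rng = random.Random(seed)
    rows = []
    feature_counts = []  # m_k: how many points so far possess feature k
    for i in range(1, num_points + 1):
        # Take each existing feature k with probability m_k / i.
        row = [1 if rng.random() < m / i else 0 for m in feature_counts]
        # Then add Poisson(alpha / i) features that no earlier point has.
        num_new = sample_poisson(alpha / i, rng)
        row.extend([1] * num_new)
        # Update counts for old features and register the new ones.
        feature_counts = [m + z for m, z in zip(feature_counts, row)] + [1] * num_new
        rows.append(row)
    # Pad earlier rows with zeros so the matrix is rectangular.
    total = len(feature_counts)
    return [row + [0] * (total - len(row)) for row in rows]


if __name__ == "__main__":
    for row in sample_ibp(num_points=6, alpha=2.0, seed=1):
        print("".join(str(z) for z in row))
```

The number of columns is not fixed in advance: it grows with the number of data points, which is the sense in which the number of latent features is unbounded. This sketch produces one matrix in the order features were introduced; it does not implement the equivalence classes or the inference algorithms mentioned in the abstract.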
Series
This talk is part of the Inference Group series.
Included in Lists
- All Cavendish Laboratory Seminars
- All Talks (aka the CURE list)
- Biology
- Cambridge Neuroscience Seminars
- Cambridge talks
- Centre for Health Leadership and Enterprise
- Chris Davis' list
- dh539
- Featured lists
- Guy Emerson's list
- Hanchen DaDaDash
- Inference Group
- Inference Group Summary
- Interested Talks
- Joint Machine Learning Seminars
- Life Science
- Life Sciences
- Machine Learning Summary
- ME Seminar
- ML
- Neurons, Fake News, DNA and your iPhone: The Mathematics of Information
- Neuroscience
- Neuroscience Seminars
- Required lists for MLG
- rp587
- School of Physical Sciences
- Stem Cells & Regenerative Medicine
- TCM Seminar Room, Cavendish Laboratory, Department of Physics
- Thin Film Magnetic Talks
- yk373's list
Note: Ex-directory lists are not shown.