Exponential Family Embeddings


SNAW01 - Graph limits and statistics

Word embeddings are a powerful approach for capturing semantic similarity among terms in a vocabulary. In this talk, I will describe exponential family embeddings (EF-EMB), a class of methods that extends the idea of word embeddings to other types of high-dimensional data. As examples, we studied neural data with real-valued observations, count data from a market basket analysis, and ratings data from a movie recommendation system. We then extended the idea to networks, yielding a novel type of "latent space" model.

The main idea behind an EF-EMB is to model each observation conditioned on a set of other observations. This set is called the context, and how the context is defined is a modeling choice that depends on the problem. In language, the context is the surrounding words; in neuroscience, it is nearby neurons; in market basket data, it is the other items in the shopping cart; in networks, it is the edges emanating from a node pair. Each type of embedding model defines the context, the exponential family of conditional distributions, and how the latent embedding vectors are shared across the data. We infer the embeddings with a scalable algorithm based on stochastic gradient descent.

We found exponential family embedding models to be more effective than other kinds of dimension reduction: they better reconstruct held-out data and find interesting qualitative structure.
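To make the conditional modeling idea concrete, here is a minimal sketch of a Poisson exponential family embedding for market-basket counts, trained by stochastic gradient ascent on the conditional log-likelihood. It assumes the general recipe in the abstract (each item's count modeled given the other items in its basket, with per-item embedding and context vectors); all names, hyperparameters, and the toy data are illustrative, not the speaker's implementation.

```python
# Hypothetical sketch: Poisson exponential family embedding for basket counts.
import numpy as np

rng = np.random.default_rng(0)

n_items, n_baskets, dim = 50, 200, 8
# Toy basket-by-item count matrix (stand-in for real purchase data).
X = rng.poisson(lam=0.3, size=(n_baskets, n_items)).astype(float)

# Two latent vectors per item, as in embedding models:
# rho (embedding) and alpha (context) vectors.
rho = 0.1 * rng.standard_normal((n_items, dim))
alpha = 0.1 * rng.standard_normal((n_items, dim))

lr = 1e-3
for epoch in range(20):
    for b in rng.permutation(n_baskets):  # stochastic pass over baskets
        x = X[b]
        # Context for item i: sum of alpha vectors of the *other* items
        # in the basket, weighted by their counts.
        total = x @ alpha                          # (dim,)
        ctx = total[None, :] - x[:, None] * alpha  # (n_items, dim)
        eta = np.sum(rho * ctx, axis=1)            # natural parameter per item
        lam = np.exp(np.clip(eta, -10, 10))        # Poisson rate = exp(eta)
        # Poisson log-likelihood gradient wrt eta is (x - lam).
        g = (x - lam)[:, None]
        grad_rho = g * ctx
        # d eta_i / d alpha_j = x_j * rho_i for i != j; aggregate over i.
        grad_alpha = x[:, None] * ((g * rho).sum(axis=0)[None, :] - g * rho)
        rho += lr * grad_rho
        alpha += lr * grad_alpha

# Embedding similarity between two items, for qualitative inspection.
print(rho[0] @ rho[1])
```

Swapping the Poisson link and gradient for a Gaussian or Bernoulli one, and the basket for a window of words, nearby neurons, or node-pair edges, gives the other instances of the family described above; only the context definition and the conditional distribution change.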

This talk is part of the Isaac Newton Institute Seminar Series.
