General-Purpose Representation Learning from Words to Sentences

If you have a question about this talk, please contact Kris Cao.


Real-valued vector representations of words (also known as embeddings), trained on naturally occurring data by optimising general-purpose objectives, are useful for a range of downstream language tasks. However, the picture is less clear for larger linguistic units such as phrases or sentences. Phrases and sentences typically encode the facts and propositions that constitute the ‘general knowledge’ missing from many NLP systems at present, so the potential benefit of making representation learning work for these units is huge. I will present a systematic comparison of different ways of inducing such representations with neural language models from unlabelled data. The study demonstrates clear and interesting differences between the representations learned by different methods; in particular, more elaborate or computationally expensive methods are not necessarily best. I’ll also discuss a key challenge facing all research in unsupervised or representation learning for NLP: the lack of robust evaluations.

This talk is part of the NLIP Seminar Series.


© 2006-2021, University of Cambridge.