Multiword Expressions and Compositionality Detection: Giving Word Embeddings a Hard Time

If you have a question about this talk, please contact Mohammad Taher Pilehvar.

In this talk I start with an overview of multiword expressions (MWEs), such as compound nouns and verb-particle constructions, which have proved a challenge for computational analysis. These expressions need to be treated as a unit at some level of linguistic description. In particular, they display a wide range of compositionality, from more compositional cases like "police car" to more idiomatic MWEs like "kick the bucket". Models for representing words and MWEs in semantic space, and their ability to capture compositionality and idiomaticity, will be compared for three languages: English, French and Portuguese. The impact of factors such as the degree of corpus pre-processing and the context size on the performance of these models will be discussed. I also discuss the findings of a large-scale multilingual evaluation of distributional semantic models (DSMs) for predicting the degree of semantic compositionality of nominal compounds on four datasets for English and French.

This talk is part of the Language Technology Lab Seminars series.
