
Doubt thy models: rethinking hypothesis testing in NLP


If you have a question about this talk, please contact James Thorne.

Rescheduled

Recent years have seen the rise of machine learning models in NLP research, applied, inter alia, to questions motivated by linguistic theory. Indeed, it has become relatively easy to model and test research problems. This ease of deployment, however, carries the risk of careless use, which may lead to unreliable findings and ultimately hinder our ability to extend our knowledge. Such misuse may stem, for example, from unfamiliarity with the assumptions and hypotheses implicit in the models, or from inherent confounds that demand experimental controls. In this talk, I will focus on problems specific to linguistically motivated questions (e.g., semantic change), but also on problems arising in classical NLP research more generally (e.g., polysemy resolution and representation), where word embeddings are the prominent ML models. Major problems include biases induced by word frequency, similarity estimation over noisy word vector representations, and the evaluation of model performance in the absence of properly validated evaluation tasks. I will suggest ways to mitigate some of these problems and share ideas about performing valid scientific research in the age of all-too-easy modeling.

This talk is part of the NLIP Seminar Series.


© 2006-2019 Talks.cam, University of Cambridge.