Challenges in evaluating natural language generation systems
- Speaker: Mohit Iyyer (University of Massachusetts Amherst)
- Date & Time: Friday 11 June 2021, 13:00 - 14:00
- Venue: Virtual (Zoom)
Abstract
Join Zoom Meeting https://cl-cam-ac-uk.zoom.us/j/91900396241?pwd=Wk5mcDYrUytkSElkMHB0T3NkNkRFQT09
Meeting ID: 919 0039 6241 Passcode: 127570
Recent advances in neural language modeling have opened up a variety of exciting new text generation applications. However, evaluating systems built for these tasks remains difficult. Most prior work relies on a combination of automatic metrics such as BLEU (which are often uninformative) and crowdsourced human evaluation (which is also usually uninformative, especially when conducted without careful task design). In this talk, I focus on two specific applications: (1) unsupervised sentence-level style transfer and (2) long-form question answering. I will go over our recent work on building models for these tasks and then describe the ensuing struggles to properly compare them to baselines. In both cases, we identify (and propose solutions for) issues with existing evaluations, including improper aggregation of multiple metrics, missing control experiments with simple baselines, and high cognitive load placed on human evaluators. I’ll conclude by briefly discussing our work on machine-in-the-loop text generation systems, in which both humans and machines participate in the generation process and reliable human evaluation becomes much more feasible.
Series
This talk is part of the NLIP Seminar Series.
Included in Lists
- All Talks (aka the CURE list)
- bld31
- Cambridge Centre for Data-Driven Discovery (C2D3)
- Cambridge Forum of Science and Humanities
- Cambridge Language Sciences
- Cambridge talks
- Chris Davis' list
- Computer Education Research
- Computing Education Research
- Department of Computer Science and Technology talks and seminars
- Graduate-Seminars
- Guy Emerson's list
- Interested Talks
- Language Sciences for Graduate Students
- ndk22's list
- NLIP Seminar Series
- ob366-ai4er
- PMRFPS's
- rp587
- School of Technology
- Simon Baker's List
- Trust & Technology Initiative - interesting events
- Virtual (Zoom)
- yk449
Note: Ex-directory lists are not shown.