Text-to-text Generation Beyond Machine Translation

If you have a question about this talk, please contact Amandla Mabona.

In recent years we have witnessed the achievements of sequence-to-sequence encoder-decoder models for machine translation. It is no surprise that these models are also setting a trend in various other generation tasks such as dialogue generation, image caption generation, sentence compression, paraphrase generation, sentence simplification and document summarization. Yet, despite their impressive results, these deep learning sequence models are often applied off-the-shelf to these text-to-text generation tasks.

In this talk I will discuss two examples, sentence simplification and document summarization, that explore the hypothesis that tailoring the model with knowledge of the task structure and linguistic requirements leads to better performance. In the first part, I will propose a new sentence simplification task (split-and-rephrase) where the aim is to split a complex sentence into a meaning-preserving sequence of shorter sentences. I will show that the semantically motivated split model is a key factor in generating fluent and meaning-preserving rephrasings. In the second part, I will discuss the shortcomings of sequence-to-sequence abstractive methods for document summarization and show that an extractive summarization system trained to globally optimize a common summarization evaluation metric outperforms state-of-the-art extractive and abstractive systems in both automatic and extensive human evaluations.

Bio: Shashi Narayan is a postdoctoral researcher in the School of Informatics at the University of Edinburgh. He obtained his PhD in Computer Science at the University of Lorraine, INRIA under Claire Gardent in 2014. His research focuses on natural language generation and understanding, with the aim of developing general frameworks for generation from underlying meaning representations and for text rewriting tasks such as summarization, text simplification and paraphrase generation. He also has experience with parsing and other structured prediction problems.

This talk is part of the NLIP Seminar Series.
