Machine Translation with LSTMs
- Speaker: Ilya Sutskever (Google)
- Date & Time: Friday 28 November 2014, 10:30 - 11:30
- Venue: Engineering Department, LR3B
Abstract
Deep Neural Networks (DNNs) are powerful models that have achieved excellent performance on difficult learning tasks. Although DNNs work well whenever large labeled training sets are available, they cannot be used to map sequences to sequences. In this talk, I will present a general end-to-end approach to sequence learning that makes minimal assumptions on the sequence structure. The method uses a multilayered Long Short-Term Memory (LSTM) to map the input sequence to a vector of a fixed dimensionality, and then another deep LSTM to decode the target sequence from the vector. The main result is that on an English-to-French translation task from the WMT'14 dataset, the translations produced by the LSTM achieve a BLEU score of 34.8 on the entire test set, where the LSTM's BLEU score is penalized on out-of-vocabulary words. While this performance is respectable, it is worse than state-of-the-art performance on this dataset (which is 37.0), mainly due to the LSTM's inability to translate out-of-vocabulary (OOV) words. In the second half of the talk, I will present a simple method for addressing the OOV problem. The method consists of annotating each OOV word in the training set with a “pointer” to its origin in the source sentence, which makes it easy to translate the OOV words at test time using a dictionary. The new method achieves a BLEU score of 37.5, which is a new state of the art.
This is joint work with Thang Luong, Oriol Vinyals, Quoc Le, and Wojciech Zaremba.
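The pointer idea for OOV words described in the abstract can be illustrated with a minimal sketch. This is not the speakers' implementation: the `<unk:i>` token format, the absolute source index, the alignment dictionary, the function names, and the copy-through fallback are all assumptions made for illustration only.

```python
def annotate_target(target, alignment, vocab):
    """Training-time step: replace each OOV target word with a pointer token.

    alignment maps a target position to the source position it came from
    (assumed to be supplied by an external word aligner).
    """
    annotated = []
    for j, word in enumerate(target):
        if word in vocab:
            annotated.append(word)
        elif j in alignment:
            # Hypothetical token format: pointer to the aligned source index.
            annotated.append(f"<unk:{alignment[j]}>")
        else:
            annotated.append("<unk>")  # OOV word with no known origin
    return annotated


def resolve_pointers(source, output, dictionary):
    """Test-time step: replace pointer tokens using a word dictionary."""
    resolved = []
    for token in output:
        if token.startswith("<unk:") and token.endswith(">"):
            src_word = source[int(token[5:-1])]
            # Assumed fallback: copy the source word if it has no entry.
            resolved.append(dictionary.get(src_word, src_word))
        else:
            resolved.append(token)
    return resolved


# Toy example with a deliberately tiny target vocabulary.
source = ["The", "ecotax", "portico", "in", "Pont-de-Buis"]
target = ["Le", "portique", "écotaxe", "de", "Pont-de-Buis"]
alignment = {0: 0, 1: 2, 2: 1, 3: 3, 4: 4}
vocab = {"Le", "de"}
dictionary = {"portico": "portique", "ecotax": "écotaxe"}

annotated = annotate_target(target, alignment, vocab)
resolved = resolve_pointers(source, annotated, dictionary)
print(annotated)
print(resolved)
```

In this toy setup the model would be trained to emit the annotated sequence, and the dictionary lookup would run as a post-processing step on its output.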
Series
This talk is part of the Machine Learning @ CUED series.


