Transformer: the 3rd generation neural network acoustic models for ASR and its application at Facebook
- Speaker: Yongqiang Wang, Facebook
- Date & Time: Monday 30 November 2020, 15:00 - 16:00
- Venue: Zoom: https://zoom.us/j/94591123432?pwd=bUJObFZ3UnFYLy9pWENDcS9aYUZqUT09
Abstract
Since the introduction of deep learning to automatic speech recognition (ASR), neural network architectures have evolved rapidly from feed-forward networks to recurrent networks. Recently, in the natural language processing area, transformer-based sequence modeling has demonstrated strong results over recurrent network-based modeling in terms of both modeling accuracy and inference speed. However, it is non-trivial to adopt the transformer architecture in speech recognition because of requirements unique to ASR, such as streaming processing. In this talk, we show how the transformer architecture can be modified to fit different latency requirements for a range of speech applications. Specifically, we augment the attention module in the transformer with a set of memory slots, which results in an efficient memory transformer, Emformer. We compare Emformer with an LSTM-based acoustic model under both low-latency and medium-latency scenarios, on the widely used LibriSpeech benchmark and on a series of industrial-scale tasks whose training data ranges from 9K hours to 2.2M hours. On the medium-latency tasks, Emformer provides a 10-20% error reduction and a 2-3x inference speed-up; on the low-latency task, Emformer achieves a similar word error rate reduction at the cost of a slightly increased real-time factor (RTF). By presenting these results, we hope to convince the audience that the transformer could become the third generation of neural acoustic model for both traditional hybrid and end-to-end ASR systems.
Series
This talk is part of the CUED Speech Group Seminars series.
Included in Lists
- Cambridge Forum of Science and Humanities
- Cambridge Language Sciences
- Cambridge talks
- Chris Davis' list
- CUED Speech Group Seminars
- Guy Emerson's list
- Information Engineering Division seminar list
- PhD related