Confidence estimation for attention-based encoder-decoder models for speech recognition

If you have a question about this talk, please contact Dr Jie Pu.

This talk will be held both online (Zoom) and in person (CBL Meeting Room).

Abstract: Confidence scores have been an intrinsic part of conventional speech recognisers. As end-to-end ASR models such as attention-based encoder-decoder models become increasingly popular, it is of great interest to develop reliable confidence estimators for them to support various downstream tasks. In this talk, I will present two confidence estimators for attention-based models: the confidence estimation module (CEM) for token- and word-level confidence scores, and the residual energy-based model (R-EBM) for utterance-level confidence scores. Interestingly, R-EBM can also help improve ASR performance. I will also discuss some effective techniques for generalising these model-based confidence estimators to out-of-domain data.

Bio: Qiujia Li is a fourth-year PhD student at the University of Cambridge, advised by Prof. Phil Woodland. He also obtained his BA and MEng from the University of Cambridge. His research interests lie primarily in speech processing and machine learning, including end-to-end speech recognition, confidence estimation and speaker diarization. He has published more than a dozen papers at ICASSP, Interspeech, SLT, ASRU, NeurIPS and ICCV, two of which won best student paper awards, at ASRU 2019 and SLT 2021. He previously worked as a research intern at Microsoft in 2018 and at Google in 2020.

This talk is part of the CUED Speech Group Seminars series.
