Michael Picheny (NYU) Speech Recognition: What's Left?

ABSTRACT: Recent results on the SWITCHBOARD corpus suggest that, thanks to advances in Deep Learning, speech recognition systems now achieve Word Error Rates comparable to those of human listeners. Does this mean the speech recognition problem is solved and the community can move on to a different set of problems? In this talk, we examine speech recognition issues that still plague the community and compare and contrast them with what is known about human perception. We specifically highlight issues in accented speech, noisy/reverberant speech, speaking style, rapid adaptation to new domains, and multilingual speech recognition. We try to demonstrate that, compared to human perception, there is still much room for improvement, so significant work in speech recognition research is still required from the community.

BIO: Dr Michael Picheny has worked in the Speech Recognition area since 1981, joining IBM after finishing his doctorate at MIT. He was heavily involved in the development of almost all of IBM's recognition systems, ranging from the world's first real-time large vocabulary discrete system in 1984 through IBM's product lines for telephony and embedded systems in the 1990s. Most recently, he was responsible for putting out a set of Speech Services for both Speech Recognition and Speech Synthesis during his tenure in IBM's Watson Group. He has published numerous papers in both journals and conferences on almost all aspects of speech recognition (see web page for details). He is the co-holder of over 50 patents and was named a Master Inventor by IBM in 1995 and again in 2000. In addition to professional volunteer service (as indicated below), he served multiple times as an Adjunct Professor in the Electrical Engineering Department of Columbia University and co-taught a course in speech recognition. Michael is a Fellow of both the IEEE and ISCA.

Michael was a manager for 35 years in the Speech area at IBM, and had led the Speech team in Yorktown Heights since 2007. He recently retired from IBM and joined NYU-Courant Computer Science and the Center for Data Science as a part-time Research Professor. At NYU, he hopes to continue speech recognition research and focus on challenging problems such as accented and disfluent speech and rapid domain adaptation, as well as looking into cross-modality synergies involving text and vision.

© 2006-2024 Talks.cam, University of Cambridge.