Generative Speech Separation based on Pitch Information

If you have a question about this talk, please contact Dr Jie Pu.

This talk will be held on Zoom.

Abstract: Monaural speech separation aims to separate concurrent speakers from a single-microphone mixture recording. This talk presents a generative speech separation framework based on pitch information, inspired by the mechanisms of auditory scene analysis. The main advantage of the framework is that both the permutation problem and the unknown-speaker-number problem found in general separation models can be solved by using pitch contours to indicate which target speaker to separate. In addition, a generative approach is used instead of the traditional time-frequency mask-based approach, to improve the perceptual quality of the separated speech. The proposed framework has two phases: pitch extraction and speech separation. The first phase aims to accurately extract pitch contour candidates for each speaker from the mixture, using a two-stage approach. In the second phase, any extracted pitch contour can be selected as the condition, and a conditional generative adversarial network (CGAN) separates the speaker corresponding to that pitch contour. The framework is evaluated on both pitch extraction and speech separation.

Bio: Xiang Li is a Research Associate in the Speech Group of the Machine Intelligence Laboratory, Department of Engineering, University of Cambridge, working with Prof. Mark Gales. She recently received her PhD from Peking University, where she was supervised by Prof. Xihong Wu. This talk covers her PhD thesis work. Her research interests include speech enhancement and separation, perception, and natural language processing.

This talk is part of the CUED Speech Group Seminars series.
