BEGIN:VCALENDAR
VERSION:2.0
PRODID:-//Talks.cam//talks.cam.ac.uk//
X-WR-CALNAME:Talks.cam
BEGIN:VEVENT
SUMMARY:Reducing Speaker and Temporal Redundancy in Discrete Speech Tokeni
 zation - Yiwei Guo\, Shanghai Jiaotong University
DTSTART:20251020T150000Z
DTEND:20251020T160000Z
UID:TALK237541@talks.cam.ac.uk
CONTACT:Brian Sun
DESCRIPTION:Discrete speech tokens have emerged as a fundamental represent
 ation for various downstream speech processing tasks\, particularly in spe
 ech generation. However\, most existing tokens encode dense\, fixed-rate a
 coustic information\, which introduces substantial redundancy and limits t
 heir efficiency. In this talk\, I will first provide a brief review of the
  taxonomy of current discrete speech tokens\, then present our work explo
 ring the reduction of this information redundancy in two critical directio
 ns:\n(1) Speaker timbre disentanglement\, introducing a low-bitrate\, sing
 le-codebook\, speaker-decoupled codec for speech.\n(2) Variable-rate tem
 poral compression\, exploring methods to dynamically adjust the frame rate
  of discrete tokens for better compactness and bitrate-performance tradeof
 f.\nTogether\, these efforts highlight pathways toward more efficient and 
 controllable discrete speech representations\, paving the way for the next
  generation of speech technologies.\n
LOCATION:Hybrid: Cambridge University Engineering Department\, LT6 or Zoom
 : https://cam-ac-uk.zoom.us/j/83363886282?pwd=eCJD8vmmxI29vSZMX92lmQjWWUgZ
 gw.1
END:VEVENT
END:VCALENDAR
