Reducing Speaker and Temporal Redundancy in Discrete Speech Tokenization

Yiwei Guo, Shanghai Jiaotong University
Monday 20 October 2025, 16:00-17:00
Hybrid: Cambridge University Engineering Department, LT6 or Zoom: https://cam-ac-uk.zoom.us/j/83363886282?pwd=eCJD8vmmxI29vSZMX92lmQjWWUgZgw.1

If you have a question about this talk, please contact Brian Sun.

Discrete speech tokens have emerged as a fundamental representation for various downstream speech processing tasks, particularly speech generation. However, most existing tokens encode dense, fixed-rate acoustic information, which introduces substantial redundancy and limits their efficiency. In this talk, I will first briefly review the taxonomy of current discrete speech tokens, then present our work on reducing this redundancy in two critical directions: (1) speaker timbre disentanglement, introducing a low-bitrate, single-codebook, speaker-decoupled speech codec; and (2) variable-rate temporal compression, exploring methods that dynamically adjust the frame rate of discrete tokens for better compactness and a better bitrate-performance tradeoff. Together, these efforts highlight pathways toward more efficient and controllable discrete speech representations, paving the way for the next generation of speech technologies.

This talk is part of the CUED Speech Group Seminars series.
