Text-and-audio methods

Cătălina Cangea, ex-Google DeepMind
Tuesday 04 February 2025, 13:00-14:00
Lecture Theatre 2, Computer Laboratory, William Gates Building.

If you have a question about this talk, please contact Mateja Jamnik.

This talk supports the R255 Advanced Topics in Machine Learning module on Multimodal Learning and provides a bird’s-eye view of the rapidly evolving text-audio landscape, with a focus on music as a primary example of audio data. I will first present the types of tasks that exist in this space, then discuss data curation challenges, and follow with an overview of some existing retrieval and generation methods, including a quick primer on diffusion models. Finally, I will describe current evaluation metrics and their limitations.

You can also join us on Zoom

This talk is part of the Artificial Intelligence Research Group Talks (Computer Laboratory) series.
