Text-and-audio methods
- Speaker: Cătălina Cangea, ex-Google DeepMind
- Date & Time: Tuesday 04 February 2025, 13:00 - 14:00
- Venue: Lecture Theatre 2, Computer Laboratory, William Gates Building
Abstract
This talk supports the R255 Advanced Topics in Machine Learning module on Multimodal Learning and provides a bird's-eye view of the rapidly evolving text-audio landscape, with a focus on music as a primary example of audio data. I will first present types of tasks that exist in this space, then discuss data curation challenges and follow with an overview of some existing retrieval and generation methods, including a quick primer on diffusion models. Finally, I will describe current evaluation metrics and their limitations.
Series
This talk is part of the Artificial Intelligence Research Group Talks (Computer Laboratory) series.
Included in Lists
- All Talks (aka the CURE list)
- Artificial Intelligence Research Group Talks (Computer Laboratory)
- bld31
- Cambridge Centre for Data-Driven Discovery (C2D3)
- Cambridge Forum of Science and Humanities
- Cambridge Language Sciences
- Cambridge talks
- Chris Davis' list
- Department of Computer Science and Technology talks and seminars
- Guy Emerson's list
- Hanchen DaDaDash
- Interested Talks
- Lecture Theatre 2, Computer Laboratory, William Gates Building
- Martin's interesting talks
- ndk22's list
- ob366-ai4er
- PhD related
- rp587
- School of Technology
- Speech Seminars
- Trust & Technology Initiative - interesting events
- yk373's list
- yk449