Clustering Dynamics in Mean-Field Models of Transformers
- 👤 Speaker: Andrea Agazzi (Universität Bern)
- 📅 Date & Time: Tuesday 26 August 2025, 10:30 - 11:30
- 📍 Venue: Seminar Room 2, Newton Institute
Abstract
Transformers are a central architecture in modern deep learning, forming the backbone of large language models such as ChatGPT. In this talk, I will present a mathematical framework for studying how information—represented as “tokens”—evolves through the layers of such neural networks. Specifically, we consider a family of partial differential equations that describe how the distribution of tokens—modeled as particles interacting in a mean-field way—changes with depth. Numerical experiments reveal that, under certain conditions, these dynamics exhibit a metastable clustering phenomenon, where tokens group into well-separated clusters that evolve slowly over time. A rigorous analysis of this behavior uncovers a range of open questions and unexpected connections to analysis and geometry.
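The abstract does not state the equations under study. As a rough illustration only, the sketch below simulates a finite-particle system of the kind often used as a toy model of self-attention: tokens live on the unit sphere, and each one moves toward a softmax-weighted average of all tokens. The model itself, the parameters `beta` (inverse temperature) and `dt` (Euler step size), and the function names `attention_step` and `simulate` are assumptions made for this example, not details taken from the talk.

```python
import numpy as np


def attention_step(x, beta, dt):
    """One explicit Euler step of an assumed self-attention particle system.

    x has shape (n, d); each row is a token on the unit sphere.
    """
    # Softmax attention weights built from pairwise inner products.
    logits = beta * (x @ x.T)
    logits -= logits.max(axis=1, keepdims=True)  # for numerical stability
    w = np.exp(logits)
    w /= w.sum(axis=1, keepdims=True)

    # Each token is pulled toward the attention-weighted average of all tokens.
    v = w @ x
    # Keep only the component of the velocity tangent to the sphere at x_i.
    v -= np.sum(v * x, axis=1, keepdims=True) * x

    # Euler update, then renormalize so tokens stay on the sphere.
    x_new = x + dt * v
    return x_new / np.linalg.norm(x_new, axis=1, keepdims=True)


def simulate(n=32, d=3, beta=4.0, dt=0.05, steps=2000, seed=0):
    """Run the dynamics from random initial tokens (illustrative defaults)."""
    rng = np.random.default_rng(seed)
    x = rng.normal(size=(n, d))
    x /= np.linalg.norm(x, axis=1, keepdims=True)
    min_inner = []  # smallest pairwise inner product, recorded after each step
    for _ in range(steps):
        x = attention_step(x, beta, dt)
        min_inner.append(float((x @ x.T).min()))
    return x, min_inner
```

Calling `x, min_inner = simulate()` and plotting `min_inner` against the step index is one way to look for long plateaus, which would be the discrete analogue of the slowly evolving clusters mentioned above. Whether and when such plateaus appear depends on `beta`, `n`, and the initial condition.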
Series
This talk is part of the Isaac Newton Institute Seminar Series.