
Where neural scaling laws come from: a model-based theory of data structure


If you have a question about this talk, please contact Sven Krippendorf.

Neural scaling laws reveal strikingly robust power-law relationships between the performance of language models and the amount of training data. Yet, a principled explanation of where the scaling exponent comes from—in terms of measurable properties of real data, rather than solvable surrogates that neglect representation learning effects—has remained elusive. In this talk, I introduce a model-based perspective on data structure grounded in random hierarchies: analytically tractable generative models designed to capture the hierarchical and compositional structure of natural language while retaining explicit control over important learning-related statistics. I will then present new work that, building on this framework, ties the scaling exponent observed in autoregressive language modelling to two fundamental, empirically accessible statistics of text: (i) how correlations between two tokens decay with their separation t, and (ii) how the conditional entropy of the next token decreases as a function of context length n. The core message is that the representation-learning mechanism we identified by studying how deep learning methods learn random hierarchies provides the missing link from these descriptive statistics to quantitative predictions, as it yields a concrete formula for the scaling exponent in terms of the joint behaviour of these curves. The resulting prediction matches observed scaling remarkably well for modern neural architectures trained on large text corpora. This provides, to our knowledge, the first theory of neural scaling that depends only on intrinsic properties of the data and remains predictive in the regime of contemporary language modelling.
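The abstract refers to two empirically accessible statistics of text: the decay of two-token correlations with separation t, and the decrease of conditional next-token entropy with context length n. The talk does not specify which estimators are used; the following is a minimal sketch, under assumed definitions, of how one might estimate both curves from a token sequence (the function names and the particular correlation measure are illustrative choices, not the speaker's).

```python
import math
from collections import Counter

def pair_correlation(tokens, t):
    """Estimate a two-point correlation at separation t:
    sum over token pairs (a, b) of [P(x_i=a, x_{i+t}=b) - P(a)P(b)]^2.
    (One common choice of correlation measure; assumed, not from the talk.)"""
    n_pairs = len(tokens) - t
    joint = Counter(zip(tokens[:-t], tokens[t:]))   # empirical joint counts
    marg = Counter(tokens)                          # empirical marginals
    total = len(tokens)
    c = 0.0
    for (a, b), k in joint.items():
        c += (k / n_pairs - (marg[a] / total) * (marg[b] / total)) ** 2
    return c

def conditional_entropy(tokens, n):
    """Empirical entropy (in nats) of the next token given the
    previous n tokens: H = -sum p(c, x) * log p(x | c)."""
    ctx = Counter()
    joint = Counter()
    for i in range(len(tokens) - n):
        c = tuple(tokens[i:i + n])
        ctx[c] += 1
        joint[(c, tokens[i + n])] += 1
    total = sum(joint.values())
    h = 0.0
    for (c, _), k in joint.items():
        h -= (k / total) * math.log(k / ctx[c])
    return h

# Sanity check on a perfectly periodic sequence: strong pair correlation,
# and zero conditional entropy once one token of context is given.
seq = ["a", "b"] * 50
print(pair_correlation(seq, 1))      # well above zero for periodic text
print(conditional_entropy(seq, 1))   # 0.0: next token is determined
```

In practice both curves would be estimated on a large corpus and fit to power laws, whose exponents the talk's formula then combines into a predicted scaling exponent.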

This talk is part of the DAMTP Data Intensive Science Seminar series.
