Does Syntax Still Matter in the World of LLMs?
- 👤 Speaker: Miloš Stanojević (DeepMind)
- 📅 Date & Time: Friday 06 October 2023, 12:00 - 13:00
- 📍 Venue: Computer Laboratory, room SS03
Abstract:
Large Language Models (LLMs) have recently shown such impressive results that some cognitive scientists claim syntactic theories should be abandoned as an explanation of human language in favour of LLMs. I will provide evidence that syntax is still beneficial in both scientific and engineering pursuits with human language. First, LLMs provide neither a prediction nor an explanation of the universal properties of all human languages, unlike the syntactic theory considered here. Second, human brain activity in some brain regions can be accounted for better by an incremental syntactic parser than by LLM surprisal. Finally, LLMs can work even better if augmented with a syntactic compositional structure. If that is so, you might ask, why is syntax not more popular in NLP? I believe it is because modern hardware accelerators (GPUs and TPUs) are not optimal for tree-like computation, so it is difficult to train large-scale syntactic models. To address this, we have created a JAX library called SynJAX that makes it easier to build syntactic models that run efficiently on GPUs and TPUs.
Bio:
Miloš Stanojević is a Senior Research Scientist at Google DeepMind. Prior to that he was a postdoc at the University of Edinburgh with Mark Steedman, where he worked on Combinatory Categorial Grammars (CCG) and collaborated with Ed Stabler on Minimalist Grammars. He received a PhD from the University of Amsterdam for work on machine translation. His main research interest is in bridging the gap between theoretical linguistics and natural language processing by bringing the right inductive biases to machine learning models of language.
Series:
This talk is part of the NLIP Seminar Series.