Perplexity AI: Under the Hood of LLM Inference
- Speaker: Nandor Licker
- Date & Time: Monday 13 October 2025, 13:05 - 13:55
- Venue: FW26, William Gates Building
Abstract: Perplexity is a search and answer engine which leverages LLMs to provide high-quality citation-backed answers. The AI Inference team within the company is responsible for serving the models behind the product, ranging from single-GPU embedding models to multi-node sparse Mixture-of-Experts language models. This talk provides more insight into the in-house runtime behind inference at Perplexity, with a particular focus on efficiently serving some of the largest available open-source models.
Biography: Nandor Licker is an AI Inference Engineer at Perplexity, focusing on LLM runtime implementation and GPU performance optimization.
Register for the talk at the following link: https://luma.com/dx1ggxgk
Some catering will be provided after the talk.
Series: This talk is part of the Technical Talks - Department of Computer Science and Technology series.