
Perplexity AI: Under the Hood of LLM Inference

If you have a question about this talk, please contact Ben Karniely.

Abstract: Perplexity is a search and answer engine which leverages LLMs to provide high-quality citation-backed answers. The AI Inference team within the company is responsible for serving the models behind the product, ranging from single-GPU embedding models to multi-node sparse Mixture-of-Experts language models. This talk provides more insight into the in-house runtime behind inference at Perplexity, with a particular focus on efficiently serving some of the largest available open-source models.

Biography: Nandor Licker is an AI Inference Engineer at Perplexity, focusing on LLM runtime implementation and GPU performance optimization.

Register for the talk at the following link: https://luma.com/dx1ggxgk

Some catering will be provided after the talk.

This talk is part of the Technical Talks - Department of Computer Science and Technology series.


© 2006-2025 Talks.cam, University of Cambridge.