Machine learning to predict protein function from sequence with therapeutic applications
- Speaker: Dr Lucy J. Colwell, Yusuf Hamied Department of Chemistry, University of Cambridge
- Date & Time: Monday 20 February 2023, 14:30 - 15:00
- Venue: Pfizer Lecture Theatre, Department of Chemistry
Abstract
A central challenge is to predict the functional properties of a protein from its sequence, and thus (i) discover new proteins with specific functionality and (ii) better understand the functional effect of genomic mutations. Experimental and computational data make it possible to train and validate powerful machine learning models that predict protein function directly from sequence. I will present deep learning models that accurately predict functional domains within protein sequences, and large language models that generate textual descriptions of protein sequences, collectively adding millions of annotations to public databases. Experimental breakthroughs enable data on the relationship between sequence and function to be rapidly acquired. However, the cost and latency of wet-lab experiments require methods that find good sequences in few experimental rounds, where each round contains a large batch of sequence designs. In this setting, I will discuss model-based optimization approaches that take advantage of sample-inefficient methods to find diverse sequence candidates for experimental evaluation. The potential of these approaches is illustrated through three case studies demonstrating the design and experimental validation of proteins and peptides for therapeutic applications.
Series
This talk is part of the Lennard-Jones Centre series.

