Unsupervised attention-guided atom-mapping

If you have a question about this talk, please contact Bingqing Cheng.

Language models called transformers have recently revolutionized natural language processing and show great potential when applied to text-based representations of chemical reactions. The models learn the patterns in chemical reactions by predicting masked parts of reaction SMILES. The pretrained models can then be specialized on tasks such as reaction classification [1] and yield prediction [2], where they reach unprecedented accuracies. Specific outputs of the transformer models can serve as fingerprints to map the chemical reaction space without the need to know the reaction center or to distinguish between reactants and reagents, and they can also be used to recover the rearrangement between reactant and product atoms [3]. By opening the black box using detailed visual analysis, we discovered that the transformer models learned atom-mapping without supervision. Atom-mapping is necessary for making chemical reaction data more machine-accessible and is crucial for graph- and template-based reaction prediction and synthesis planning approaches. Here, we present an attention-guided reaction mapper that shows remarkable performance in terms of speed and accuracy, even for strongly imbalanced reactions as typically found in patents. This work is the first demonstration of knowledge extraction from a self-supervised language model with a direct practical application in the chemical reaction domain.
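To make the idea of reading an atom-mapping off attention weights concrete, here is a minimal sketch in Python. It is an illustration under stated assumptions, not the algorithm presented in the talk or in [3]: the function name greedy_atom_map, the layout of the attention matrix (rows are product atoms, columns are reactant atoms), and the toy weights are all hypothetical. It simply pairs atoms in order of decreasing attention weight while using each atom at most once.

```python
from typing import Dict, List


def greedy_atom_map(attention: List[List[float]]) -> Dict[int, int]:
    """Assign each product atom to a distinct reactant atom.

    attention[i][j] is a hypothetical attention weight from product atom i
    to reactant atom j. Pairs are taken in order of decreasing weight, and
    each product atom and each reactant atom is used at most once.
    """
    # Flatten the matrix into (weight, product_index, reactant_index) triples,
    # sorted from the highest weight to the lowest.
    pairs = sorted(
        ((w, i, j) for i, row in enumerate(attention) for j, w in enumerate(row)),
        key=lambda t: -t[0],
    )
    mapping: Dict[int, int] = {}
    used_reactants = set()
    for _, i, j in pairs:
        if i in mapping or j in used_reactants:
            continue  # one of the two atoms is already assigned
        mapping[i] = j
        used_reactants.add(j)
    return mapping


# Toy, made-up weights: 2 product atoms and 3 reactant atoms, mimicking an
# imbalanced reaction in which one reactant atom does not appear in the product.
toy_attention = [
    [0.1, 0.7, 0.2],
    [0.6, 0.3, 0.1],
]
mapping = greedy_atom_map(toy_attention)
print(mapping)
```

In this toy case the third reactant atom stays unmapped, which is the kind of situation that arises when a reaction is imbalanced. A practical mapper would also need to tokenize the reaction SMILES and select which attention weights to read, both of which are outside the scope of this sketch.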

References:

[1] Mapping the Space of Chemical Reactions using Attention-Based Neural Networks. P Schwaller, D Probst, AC Vaucher, VH Nair, D Kreutter, T Laino, JL Reymond. http://dx.doi.org/10.26434/chemrxiv.9897365

[2] Prediction of Chemical Reaction Yields using Deep Learning. P Schwaller, AC Vaucher, T Laino, JL Reymond. http://dx.doi.org/10.26434/chemrxiv.12758474

[3] RXN Mapper: Unsupervised Attention-Guided Atom-Mapping. P Schwaller, B Hoover, JL Reymond, H Strobelt, T Laino. http://dx.doi.org/10.26434/chemrxiv.9897365

This talk is part of the Machine learning in Physics, Chemistry and Materials discussion group (MLDG) series.
