Contributed Talk: Discovery of Spin-Crossover Metal-Organic Frameworks from Limited and Noisy Data using Quantile Active Learning - Virtual Presentation

RCLW05 - AI Across Scales: From Molecules to Planet Earth

Data-driven materials discovery is often hindered when target properties are computationally expensive or experimentally demanding to obtain, which makes conventional large-scale screening impractical. This challenge is particularly acute for metal–organic frameworks (MOFs), whose vast chemical diversity and complex electronic behavior demand both accuracy and data efficiency. Here, we present a unified active learning strategy based on regression tree methods to accelerate the discovery of functional MOFs under scarce, noisy, and imbalanced data conditions. Using low-dimensional, physically motivated descriptors derived from stoichiometric and geometric features, we construct regression tree–based partitions of the descriptor space to actively select the most diverse and informative samples for electronic-structure evaluation. This new approach, which we name Regression Tree–Active Learning [1], is demonstrated across multiple MOF datasets, where it yields compact training sets that outperform existing active learning strategies in predicting band gaps, adsorption properties, and other key materials descriptors, while exhibiting reduced variance and enhanced robustness to uneven label distributions [2]. We further apply this framework to the discovery of spin-crossover (SCO) MOFs, a rare but technologically promising subclass relevant for sensing, spintronics, and gas-related applications. By coupling a new Quantile Regression Tree–Active Learning approach with Random Forest regression and new density functional theory calculations, which are required to predict this property, we accurately identify SCO-active candidates from limited and imperfect training data, recovering over 80% of true positives. This strategy enables the identification of a new set of high-confidence SCO MOFs, demonstrating that complex quantum phenomena can be reliably uncovered through data-efficient, actively guided exploration of large materials spaces [3].

[1] Data Min Knowl Disc (2023); [2] J. Am. Chem. Soc. 2024, 146, 9, 6134–6144, https://doi.org/10.1021/jacs.3c13687; [3] npj Comput. Mater., submitted.
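
The selection step described in the abstract (partition the descriptor space with a regression tree fitted to the labelled samples, then draw new samples from the partitions) can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration and is not taken from the abstract or the cited papers: the function names, the leaf score (label variance times the number of unlabelled candidates in the leaf), the proportional allocation of the batch, and the hand-rolled tree. It uses only the Python standard library.

```python
import random
from statistics import pvariance


def sse(y, idx):
    """Sum of squared deviations from the mean for the labels at idx."""
    return pvariance([y[i] for i in idx]) * len(idx)


def best_split(X, y, idx, min_leaf):
    """Greedy search for the (feature, threshold) that most reduces SSE."""
    best, best_sse = None, sse(y, idx)
    for f in range(len(X[0])):
        order = sorted(idx, key=lambda i: X[i][f])
        for k in range(min_leaf, len(order) - min_leaf + 1):
            a, b = X[order[k - 1]][f], X[order[k]][f]
            if a == b:
                continue  # no threshold separates equal values
            s = sse(y, order[:k]) + sse(y, order[k:])
            if s < best_sse - 1e-12:
                best, best_sse = (f, a), s  # rule: x[f] <= a goes left
    return best


def grow(X, y, idx, depth, min_leaf):
    """Recursively build the tree; leaves store label variance and a pool list."""
    split = best_split(X, y, idx, min_leaf) if depth > 0 else None
    if split is None:
        return {"var": pvariance([y[i] for i in idx]), "pool": []}
    f, t = split
    left = [i for i in idx if X[i][f] <= t]
    right = [i for i in idx if X[i][f] > t]
    return {"f": f, "t": t,
            "left": grow(X, y, left, depth - 1, min_leaf),
            "right": grow(X, y, right, depth - 1, min_leaf)}


def find_leaf(node, x):
    """Walk from the root to the leaf whose region contains x."""
    while "f" in node:
        node = node["left"] if x[node["f"]] <= node["t"] else node["right"]
    return node


def collect_leaves(node):
    if "f" not in node:
        return [node]
    return collect_leaves(node["left"]) + collect_leaves(node["right"])


def select_batch(X_lab, y_lab, X_pool, batch, depth=3, min_leaf=3, seed=0):
    """Return indices into X_pool chosen leaf by leaf (hypothetical heuristic)."""
    rng = random.Random(seed)
    tree = grow(X_lab, y_lab, list(range(len(X_lab))), depth, min_leaf)
    for j, x in enumerate(X_pool):
        find_leaf(tree, x)["pool"].append(j)
    leaves = [leaf for leaf in collect_leaves(tree) if leaf["pool"]]
    for leaf in leaves:
        rng.shuffle(leaf["pool"])  # random order of candidates within a leaf
        # Assumed score: noisier and more populated leaves get more samples.
        leaf["score"] = (leaf["var"] + 1e-12) * len(leaf["pool"])
        leaf["taken"] = 0
    chosen = []
    while len(chosen) < batch:
        open_leaves = [l for l in leaves if l["taken"] < len(l["pool"])]
        if not open_leaves:
            break  # pool exhausted before the batch was filled
        # Highest-averages rule gives an allocation roughly proportional to score.
        leaf = max(open_leaves, key=lambda l: l["score"] / (l["taken"] + 1))
        chosen.append(leaf["pool"][leaf["taken"]])
        leaf["taken"] += 1
    return chosen
```

In an active learning loop, the returned candidates would be sent for the expensive evaluation (here, the electronic-structure calculation), appended to the labelled set, and the tree refitted before the next batch. Each descriptor vector is a plain list of floats, for example `select_batch(X_lab, y_lab, X_pool, batch=12)`.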

This talk is part of the Isaac Newton Institute Seminar Series.

© 2006-2025 Talks.cam, University of Cambridge.