Abstract
The current surge of artificial intelligence (AI) has sparked a renaissance in computer-aided synthesis planning (CASP). Understanding regioselectivity in CASP has long been a crucial yet unsolved problem. Precise prediction of regioselectivity enables the design of high-yielding synthetic routes with minimal separation and material costs. However, combining chemical knowledge with data-driven methods to make practical regioselectivity predictions is still at an early stage. At the same time, metal-catalyzed cross-coupling reactions have profoundly transformed medicinal chemistry and have thus become one of the most frequently encountered reaction types in CASP. In this work, we introduce a data-driven framework that directly identifies the intrinsic major products of metal-catalyzed cross-coupling reactions using chemical knowledge-informed message-passing neural networks (MPNNs). Integrating first-principles and data-driven methods, our model achieves an overall accuracy of 95.32% on the test set of eight typical metal-catalyzed cross-coupling reaction types: Suzuki-Miyaura, Stille, Sonogashira, Buchwald-Hartwig, Hiyama, Kumada, Negishi, and Heck reactions. Notably, under practical scenarios, our model outperforms six experimental organic chemists with an average of over 10 years of working experience in the organic synthesis industry. We have also developed a free, web-based, AI-empowered tool to assist general chemists in making prompt decisions about regioselectivity. Our code and web tool are available at https://github.com/Chemlex-AI/regioselectivity and https://ai.tools.chemlex.com/region-choose, respectively.
Supplementary materials
Title
Regio-MPNN: Predicting Regioselectivity for General Metal-Catalyzed Cross-Coupling Reactions using Chemical Knowledge Informed Message Passing Neural Network
Description
Supporting Information for Regio-MPNN: Predicting Regioselectivity for General Metal-Catalyzed Cross-Coupling Reactions using Chemical Knowledge Informed Message Passing Neural Network