HCat-GNet: An Interpretable Graph Neural Network for Catalysis Optimization

22 February 2024, Version 1
This content is a preprint and has not undergone peer review at the time of posting.

Abstract

Homogeneous catalysts enable faster conversions of molecules with higher selectivities (stereo- and regioselectivity) in chemical reactions. Traditionally, catalyst improvements are made through empirical trials, where the catalyst is functionalised by adding, removing or modifying groups within its structure and, subsequently, reevaluating the new catalytic activity. This procedure is not efficient and leads to unsuccessful trials that waste resources. Machine learning (ML) approaches have been proposed to accelerate homogeneous asymmetric catalyst optimization. However, these often lack a general descriptor generation procedure to allow encoding of molecules from a broad region of chemical space. To overcome this, we propose a homogeneous catalyst graph neural network (HCat-GNet) for the prediction of selectivity of catalysts given the SMILES of participant molecules. We demonstrate its use in rhodium-catalyzed asymmetric 1,4-addition (RhCAA), a reaction of major importance in organic synthesis. We benchmark HCat-GNet against traditional ML methods for its ability to predict RhCAA stereoselectivity using two chiral diene ligand datasets: one for learning and one for final testing. For the learning dataset, both traditional ML and HCat-GNet methods give comparable results. However, when presented with the new unseen test dataset, traditional ML models perform poorly, while HCat-GNet retains a general ability to accurately predict product absolute stereochemistry and reaction stereoselectivity. Furthermore, HCat-GNet allows model interpretability, permitting analysis of the effect of ligand substituents in determining reaction selectivity. HCat-GNet shows greater potential for catalyst optimization than traditional ML, as it allows the use of a non-fixed number of participant molecules to train the model, only requiring the SMILES of the molecules to create graph representations. HCat-GNet allows more general models that accurately extrapolate into unseen regions of chemical space.

Keywords

Graph neural networks
Asymmetric catalysis
