Abstract
Reaction barriers are key to our understanding of chemical reactivity and catalysis. Certain reactions are so seminal in chemistry that countless variants, with or without catalysts, have been studied, and their barriers have been computed or measured experimentally. This wealth of data represents a perfect opportunity to leverage machine learning models, which could quickly predict barriers without explicit calculations or measurements. Here, we show that the topological descriptors of the quantum mechanical charge density in the reactant state constitute a set that is both rigorous and continuous, and can be used effectively to predict reaction barrier energies to a high degree of accuracy. We demonstrate this on the Diels-Alder reaction, which is highly important in biology and medicinal chemistry and, as such, has been studied extensively. This reaction exhibits a range of barriers as large as 270 kJ/mol. Although we trained our single-objective supervised (labeled) regression algorithms on simpler Diels-Alder reactions in solution, they also predict reaction barriers in significantly more complicated contexts, such as a Diels-Alder reaction catalyzed by an artificial enzyme and its evolved variants, in agreement with experimental changes in kcat. We expect this tool to apply broadly to a variety of reactions in solution or in the presence of a catalyst, for screening and for circumventing heavily involved computations or experiments.