Abstract
Machine learning (ML) is gaining momentum in chemistry for predicting molecular properties and for generating novel molecules with specific properties. However, these models can only be trained on relatively scarce, often low-quality data. As a result, memorization (rather than learning) may yield poorly generalizable models. To address this issue, we revisit the way ML is practiced in chemistry. Using pKa prediction as an example, we present a strategy that imparts chemistry knowledge to ML algorithms. We posit that teaching machines fundamental principles (e.g., electronegativity and the inductive effect) to predict properties (e.g., pKa), analogous to the way we teach students, will allow them to predict more advanced, yet related, properties. In this way, ML will leverage chemists’ knowledge and qualitative principles to quantify and predict chemical properties.
Supplementary materials
Supporting information: Additional information and discussion.