Abstract
Nucleophilicity and electrophilicity govern the reactivity of polar organic reactions. Over the past decades, Mayr et al. established quantitative scales for nucleophilicity (N) and electrophilicity (E), which have proved to be useful tools for rationalizing chemical reactivity. In this study, a holistic prediction model was developed through a machine-learning approach. rSPOC, an ensemble molecular representation combining structural, physicochemical, and solvent features, was developed for this purpose. Comprising 1115 nucleophiles, 285 electrophiles, and 22 solvents, the dataset is currently the largest one for reactivity prediction. The rSPOC model trained with the Extra Trees algorithm showed high accuracy in predicting Mayr’s N and E parameters, with R² of 0.96 and 0.92 and MAE of 0.99 and 1.47, respectively. Furthermore, practical applications of the model, for instance nucleophilicity prediction for NAD(P)H and a series of enamines, showed its potential for predicting the reactivity of previously uncharacterized molecules within seconds. An online prediction platform (http://isyn.luoszgroup.com/) was constructed based on the current model and is freely available to the scientific community.
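
For readers less familiar with the quantities reported above, the following minimal sketch shows how the two accuracy metrics (R² and MAE) are defined and how a predicted N value could be combined with an E value through the Mayr-Patz relation, log k (20 °C) = s_N(N + E). It is an illustration only, not the code used in this study: the function names are hypothetical, and all numerical values are invented for demonstration rather than taken from the dataset.

```python
from typing import Sequence


def mean_absolute_error(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    """Average absolute deviation between reference and predicted values."""
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def r2_score(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    if ss_tot == 0:
        raise ValueError("R2 is undefined when all reference values are equal")
    return 1.0 - ss_res / ss_tot


def mayr_log_k(n: float, e: float, s_n: float = 1.0) -> float:
    """Mayr-Patz relation: log10 of the rate constant at 20 degC = s_N * (N + E)."""
    return s_n * (n + e)


# Hypothetical reference and predicted N values (NOT data from this study).
reference_n = [10.0, 12.0, 14.0, 16.0]
predicted_n = [11.0, 12.0, 13.0, 16.0]

mae = mean_absolute_error(reference_n, predicted_n)
r2 = r2_score(reference_n, predicted_n)
print(f"MAE = {mae:.2f}, R2 = {r2:.2f}")

# Hypothetical nucleophile (N = 12, s_N = 0.8) reacting with an
# electrophile of E = -7; the result is log10(k) at 20 degC.
log_k = mayr_log_k(n=12.0, e=-7.0, s_n=0.8)
print(f"log k = {log_k:.1f}")
```

Because N and E enter the relation as a sum scaled by s_N, an MAE of about one unit in N corresponds to an uncertainty of roughly one order of magnitude in the predicted rate constant when s_N is close to 1.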