Abstract
Late-stage functionalization (LSF) is an economical approach to optimizing the properties of drug candidates. However, the chemical complexity of drug molecules often makes late-stage diversification challenging. To address this problem, an LSF platform based on geometric deep learning and high-throughput reaction screening was developed. Considering borylation as a critical step in LSF, the computational model predicted reaction yields for diverse reaction conditions with a mean absolute error margin of 4–5%, while the reactivity of novel reactions with known and unknown substrates was classified with a balanced accuracy of 92% and 67%, respectively. The regioselectivity of the major products was accurately captured in up to 90% of the cases studied. When applied to 23 diverse commercial drug molecules, the platform successfully identified numerous opportunities for structural diversification. The influence of steric and quantum mechanical information on model performance was quantified, and a new, comprehensive simple user-friendly reaction format (SURF) was introduced, which proved to be a key enabler for seamlessly integrating deep learning and high-throughput experimentation (HTE) for LSF.
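For readers unfamiliar with the two headline metrics, the following is a minimal sketch of how a mean absolute error on yields and a balanced accuracy on binary reactivity labels are conventionally computed. It is not the authors' evaluation code; the function names and all numbers are hypothetical toy values chosen for illustration only.

```python
def mean_absolute_error(y_true, y_pred):
    """Average absolute difference between measured and predicted yields."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def balanced_accuracy(y_true, y_pred):
    """Mean of the per-class recalls, so each class counts equally."""
    recalls = []
    for cls in sorted(set(y_true)):
        idx = [i for i, t in enumerate(y_true) if t == cls]
        hits = sum(1 for i in idx if y_pred[i] == cls)
        recalls.append(hits / len(idx))
    return sum(recalls) / len(recalls)


# Hypothetical toy data (NOT from the study): yields in percent.
yields_true = [10.0, 50.0, 80.0]
yields_pred = [14.0, 45.0, 84.0]
mae = mean_absolute_error(yields_true, yields_pred)

# Hypothetical reactivity labels: 1 = reaction works, 0 = no reaction.
labels_true = [1, 1, 1, 1, 0, 0]
labels_pred = [1, 1, 1, 0, 0, 0]
bal_acc = balanced_accuracy(labels_true, labels_pred)

print(f"MAE: {mae:.2f} percentage points")
print(f"Balanced accuracy: {bal_acc:.3f}")
```

Balanced accuracy averages the recall of each class, which keeps the score informative when reactive and unreactive examples are unevenly represented, as is common in reaction screening data.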
Supplementary materials
Title
Supplementary Information: Enabling late-stage drug diversification by high-throughput experimentation with geometric deep learning
Description
Supplementary information to the main manuscript.