Abstract
Efficient drug discovery relies on accessing diverse small molecules expediently and reliably. Improvements to reliability through machine learning predictions are hampered by poor availability of high-quality reaction data. Here, we introduce an on-demand synthesis platform based on a three-component reaction that delivers drug-like molecules overnight. Miniaturization and automation enable the execution and analysis of 50,000 reactions with distinct substrates on a 3-microliter scale, producing the largest public reaction outcome dataset. With machine learning, we accurately predict the outcomes of unknown reactions and analyze the impact of dataset size on model training. This study advances the on-demand synthesis of drug-like molecules by concatenating chemoselective reactions and provides a sufficiently large dataset to critically evaluate emerging machine learning approaches to predicting chemical reactivity.
Supplementary materials
Title
Supporting Information
Description
Methods, Supplementary Figures, and Characterization Data
Title
Movie S1
Description
Source plate preparation by OT1 automated liquid handler.
Title
Movie S2
Description
Automated plate handling and acoustic dispensing.
Supplementary weblinks
Title
Reaction Heatmap
Description
Interactive website showing the ~40,000 reactions performed and the primary products obtained.