Abstract
Discovering novel compounds with therapeutic potential is a central goal of medicinal chemistry. However, the conventional approach to drug discovery, which relies on high-throughput screening (HTS), is limited by reagent availability and labor costs. Although a hit-finding campaign starts with a virtual screen of millions of compounds, the hit-to-lead and lead optimization stages often require designing, synthesizing, and profiling thousands of analogs before clinical candidates are selected. Recent advances offer an alternative: the virtual generation of billions of synthetically feasible compounds, with a reported 80% success rate for synthesis. This has the potential to significantly expand the chemical space available for purchase and experimental validation. Our study presents a comprehensive approach to compound discovery and optimization that combines quantitative high-throughput screening (qHTS), chemical databases, and reaction-based enumeration. To support reaction-based enumeration, we use the Reaction Cookbook from BioSolveIT, which provides reaction SMARTS for around 300 chemical reactions. By applying this wide range of reactions, we aim to explore the full potential of a chemotype and tailor its structure to optimize desired properties. As an illustration of this approach, our work on the ALDH3A1 project resulted in the synthesis of 50 compounds, 21 of which were active (negative curve class values; hit rate 42%). Among these active compounds, 6 displayed IC50 values lower than 30 µM and efficacy values less than -50%, with the most potent compound reaching a potency of 35 nM.
This study demonstrates that in silico reaction-based analog enumeration, molecular modeling, and AI/ML-based techniques can be combined to identify compounds with improved biological activity, offering promising prospects for the development of ALDH3A1-targeting agents as potential cancer therapeutics. The computational workflows developed in this study can be applied to similar target-based drug discovery campaigns.
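The activity criteria quoted above (negative curve class for an active compound; IC50 below 30 µM and efficacy below -50% for the more potent subset) and the hit-rate arithmetic can be sketched in a few lines of Python. This is only an illustrative sketch, not the workflow used in the study: the `Compound` record, the helper names, and the four example compounds are hypothetical placeholders, and only the thresholds and the 21-of-50 count are taken from the text.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Compound:
    """Hypothetical record for one synthesized analog (not the study's data model)."""
    name: str
    curve_class: float   # qHTS curve class; negative values are treated as active here
    ic50_um: float       # IC50 in micromolar
    efficacy_pct: float  # efficacy in percent; more negative means stronger inhibition


def is_active(c: Compound) -> bool:
    # Criterion from the text: a negative curve class value marks an active compound.
    return c.curve_class < 0


def is_potent(c: Compound) -> bool:
    # Criteria from the text: active, IC50 < 30 uM, and efficacy < -50%.
    return is_active(c) and c.ic50_um < 30.0 and c.efficacy_pct < -50.0


def hit_rate(n_active: int, n_tested: int) -> float:
    # Hit rate as a percentage of tested compounds.
    return 100.0 * n_active / n_tested


# Made-up example compounds, for illustration only.
compounds = [
    Compound("cpd-A", -1.1, 0.035, -95.0),        # active and potent
    Compound("cpd-B", -2.2, 45.0, -60.0),         # active, but IC50 too high
    Compound("cpd-C", 4.0, float("inf"), 0.0),    # inactive
    Compound("cpd-D", -1.2, 12.0, -40.0),         # active, but efficacy too weak
]

actives = [c.name for c in compounds if is_active(c)]
potent = [c.name for c in compounds if is_potent(c)]

print("active:", actives)
print("potent:", potent)
print(f"reported hit rate: {hit_rate(21, 50):.0f}%")
```

With the counts given in the text, `hit_rate(21, 50)` reproduces the 42% hit rate; the same filter could be applied to a real results table once curve class, IC50, and efficacy columns are available.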