Abstract
Generative chemical language models have demonstrated success in learning language-based molecular representations for de novo drug design. Here, we integrate structure-based drug design (SBDD) principles with chemical language models to present a modern hit-finding workflow that goes from protein structure to novel small-molecule ligands without a priori knowledge of ligand chemistry. Using Augmented Hill-Climb, we successfully optimised multiple objectives, including protein-ligand complementarity, within a practical timeframe. Generated de novo molecules contained both known and promising adenosine A2A receptor ligand chemistry that is not available in commercial vendor libraries, accessing commercially novel areas of chemical space. Experimental validation identified three nanomolar ligands with confirmed functional activity, two of which contain novel chemotypes. Overall, we demonstrate a binding hit rate of 88%, with 50% of the binders showing confirmed functional activity, which emphasises the complex relationships involved in translating binding to downstream pharmacology. Lastly, the two strongest binders were co-crystallised with the A2A receptor, revealing binding mechanisms that can be used to inform future iterations of structure-guided de novo design, closing the AI SBDD loop.
Supplementary materials
Title
Supporting information
Description
Supporting information containing supplementary methods and figures.
Supplementary weblinks
Title
SMILES-RNN
Description
GitHub link containing the SMILES-RNN code used to generate the results presented in the manuscript.