Abstract
The lead optimization process in drug discovery campaigns is an arduous endeavour where the input of many medicinal chemists is weighed in order to reach a desired molecular property profile. Building the expertise to successfully drive such projects collaboratively is a very time-consuming process that typically spans many years within a chemist's career. In this work we aim to replicate this process by applying artificial intelligence learning-to-rank techniques to feedback obtained from 35 chemists at Novartis over the course of several months. We exemplify the usefulness of the learned proxies in routine tasks such as compound prioritization, motif rationalization, and biased de novo drug design. Annotated response data is provided, and the developed models and code are made available under a permissive open-source license.
Supplementary materials
Supporting Information: Additional results
Supplementary weblinks
MolSkill GitHub repository: production code, models and data.