Abstract
DNA-encoded libraries (DELs) enable efficient experimental screening of vast combinatorial molecular libraries, making them a powerful platform for drug discovery. Beyond ensuring druglike physicochemical properties, key parameters for maximizing the success rate of DEL designs include scaffold diversity and target addressability. While several computational tools have been developed to evaluate DEL chemical diversity, a dedicated tool that evaluates both parameters together is currently lacking. In this work, we developed a computational approach to systematically evaluate both the scaffold diversity and the target-orientedness of DELs using Bemis-Murcko (BM) scaffold analysis and machine learning. To demonstrate the utility of this approach, we present a case study using two libraries produced in-house. We show that our workflow can effectively distinguish between a generalist and a focused library. Furthermore, we show that although focused libraries tend to have higher compound-based addressability, they can suffer from lower scaffold-based addressability relative to a generalist library. Consequently, we illustrate how our computational tool can guide medicinal chemists in deciding which library to screen depending on the objective, whether hit-finding or hit-optimization. To facilitate its use, the tool is freely available both as a web application and as a Python script at https://github.com/novalixofficial/NovaWebApp.
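The distinction between compound-based and scaffold-based addressability can be illustrated with a minimal sketch. It is not the authors' implementation: the function name `addressability`, the metric definitions (fraction of compounds predicted active, and fraction of distinct scaffolds carrying at least one predicted-active compound), and the toy records are all assumptions made for illustration. Scaffold strings are treated as opaque labels; in practice they would be BM scaffolds computed with a cheminformatics toolkit, and the activity flags would come from a trained model.

```python
from collections import defaultdict


def addressability(records):
    """Summarize a library from (compound_id, scaffold, predicted_active) tuples.

    Hypothetical definitions, assumed for this sketch only:
      scaffold_diversity      = distinct scaffolds / compounds
      compound_addressability = predicted-active compounds / compounds
      scaffold_addressability = scaffolds with >= 1 predicted-active compound
                                / distinct scaffolds
    """
    records = list(records)
    if not records:
        raise ValueError("records must not be empty")

    scaffold_hit = defaultdict(bool)  # scaffold -> any predicted-active member?
    n_active = 0
    for _compound_id, scaffold, active in records:
        n_active += bool(active)
        scaffold_hit[scaffold] = scaffold_hit[scaffold] or bool(active)

    n_compounds = len(records)
    n_scaffolds = len(scaffold_hit)
    return {
        "scaffold_diversity": n_scaffolds / n_compounds,
        "compound_addressability": n_active / n_compounds,
        "scaffold_addressability": sum(scaffold_hit.values()) / n_scaffolds,
    }


# Toy, made-up records: scaffold labels "A".."E" stand in for BM scaffolds.
focused = [
    ("f1", "A", True), ("f2", "A", True), ("f3", "A", True),
    ("f4", "B", False), ("f5", "C", False),
]
generalist = [
    ("g1", "A", True), ("g2", "B", True),
    ("g3", "C", False), ("g4", "D", False), ("g5", "E", False),
]

focused_stats = addressability(focused)
generalist_stats = addressability(generalist)
```

In this toy data the focused library has the larger share of predicted-active compounds, but its actives are concentrated in a single scaffold, so a smaller share of its scaffolds is addressable than in the generalist library.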
Supplementary materials
Title: Supporting Information
Description: Statistical analysis of cross-validation scores