Abstract
Identifying high-quality chemical starting points is a critical and challenging step in drug discovery, which typically involves screening large compound libraries or repurposing compounds with known mechanisms of action (MoAs). Here we introduce a novel cheminformatics approach that mines existing large-scale phenotypic high-throughput screening (HTS) data. Our method aims to identify bioactive compounds with distinct and specific MoAs, serving as a valuable complement to existing focused library collections. The approach identifies chemotypes that show selectivity across multiple cell-based assays and are characterized by persistent and broad structure-activity relationships (SAR). We prospectively demonstrate the validity of the approach in broad cellular profiling assays (Cell Painting, DRUG-seq, Promotor Signature Profiling) and chemical proteomics experiments, in which the compounds behave similarly to known chemogenetic libraries but show a bias towards novel protein targets and require no synthetic effort to improve compound properties. A public set of such compounds, derived from the PubChem BioAssay dataset, is provided for use by the scientific community.
Supplementary weblinks
Title
Code for Grey Chemical Matter pipeline and PubChem GCM dataset
Description
A pipeline to identify bioactive small molecules with likely novel modes of action and dynamic SAR from historic cell-HTS profiles, with an example application and hitlist from PubChem data.