ZINC-22 - A Free Multi-Billion-Scale Database of Tangible Compounds for Ligand Discovery

Benjamin Tingle; Khanh Tang; Jose Castanon; John Gutierrez; Munkhzul Khurelbaatar; Chinzorig Dandarchuluun; Yurii Moroz; John Irwin

doi:10.26434/chemrxiv-2022-82czl

Biological and Medicinal Chemistry


20 October 2022, Version 1

Working Paper


This content is a preprint and has not undergone peer review at the time of posting.

Abstract

Purchasable chemical space has grown rapidly into the tens of billions of molecules, providing unprecedented opportunities for ligand discovery but also straining the tools that might exploit these molecules at scale. We have therefore developed ZINC-22, a database of commercially accessible small molecules derived from multi-billion-scale make-on-demand libraries. The new database and tools enable analog searching in this vast new space via a facile GUI, CartBlanche, drawing on similarity methods that scale sub-linearly in the number of molecules. The new library also uses data organization methods that enable rapid lookup of molecules and their physical properties, including conformations, partial atomic charges, cLogP values, and solvation energies, all crucial for molecular docking; such lookups had become slow with the database organization used in previous versions of ZINC. As the libraries have continued to grow, we asked whether molecular diversity has suffered, for instance because certain scaffolds have come to dominate via easy analoging. This has not occurred thus far: chemical diversity continues to grow with database size, with a one-log increase in Bemis-Murcko scaffolds for every two-log increase in database size. Most new scaffolds come from compounds with the highest heavy atom count. Finally, we consider the implications for databases like ZINC as the libraries grow towards and beyond the trillion-molecule range. ZINC is freely available to everyone and may be accessed at cartblanche22.docking.org, via Globus, and in the Amazon AWS and Oracle OCI clouds.
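
The scaffold-growth statement in the abstract (one log of scaffolds per two logs of database size) can be read as an approximate power law, S ∝ N^0.5. The short Python sketch below illustrates that reading only; the function names and the example counts are hypothetical and are not taken from ZINC-22 data or code.

```python
import math


def scaling_exponent(n1, s1, n2, s2):
    """Slope of log10(scaffolds) versus log10(database size) between two points."""
    return math.log10(s2 / s1) / math.log10(n2 / n1)


def projected_scaffolds(n_known, s_known, n_target, exponent=0.5):
    """Extrapolate a scaffold count, assuming S is proportional to N**exponent.

    exponent=0.5 encodes "one log of scaffolds per two logs of molecules";
    treating the trend as an exact power law is an assumption of this sketch.
    """
    return s_known * (n_target / n_known) ** exponent


# Illustrative (made-up) numbers: a 100-fold larger library under this
# assumption would contain about 10-fold more scaffolds.
example = projected_scaffolds(n_known=1e6, s_known=1e3, n_target=1e8)
```

Under this assumption the number of scaffolds per molecule falls as the library grows, even though the absolute scaffold count keeps rising.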

Keywords

molecular docking

ligand discovery

Supplementary materials

Supporting information for ZINC-22 paper

S0. Access to databases to prevent molecules becoming unpatentable
S1. Source catalog contributions to ZINC-22
S2. Sharding script
S3. ZINC-22 numbering
S4. Software and Hardware overview
S5. Sn system overview
S6. Sb system overview
S7. Common Database Schema Overview
S8. Important management scripts for ZINC-22


Version History

Oct 20, 2022 Version 1

Metrics

Views: 3,811

Downloads: 2,809

License

The content is available under CC BY 4.0


Funding

National Institutes of Health

GM133836

Author’s competing interest statement

JJI is a founder of Blue Dolphin Lead Discovery, LLC, a contract research organization focused on molecular docking. JJI is also a co-founder of Deep Apple Therapeutics. YSM is employed by Chemspace LLC, a marketplace with a billion-size catalog of chemical and biological products and a provider of discovery services. YSM also serves as a scientific advisor at Enamine Ltd.

Ethics

The author(s) have declared ethics committee/IRB approval is not relevant to this content
