Abstract
Computational screening of chemical libraries to discover molecules with desired properties is a common technique in early-stage drug discovery. Recent progress in the field now enables the efficient exploration of billions of molecules within days or hours, but this exploration remains confined to the accessible chemistry space. Although the number of commercially available compounds is growing rapidly, these compounds remain a limited subset of the molecules that could be synthesized. Here, we present a workflow in which chemical reactions that are typically developed in academia and are unconventional in drug discovery are exploited to dramatically expand the chemistry space accessible to virtual screening. We use this process to generate a first version of the Pan-Canadian Chemical Library, a collection of nearly 150 billion diverse compounds that does not overlap with other ultra-large libraries such as Enamine REAL or SAVI and could be a resource of choice for protein targets where other libraries have failed to deliver bioactive molecules. A 127-million-compound subset of the library is available at https://pccl.thesgc.org/.