Abstract
The structural diversity of chemical libraries, which are systematic collections of compounds that have the potential to bind to biomolecules, can be represented by a chemical latent space. A chemical latent space is a projection of compound structures into a mathematical space based on several molecular features; it expresses the structural diversity within a compound library, which makes it possible to explore a broader chemical space and generate novel compound structures as drug candidates. Many natural compounds produced by living organisms have complex structures and high biological activity. However, no existing deep learning model can effectively construct a chemical latent space that handles large and complex compound structures, such as those found in natural compounds, and that also accounts for chirality, an essential factor in the 3D complexity of compounds. In this study, we developed NP-VAE, a new deep learning method based on a variational autoencoder for handling natural compounds, and used it to construct a chemical latent space onto which large and complex compound structures, including their chirality, are projected. The latent space constructed by NP-VAE showed higher accuracy in both reconstruction and generalization than state-of-the-art deep learning methods. Furthermore, by exploring the acquired latent space, we comprehensively analyzed a compound library containing natural compounds and generated novel compound structures with optimized functions.
Supplementary materials
Title
Supplemental Method, Tables and Figures
Description
Supplemental Method: NP-VAE algorithm.
Supplemental Table S1.
Supplemental Figures S1-S6.
Supplementary weblinks
Title
NP-VAE
Description
The source code for the implementation of NP-VAE and its datasets.