Deep generative model of constructing chemical latent space for large molecular structures with 3D complexity

Toshiki Ochiai; Tensei Inukai; Manato Akiyama; Kairi Furui; Masahito Ohue; Nobuaki Matsumori; Shinsuke Inuki; Motonari Uesugi; Toshiaki Sunazuka; Kazuya Kikuchi; Hideaki Kakeya; Yasubumi Sakakibara

doi:10.26434/chemrxiv-2023-pjl0w-v2

Theoretical and Computational Chemistry

29 May 2023, Version 2

Working Paper

This content is a preprint and has not undergone peer review at the time of posting.

Abstract

The structural diversity of chemical libraries, which are systematic collections of compounds with the potential to bind to biomolecules, can be represented by a chemical latent space. A chemical latent space is a projection of compound structures into a mathematical space based on several molecular features. It can express the structural diversity within a compound library, which makes it possible to explore a broader chemical space and to generate novel compound structures as drug candidates. Many natural compounds produced by living organisms have complicated structures and are highly biologically active. However, no existing deep learning model can effectively construct a chemical latent space that handles large and complex compound structures, such as those found in natural compounds, while also handling chirality, an essential factor in the 3D complexity of compounds. In this study, we developed a new deep learning method, NP-VAE, based on a variational autoencoder for handling natural compounds, and constructed a chemical latent space into which large and complex compound structures, including chirality, are projected. NP-VAE successfully constructed a chemical latent space that showed higher reconstruction accuracy and generalization performance than state-of-the-art deep learning methods. Furthermore, by exploring the acquired latent space, we comprehensively analyzed a compound library containing natural compounds and generated novel compound structures with optimized functions.
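To make the idea of a variational-autoencoder latent space concrete, the following is a minimal sketch of three generic VAE operations: sampling a latent point with the reparameterization trick, computing the closed-form KL regularizer, and interpolating between two latent points. It is not the NP-VAE implementation. The function names (`sample_latent`, `kl_to_standard_normal`, `interpolate`) and the diagonal-Gaussian posterior are assumptions made for illustration only; the abstract does not specify the model's loss or exploration procedure, which are described in the supplemental method.

```python
import math
import random


def sample_latent(mu, logvar, rng):
    """Reparameterization trick (generic VAE, assumed here):
    z = mu + sigma * eps, with eps ~ N(0, 1) and sigma = exp(0.5 * logvar)."""
    return [m + math.exp(0.5 * lv) * rng.gauss(0.0, 1.0)
            for m, lv in zip(mu, logvar)]


def kl_to_standard_normal(mu, logvar):
    """Closed-form KL( N(mu, diag(exp(logvar))) || N(0, I) )."""
    return -0.5 * sum(1.0 + lv - m * m - math.exp(lv)
                      for m, lv in zip(mu, logvar))


def interpolate(z_a, z_b, t):
    """Linear interpolation between two latent points, 0 <= t <= 1.
    A hypothetical way to walk between two encoded compounds."""
    return [(1.0 - t) * a + t * b for a, b in zip(z_a, z_b)]
```

In a full model, an encoder network would produce `mu` and `logvar` from a compound structure, and a decoder would map sampled or interpolated latent points back to structures; both networks are omitted here.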

Keywords

chemical latent space

generative model

variational autoencoder (VAE)

Supplementary materials

Title: Supplemental Method, Tables and Figures

Description: Supplemental Method: NP-VAE algorithm. Supplemental Table S1. Supplemental Figures S1-S6

Supplementary weblinks

Title: NP-VAE

Description: The source code for the implementation of NP-VAE and its datasets.

Version History

May 29, 2023 Version 2

May 22, 2023 Version 1

Version Notes

Title and Introduction have been changed. Several typos have also been corrected.

Metrics

Views: 1,956

Downloads: 911

License

The content is available under CC BY NC ND 4.0

Funding

Ministry of Education, Culture, Sports, Science and Technology of Japan: 22H04901

Ministry of Education, Culture, Sports, Science and Technology of Japan: 23H04885

Author’s competing interest statement

The author(s) have declared they have no conflict of interest with regard to this content

Ethics

The author(s) have declared ethics committee/IRB approval is not relevant to this content
