TRACER: Molecular Optimization Using Conditional Transformer for Reaction-Aware Compound Exploration with Reinforcement Learning

03 June 2024, Version 1
This content is a preprint and has not undergone peer review at the time of posting.

Abstract

Designing molecules with desirable properties is a critical endeavor in drug discovery. Recent advances in deep learning have led to the development of molecular generative models. However, existing compound exploration models often disregard the feasibility of organic synthesis. To address this issue, we propose TRACER (molecular optimization using a conditional Transformer for reaction-aware compound exploration with reinforcement learning), a framework that integrates the optimization of molecular properties with the generation of synthetic pathways. At the core of TRACER is a conditional Transformer model trained on a dataset of chemical reactions. The model predicts the product from a given reactant under the constraint of a reaction type specified by a graph convolutional network. In molecular optimization against an activity prediction model targeting the dopamine receptor D2, TRACER effectively generated compounds with high scores. Because the Transformer model recognizes the entire molecular structure, it captures the complexity of organic synthesis and enables navigation of the vast chemical space while respecting real-world reactivity constraints. The source code of TRACER, the activity prediction model, and the curated dataset are available in our public repository at https://github.com/sekijima-lab/TRACER.
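As a rough illustration of what conditioning a sequence model on a reaction type can look like, the sketch below tokenizes a reactant SMILES string and prepends a condition token. This is one common way to condition a Transformer and is an assumption here; the abstract does not state how TRACER encodes the reaction type. The function names `tokenize_smiles` and `build_conditional_input` and the token format `[RXN_k]` are hypothetical and are not taken from the TRACER repository.

```python
import re

# A widely used style of regular expression for splitting SMILES into tokens:
# bracket atoms, two-letter halogens, two-digit ring closures, single atoms,
# ring-closure digits, and bond/branch symbols.
SMILES_TOKEN = re.compile(
    r"(\[[^\]]+\]|Br|Cl|%\d{2}|[A-Za-z]|\d|[=#\-+\\/().:~@?>*$])"
)


def tokenize_smiles(smiles: str) -> list[str]:
    """Split a SMILES string into tokens, rejecting unrecognized characters."""
    tokens = SMILES_TOKEN.findall(smiles)
    if "".join(tokens) != smiles:
        raise ValueError(f"Could not tokenize SMILES: {smiles!r}")
    return tokens


def build_conditional_input(reactant_smiles: str, reaction_type_id: int) -> list[str]:
    """Prepend a hypothetical reaction-type token to the reactant tokens.

    In a setup like the one the abstract describes, the reaction type would be
    proposed by a separate model (a graph convolutional network); here it is
    simply passed in as an integer (assumption for illustration).
    """
    condition_token = f"[RXN_{reaction_type_id}]"  # hypothetical token format
    return [condition_token, *tokenize_smiles(reactant_smiles)]


if __name__ == "__main__":
    # Acetyl chloride as the reactant, with an arbitrary reaction-type index.
    print(build_conditional_input("CC(=O)Cl", 7))
```

With an input of this form, the same reactant paired with different reaction-type tokens would be decoded into different products, which is the sense in which the model explores chemical space under a reaction constraint.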

Supplementary materials

Title: Supporting Information for TRACER: Molecular Optimization Using Conditional Transformer for Reaction-Aware Compound Exploration with Reinforcement Learning
Description: This supporting information provides additional data for the TRACER study, including a graph for determining the optimal number of clusters, a detailed table of compound optimization results using MCTS with varying beam widths, substructure search results against active compounds, and synthetic routes for the top 5 compounds generated from each starting material.
