Abstract
Designing compounds with a range of desirable properties is a fundamental challenge in drug discovery. In
pre-clinical early drug discovery, novel compounds are often designed by structurally modifying a promising
existing starting compound to further optimize its properties. Recently,
transformer-based deep learning models have been explored for the task of molecular optimization by training
on pairs of similar molecules. This provides a starting point for generating molecules similar to a given input
molecule, but offers limited flexibility regarding user-defined property profiles. Here, we evaluate the effect of
reinforcement learning on transformer-based molecular generative models. The generative model can be
considered as a pre-trained model with knowledge of the chemical space close to an input compound, while
reinforcement learning can be viewed as a tuning phase, steering the model towards chemical space with
user-specific desirable properties. The evaluation on two distinct tasks, molecular optimization and scaffold
discovery, suggests that reinforcement learning could guide the transformer-based generative model towards
the generation of more compounds of interest. Additionally, the impact of pre-trained models, learning steps
and learning rates is investigated.
Scientific Contribution:
Our study investigates the effect of reinforcement learning on a transformer-based generative model initially
trained for generating molecules similar to starting molecules. The reinforcement learning framework is applied
to facilitate multiparameter optimization of starting molecules. This approach allows more flexibility in
optimizing user-specific property profiles and helps find more ideas of interest.
Supplementary materials
Supplementary figures: This file contains supplementary figures to the main manuscript.