Deep Reinforcement Learning for Multiparameter Optimization in de novo Drug Design

30 April 2019, Version 2
This content is a preprint and has not undergone peer review at the time of posting.

Abstract

In medicinal chemistry programs it is key to design and make compounds that are efficacious and safe. This is a long, complex and difficult multi-parameter optimization process, often including several properties with orthogonal trends. New methods for the automated design of compounds against profiles of multiple properties are thus of great value. Here we present a fragment-based reinforcement learning approach based on an actor-critic model, for the generation of novel molecules with optimal properties. The actor and the critic are both modelled with bidirectional long short-term memory (LSTM) networks. The AI method learns how to generate new compounds with desired properties by starting from an initial set of lead molecules and then improve these by replacing some of their fragments. A balanced binary tree based on the similarity of fragments is used in the generative process to bias the output towards structurally similar molecules. The method is demonstrated by a case study showing that 93% of the generated molecules are chemically valid, and a third satisfy the targeted objectives, while there were none in the initial set.

Keywords

Reinforcement learning
Deep Learning
Multi parameter optimization
Fragment based drug discovery
De novo drug discovery

Comments

Comments are not moderated before they are posted, but they can be removed by the site moderators if they are found to be in contravention of our Commenting Policy [opens in a new tab] - please read this policy before you post. Comments should be used for scholarly discussion of the content in question. You can find more information about how to use the commenting feature here [opens in a new tab] .
This site is protected by reCAPTCHA and the Google Privacy Policy [opens in a new tab] and Terms of Service [opens in a new tab] apply.