Predictive Minisci and P450 Late Stage Functionalization with Transfer Learning

21 December 2022, Version 1
This content is a preprint and has not undergone peer review at the time of posting.

Abstract

Structural diversification of lead molecules is a key component of drug discovery, used to explore close-in chemical space. Late stage functionalizations (LSFs) are versatile methodologies capable of installing functional handles on richly decorated intermediates to deliver numerous diverse products in a single reaction. Predicting the regioselectivity of LSF remains an open challenge in the field. Chemoinformatics and machine learning (ML) groups have made significant strides in this area. However, isolating and characterizing the multitude of LSF products generated is arduous, which limits the available data and hinders pure ML approaches. We report the development of an approach that combines a message-passing neural network with 13C NMR-based transfer learning to predict the atom-wise probabilities of functionalization. We validated our model retrospectively and with a series of prospective experiments, showing that it accurately predicts the outcomes of Minisci-type and P450 transformations and outperforms state-of-the-art Fukui-based reactivity indices.
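The abstract does not give the architecture or training details, so the following is only a toy sketch of the general idea: a shared message-passing encoder whose atom embeddings feed one head for a per-atom 13C shift (the pretraining task) and another head for a per-atom functionalization probability. Everything in it is an assumption made for illustration: the sum-aggregation update, the two-dimensional atom features, the hand-picked weights, and the use of an independent sigmoid per atom. It is not the authors' model.

```python
import math

def message_pass(features, neighbors, rounds=2):
    """Shared encoder: average each atom's state with the sum of its neighbors' states."""
    h = [list(f) for f in features]
    for _ in range(rounds):
        new_h = []
        for i, h_i in enumerate(h):
            agg = [0.0] * len(h_i)
            for j in neighbors[i]:
                for k, v in enumerate(h[j]):
                    agg[k] += v
            new_h.append([0.5 * (a + b) for a, b in zip(h_i, agg)])
        h = new_h
    return h

def linear_head(h, weights, bias):
    """Task-specific readout applied independently to every atom embedding."""
    return [sum(w * x for w, x in zip(weights, h_i)) + bias for h_i in h]

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

# Hypothetical 3-atom chain 0-1-2 with made-up two-dimensional atom features.
features = [[1.0, 0.0], [0.0, 1.0], [1.0, 0.0]]
neighbors = [[1], [0, 2], [1]]

# The same atom embeddings serve both tasks; only the heads differ.
h = message_pass(features, neighbors)

# Pretraining head (assumed): regress a 13C chemical shift per atom.
shift_pred = linear_head(h, [10.0, 50.0], 20.0)

# Fine-tuning head (assumed): per-atom probability of functionalization.
site_prob = [sigmoid(z) for z in linear_head(h, [0.3, -0.2], 0.0)]

for idx, (shift, prob) in enumerate(zip(shift_pred, site_prob)):
    print(f"atom {idx}: shift head = {shift:.2f}, site probability = {prob:.3f}")
```

In a real transfer-learning setup, the encoder weights would be learned on the abundant NMR data and then reused, frozen or fine-tuned, when training the site-prediction head on the much smaller LSF dataset. Here the encoder has no learned parameters, so only the sharing of embeddings between the two heads is shown.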

Keywords

Late Stage Functionalization
Minisci
P450
Machine Learning
Regiochemistry Prediction
LSF

Supplementary materials

SI