Kinetic predictions for SN2 reactions using the BERT architecture: Comparison and interpretation.

07 April 2025, Version 1
This content is a preprint and has not undergone peer review at the time of posting.

Abstract

The accurate prediction of reaction rates is an integral step in elucidating reaction mechanisms and designing synthetic pathways. Traditionally, kinetic parameters have been derived from activation energies obtained with quantum mechanical (QM) methods and, more recently, with machine learning (ML) approaches. Among ML methods, Bidirectional Encoder Representations from Transformers (BERT), a transformer-based model, is the state-of-the-art method for both reaction classification and yield prediction. Despite this success, it has yet to be applied to kinetic prediction. In this work, we train a BERT model to predict experimental logk values of SN2 reactions and compare its performance to the top-performing Random Forest (RF) model from the literature in terms of accuracy, training time, and ability to replicate known reactivity rules. Both the BERT and RF models exhibit near-experimental accuracy (RMSE = 1.1 logk units) on similarity-split test data. Interpretation of the predictions reveals that both models identify key reaction centers, as well as known electronic and steric effects. However, each model has a limitation: the RF model is limited in logk extrapolation, and the BERT model is limited in its recognition of aromatic effects.
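As a reading aid for the accuracy figure quoted above, the following is a minimal sketch of how an RMSE in logk units can be computed and read as a multiplicative error in the rate constant. It is not the authors' evaluation code. The function names `rmse` and `fold_error` are hypothetical, and the sketch assumes that logk denotes a base-10 logarithm, which the abstract does not state.

```python
import math


def rmse(predicted, observed):
    """Root-mean-square error between two equal-length sequences of logk values."""
    if len(predicted) != len(observed) or len(predicted) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    squared = [(p - o) ** 2 for p, o in zip(predicted, observed)]
    return math.sqrt(sum(squared) / len(squared))


def fold_error(rmse_logk):
    """Convert an error in logk units to a fold error in k.

    Assumption: logk is a base-10 logarithm (not confirmed by the source).
    """
    return 10 ** rmse_logk


# Illustrative numbers only; these are not data from the preprint.
example_pred = [-3.0, -1.5, 0.2]
example_obs = [-2.0, -1.0, 0.0]
example_rmse = rmse(example_pred, example_obs)
example_fold = fold_error(example_rmse)
```

Under the base-10 assumption, an RMSE of 1.1 logk units corresponds to a typical error of roughly one order of magnitude in the rate constant itself.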

Keywords

Random Forest
BERT
Reaction kinetics
Rate prediction
SN2
Interpretable ML

Supplementary materials

Supporting Information: Supporting information for the work titled "Kinetic predictions for SN2 reactions using the BERT architecture: Comparison and interpretation."
