SynTemp: Efficient Extraction of Graph-Based Reaction Rules from Large-Scale Reaction Databases

30 September 2024, Version 1
This content is a preprint and has not undergone peer review at the time of posting.

Abstract

SynTemp is a framework for extracting and hierarchically clustering reaction templates from large-scale reaction data repositories. Reaction templates are partial Imaginary Transition State (ITS) graphs that represent the reaction center together with its surrounding context. These graphs are equivalent to Double Pushout graph rewriting rules and can therefore be applied directly to predict reaction outcomes at the structural formula level. Rule inference is based on a consensus of multiple atom-atom mapping (AAM) tools, integrating predictions from RXNMapper, GraphormerMapper, and LocalMapper by means of a robust graph-theoretic methodology for comparing partial atom-atom mappings. SynTemp achieves an accuracy of 99.5% and a success rate of 71.23% in obtaining AAMs on the Chemical Reaction Dataset. Reaction centers with surrounding contexts are extracted and completed with mechanistically relevant hydrogen atoms to obtain complete reaction templates. The templates are then grouped by topological features using hierarchical clustering, resulting in a library of 311 transformation rules that explains 86% of the reaction data set. The remaining 14% is unresolved because of non-equivalent AAMs and ambiguous hydrogen placements. Despite these challenges, the coverage of our templates remains high at approximately 93.5-94.5%, surpassing that of RDChiral using SMARTS templates.
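To make the ITS idea in the abstract concrete, the following is a minimal sketch and not SynTemp's implementation. It assumes a reaction is already atom-mapped and that each side is given as a table of bonds keyed by atom-map numbers; the helper names `bonds`, `its_graph`, and `reaction_center` are hypothetical and chosen only for illustration. The ITS graph overlays reactant and product bonds, and the reaction center consists of the edges whose bond order changes.

```python
from typing import Dict, FrozenSet, Tuple

Bond = FrozenSet[int]                 # unordered pair of atom-map numbers
BondTable = Dict[Bond, int]           # bond -> bond order
ITS = Dict[Bond, Tuple[int, int]]     # bond -> (order before, order after)


def bonds(*triples: Tuple[int, int, int]) -> BondTable:
    """Build a bond table from (atom_a, atom_b, order) triples (hypothetical helper)."""
    return {frozenset((a, b)): order for a, b, order in triples}


def its_graph(reactant: BondTable, product: BondTable) -> ITS:
    """Overlay both sides; a bond absent on one side gets order 0 there."""
    return {
        bond: (reactant.get(bond, 0), product.get(bond, 0))
        for bond in set(reactant) | set(product)
    }


def reaction_center(its: ITS) -> ITS:
    """Keep only the edges whose bond order differs before and after."""
    return {bond: orders for bond, orders in its.items() if orders[0] != orders[1]}


# Illustrative substitution on a two-carbon chain (heavy atoms only):
# atom 1 = C, 2 = Br, 3 = O, 4 = C (unchanged context atom).
reactant = bonds((1, 2, 1), (1, 4, 1))   # C-Br and C-C
product = bonds((1, 3, 1), (1, 4, 1))    # C-O and C-C

its = its_graph(reactant, product)
center = reaction_center(its)
```

Here `center` holds the broken C-Br bond as `(1, 0)` and the formed C-O bond as `(0, 1)`, while the unchanged C-C bond stays in `its` as context. Hydrogen completion, the comparison of mappings from different AAM tools, and clustering are deliberately left out of this sketch.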

Keywords

reaction rules
graph transformations
atom-atom maps
classification of reactions

Supplementary materials

Supplementary 1: Additional Figures and Tables
Supplementary 2: Reactions with reaction step detected by topological descriptors
Supplementary 3: Reaction rule library
