Multi-objective evolutionary strategy for improving semiempirical Hamiltonians in the study of enzymatic reactions at the QM/MM level of theory

13 February 2025, Version 1
This content is a preprint and has not undergone peer review at the time of posting.

Abstract

Quantum mechanics/molecular mechanics (QM/MM) simulations are crucial for understanding enzymatic reactions, but their accuracy depends heavily on the quantum-mechanical method used. Semiempirical methods are computationally efficient but often lack accuracy in complex systems. This work presents a novel multi-objective evolutionary strategy for optimizing semiempirical Hamiltonians, designed to improve their performance in enzymatic QM/MM simulations while remaining broadly applicable to condensed-phase systems. Our methodology combines automated parameter optimization, which targets ab initio or density functional theory (DFT) reference potential energy surfaces, atomic charges, and gradients, with validation through minimum free energy path (MFEP) calculations. To demonstrate its effectiveness, we applied the approach to the GFN2-xTB Hamiltonian using two enzymatic systems that involve hydride transfer reactions for which GFN2-xTB severely underestimates the activation energy barrier: crotonyl-CoA carboxylase/reductase (CCR) and dihydrofolate reductase (DHFR). The optimized parameters significantly improved the reproduction of potential and free energy surfaces, closely matching higher-level DFT calculations. The optimization proceeded in two stages: we first developed parameters for CCR using reaction path data, then refined them for DHFR by adding a targeted set of training geometries. This strategy minimized the computational cost while describing both systems accurately, as validated through QM/MM simulations using the Adaptive String Method (ASM). Our method is an efficient route to optimizing semiempirical methods for the study of larger systems and longer timescales, with potential applications in mechanistic studies of enzymatic reactions, drug design, and enzyme engineering.
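As a rough illustration of what a multi-objective evolutionary parameter search can look like, the sketch below scores each candidate parameter set by three separate errors (energies, charges, gradients) and keeps the Pareto-optimal candidates from one generation to the next. It is not the workflow used in this work: `toy_model` is a hypothetical two-parameter surrogate standing in for a semiempirical Hamiltonian, the reference data are synthetic, and every function name and setting (`evolve`, `sigma`, `n_children`, and so on) is an assumption made for the example.

```python
import math
import random

TARGETS = ("energies", "charges", "gradients")

def rmse(pred, ref):
    """Root-mean-square error between two equally long sequences."""
    return math.sqrt(sum((p - r) ** 2 for p, r in zip(pred, ref)) / len(ref))

def objectives(params, model, reference):
    """One error per target property; the errors are kept separate, not summed."""
    predicted = model(params)
    return tuple(rmse(predicted[key], reference[key]) for key in TARGETS)

def dominates(a, b):
    """True if a is no worse than b in every objective and better in at least one."""
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))

def pareto_front(scored):
    """Keep the (params, fitness) pairs that no other pair dominates."""
    return [(p, f) for p, f in scored
            if not any(dominates(g, f) for _, g in scored)]

def evolve(model, reference, start, generations=30, n_children=16,
           sigma=0.05, seed=0):
    """Elitist loop: mutate members of the current front, then re-filter."""
    rng = random.Random(seed)
    front = [(list(start), objectives(start, model, reference))]
    for _generation in range(generations):
        children = []
        for _child in range(n_children):
            parent, _fitness = rng.choice(front)
            child = [x + rng.gauss(0.0, sigma) for x in parent]  # Gaussian mutation
            children.append((child, objectives(child, model, reference)))
        front = pareto_front(front + children)
    return front

# Hypothetical surrogate: a harmonic "surface" with force constant k and
# a pair of opposite point charges q. It only mimics the shape of the problem.
GRID = [-1.0, -0.5, 0.0, 0.5, 1.0]

def toy_model(params):
    k, q = params
    return {
        "energies": [k * x * x for x in GRID],
        "charges": [q, -q],
        "gradients": [2.0 * k * x for x in GRID],
    }

reference = toy_model([1.0, 0.4])   # synthetic "high-level" reference data
start = [0.6, 0.1]                  # deliberately poor initial parameters
front = evolve(toy_model, reference, start)
best_params, best_errors = min(front, key=lambda item: sum(item[1]))
print(len(front), best_params, best_errors)
```

Returning the whole non-dominated front, rather than the minimum of a single weighted sum, leaves the choice of trade-off between energy, charge, and gradient errors to a later step, such as validation against free energy profiles.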

Keywords

Semiempirical methods
QM/MM simulations
Multi-objective optimization
Parameter optimization
Hydride transfer
Enzyme catalysis

Supplementary materials

Supporting Information
The Supporting Information details our methodology, parameters, and results. It covers the theoretical foundations of GFN2-xTB's energy terms and their dependence on element-specific parameters, the ASM setup for the CCR and DHFR enzymatic systems, the dual-level correction scheme for PMF profiles, our GFN2-xTB re-parametrization strategy, and the PCA methodology used to analyze QM/MM trajectories. It also describes the training-set generation procedures and tabulates the optimized parameters from our parametrization workflow. Two GitHub repositories containing the code and data are provided; both are released under the MIT License and include detailed documentation for reproducing our work. Users need Amber24 compiled with GFN2-xTB API support, Gaussian16 for the reference calculations (required only to reproduce our work exactly; any QM package whose output is readable by the cclib library can be used instead), and a Python environment with the required dependencies.
