Advancing Density Functional Tight-Binding method for Large Organic Molecules through Equivariant Neural Networks

24 April 2025, Version 1
This content is a preprint and has not undergone peer review at the time of posting.

Abstract

Semi-empirical electronic structure methods have become valuable tools for studying complex (bio)molecular systems because they balance computational efficiency and accuracy. A key aspect of these methods is their parameterization, which governs the reliability of the results and also offers an opportunity to improve overall performance. In our previous work [J. Phys. Chem. Lett. 11, 16 (2021)], we improved the accuracy of the semi-empirical density functional tight-binding (DFTB) method for computing multiple properties of small molecules by developing the machine learning (ML) potential NN$_{\rm rep}$, which bridges the gap between the electronic DFTB components and those of the hybrid DFT-PBE0 functional. To overcome the limitations of NN$_{\rm rep}$, we introduce the EquiDTB framework, which uses physics-inspired equivariant neural networks (NN) to parameterize scalable and transferable many-body potentials $\Delta_{\rm TB}$ that replace the standard pairwise repulsive potential in the DFTB method. This advancement extends the applicability of our ML-corrected DFTB approach to larger molecules and non-covalent systems, going beyond the chemical space represented in the quantum-mechanical (QM) training datasets. EquiDTB outperforms the standard TB methods (DFTB and GFN2-xTB) in computing the atomic forces of equilibrium and non-equilibrium small molecular dimers, as well as their interaction energies. Moreover, EquiDTB can be used to explore the potential energy surfaces of large and flexible drug-like molecules, for example to determine the minimum energy path between isomers, analyze structural transitions during dynamical simulations, compute vibrational modes, and investigate energetic rankings. Our work therefore shows that integrating an equivariant NN with QM datasets can advance the DFTB method to DFT-PBE0-level accuracy at high computational efficiency, paving the way for more reliable (bio)molecular simulations.
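
The replacement of the pairwise repulsive potential described above can be written schematically as follows. This is only a sketch based on the usual DFTB energy partition; the symbols $E_{\rm elec}$, $V_{\rm rep}^{AB}$, $R_{AB}$, $Z_A$, and $\mathbf{R}_A$ are notation assumed here for illustration, and the abstract does not state the exact definition or training target of $\Delta_{\rm TB}$.

$$
E_{\rm DFTB} = E_{\rm elec} + \sum_{A<B} V_{\rm rep}^{AB}(R_{AB})
\quad\longrightarrow\quad
E_{\rm EquiDTB} = E_{\rm elec} + \Delta_{\rm TB}\big(\{Z_A, \mathbf{R}_A\}\big),
$$

$$
\Delta_{\rm TB} \approx E_{\rm PBE0} - E_{\rm elec},
\qquad
\mathbf{F}_A = -\frac{\partial E_{\rm EquiDTB}}{\partial \mathbf{R}_A}.
$$

Under this reading, the sum over atom pairs $A<B$ is replaced by a learned function of all atomic numbers and positions, which is what allows $\Delta_{\rm TB}$ to be a many-body rather than a pairwise correction.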

Keywords

machine learning
electronic structure
semi-empirical methods
molecular simulations
flexible molecules

Supplementary materials

Supplementary Information: Additional results to support our conclusions.
