Abstract
A fundamental limitation in accurately modeling biomolecules like DNA is the inability to perform quantum chemistry calculations on large molecular structures. We present a machine learning model, based on an equivariant Euclidean Neural Network framework, that predicts quantum-accurate electron densities for arbitrary DNA structures that are much too large for conventional quantum methods. The model is trained on representative B-DNA base pair steps that capture both base pairing and base stacking interactions. It produces accurate electron densities for arbitrary B-DNA structures, with typical errors of less than 1%. Crucially, the error does not increase with system size, which suggests that the model can extrapolate to large DNA structures with negligible loss of accuracy. The model also generalizes well to other DNA structural motifs, such as the A- and Z-DNA forms, despite being trained only on B-DNA configurations. We show that this machine learning electron density model can be used to calculate electrostatic potentials of DNA with quantum accuracy. These electrostatic potentials are more accurate than those obtained from classical force fields and do not show the usual deficiencies at short range. Lastly, we use the model to calculate electron densities of several large-scale DNA structures and show that its computational cost scales linearly with system size.
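The abstract quotes density errors of less than 1% without stating how the error is measured. As a minimal sketch, assuming a commonly used definition (the integrated absolute difference between predicted and reference densities, divided by the total number of electrons), the metric could be evaluated on a uniform grid as below. The function name, the flat-list grid layout, and the `voxel_volume` parameter are illustrative assumptions and are not taken from the paper or its repository.

```python
def density_error_percent(rho_pred, rho_ref, voxel_volume=1.0):
    """Integrated absolute density difference as a percentage of the
    reference electron count (assumed metric, not confirmed by the source).

    rho_pred, rho_ref: density values sampled on the same uniform grid,
                       flattened into sequences of equal length.
    voxel_volume:      volume of one grid cell (hypothetical parameter).
    """
    if len(rho_pred) != len(rho_ref):
        raise ValueError("both densities must be sampled on the same grid")

    # Riemann-sum approximation of the integral of |rho_pred - rho_ref|
    abs_diff = sum(abs(p - r) for p, r in zip(rho_pred, rho_ref)) * voxel_volume

    # Riemann-sum approximation of the integral of rho_ref (electron count)
    n_electrons = sum(rho_ref) * voxel_volume
    if n_electrons <= 0:
        raise ValueError("reference density must integrate to a positive value")

    return 100.0 * abs_diff / n_electrons


# Toy four-point "grid"; the values are made up for illustration only.
rho_ref = [0.0, 1.0, 2.0, 1.0]
rho_pred = [0.0, 1.01, 1.98, 1.01]
print(density_error_percent(rho_pred, rho_ref, voxel_volume=0.5))
```

Because the error is normalized by the electron count, a metric of this form does not grow simply because the system contains more atoms, which is what makes a size-independent error a meaningful statement about extrapolation.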
Supplementary materials
Title
Electronic Supplementary Information: Predicting quantum-accurate electron densities for DNA with equivariant neural networks
Description
Neural network hyperparameters, and supplementary figures and tables for training and testing the DNA machine learning model.
Supplementary weblinks
Title
Generate and predict molecular electron densities with Euclidean Neural Networks
Description
GitHub repository containing machine learning (e3nn) code for reproducing experiments and analysis.