Abstract
Machine learning (ML) methods have emerged as efficient surrogates for high-level electronic structure theory, offering both accuracy and computational efficiency. However, constructing a general force field remains challenging because the conformational and chemical space is vast. Training data sets typically cover only a limited region of this space, which leads to poor extrapolation performance. The traditional remedy, retraining the model from scratch on the combined old and new data sets, addresses this problem inadequately. Model transferability is also crucial for constructing a general force field. Existing ML force fields, designed for closed systems with no external environmental potential, exhibit limited transferability to complex condensed-phase systems such as enzymatic reactions, resulting in inferior performance and high memory costs. Our ML/MM model, based on the Taylor expansion of the electrostatic operator, has shown high transferability between reactions in several simple solvents. In this work, we extend the strategy to enzymatic reactions to explore transferability between more complex, heterogeneous environments. We also apply continual learning strategies based on memory data sets to enable autonomous, on-the-fly training on a continuous stream of new data. Combining these two methods allows a more general force field to be constructed more efficiently.
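As an illustration of memory-based continual learning on a data stream, the following is a minimal sketch, not the authors' implementation. It assumes a fixed-size memory data set filled by reservoir sampling and a rehearsal step that mixes replayed samples into each new batch. The names `ReplayMemory`, `continual_update`, `train_step`, and `n_replay` are hypothetical, and the abstract does not specify how the memory data set is selected.

```python
import random


class ReplayMemory:
    """Fixed-size memory data set (assumption: filled by reservoir sampling)."""

    def __init__(self, capacity, seed=0):
        self.capacity = capacity
        self.samples = []
        self.n_seen = 0  # total number of samples offered to the memory
        self._rng = random.Random(seed)

    def add(self, sample):
        """Keep each sample seen so far with equal probability capacity / n_seen."""
        self.n_seen += 1
        if len(self.samples) < self.capacity:
            self.samples.append(sample)
        else:
            j = self._rng.randrange(self.n_seen)
            if j < self.capacity:
                self.samples[j] = sample

    def mixed_batch(self, new_batch, n_replay):
        """Return the new batch plus up to n_replay samples drawn from memory."""
        k = min(n_replay, len(self.samples))
        return list(new_batch) + self._rng.sample(self.samples, k)


def continual_update(memory, stream, train_step, n_replay=4):
    """Train on each incoming batch mixed with replayed samples, then store it.

    train_step is a hypothetical callback that performs one model update
    on a list of samples (e.g. configurations with reference energies).
    """
    for new_batch in stream:
        train_step(memory.mixed_batch(new_batch, n_replay))
        for sample in new_batch:
            memory.add(sample)
```

Rehearsing a small memory alongside each new batch avoids retraining from scratch on the full combined data set, at the price of storing only a subsample of earlier data.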
Supplementary materials
Supplemental Information: Tables S1-S5 and Figures S1-S6