Abstract
The polarisable machine-learned force field FFLUX requires pre-trained anisotropic Gaussian process regression (GPR) models of atomic energies and multipole moments to propagate unbiased molecular dynamics simulations. The outcome of FFLUX simulations is highly dependent on the predictive accuracy of the underlying models, whose training entails determining the optimal set of model hyperparameters. Unfortunately, traditional direct learning (DL) procedures do not scale well on this task, especially when the hyperparameter search is initiated from a (set of) random guess solution(s). Additionally, the complexity of the hyperparameter space (HS) increases with the number of geometrical input features, at least for anisotropic kernels, making the optimization of hyperparameters even more challenging. In this study, we propose a transfer learning (TL) protocol that accelerates the training of anisotropic GPR models by facilitating access to promising regions of the HS. The protocol is based on a seeding-relaxation mechanism in which an excellent guess solution is identified by rapidly building one or several small source models over a subset of the target training set, before readjusting this guess over the entire set. We demonstrate the protocol by building and assessing DL and TL models of atomic energies and charges in various conformations of benzene, ethanol, the formic acid dimer and the drug fomepizole. Our experiments suggest that TL models can be built one order of magnitude faster while preserving the quality of their DL analogs. Most importantly, when deployed in FFLUX simulations, TL models compete with or even outperform their DL analogs in FFLUX geometry optimizations and in the computation of harmonic vibrational modes.
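As a loose illustration of the seeding-relaxation mechanism described above, the sketch below mimics the two-stage procedure with scikit-learn's GPR and an anisotropic (per-feature) RBF kernel. It is not the FFLUX implementation used in the paper: the data, subset size and optimizer settings are placeholder assumptions, chosen only to show how hyperparameters optimized on a small source model can seed the optimization of the full target model.

```python
# Minimal sketch of the seeding-relaxation transfer of hyperparameters,
# using scikit-learn's GPR as a stand-in (not the paper's FFLUX tooling).
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF, ConstantKernel

rng = np.random.default_rng(0)
X = rng.normal(size=(2000, 6))            # placeholder geometrical features
y = np.sin(X[:, 0]) + 0.1 * X[:, 1] ** 2  # placeholder atomic property

# 1) Seeding: fit a small source model on a subset of the target training set.
#    Its optimized anisotropic length scales act as the guess solution.
subset = rng.choice(len(X), size=200, replace=False)
seed_kernel = ConstantKernel() * RBF(length_scale=np.ones(X.shape[1]))
source = GaussianProcessRegressor(kernel=seed_kernel, n_restarts_optimizer=5)
source.fit(X[subset], y[subset])

# 2) Relaxation: re-optimize over the entire training set, starting from the
#    source model's hyperparameters instead of a random guess.
target = GaussianProcessRegressor(kernel=source.kernel_, n_restarts_optimizer=0)
target.fit(X, y)
```

Starting the full optimization from the seeded kernel, rather than from random restarts, is what shortens the expensive hyperparameter search on the large target set; this is the mechanism behind the speedups reported in the abstract.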
Supplementary materials
Title
Transfer learning of hyperparameters for fast construction of anisotropic GPR models: design and application to the machine-learned force field FFLUX
Description
Table of contents
1. Multisource (MS) TL of hyperparameters
2. Data cleaning
3. Why choose the IHCOV approach with an anisotropic kernel?
4. Molecular learning (S-)curves of DL models
5. Training timings of DL models
6. Performance and speedup factors of baseline and FS-TL models
7. Performance and speedup factors of relaxed TL models
8. Prediction of vibrational normal modes
9. Electronic energy ranges within the set of optimization starting geometries