Abstract
Classical molecular dynamics (MD) simulations are a widely used and powerful tool for materials modeling
and design. The predictive power of MD hinges on the ability of the interatomic potential to capture the underlying
physics and chemistry. Decades of seminal work have gone into developing interatomic potentials, albeit with
a focus predominantly on capturing the properties of bulk materials. Such physics-based models, while extensively
deployed for predicting the dynamics and properties of nanoscale systems over the past two decades, tend to perform poorly
in predicting the nanoscale potential energy surface when compared to high-fidelity first-principles calculations. These
limitations stem from the lack of flexibility in such models, which rely on a pre-defined functional form. Machine
learning (ML) models have emerged as a viable alternative for capturing the diverse size-dependent cluster
geometries, nanoscale dynamics, and the complex nanoscale potential energy surfaces (PES), without sacrificing the
bulk properties. Here, we introduce an ML workflow that combines transfer and active learning strategies to develop
high-dimensional neural networks (NNs) that capture cluster and bulk properties of several transition metals
with applications in catalysis, microelectronics, and energy storage, to name a few. Our NN first learns the bulk PES
from the high-quality physics-based models in the literature and subsequently augments this learning via retraining with a
higher-fidelity first-principles training dataset to concurrently capture both the nanoscale and bulk PES. Our workflow
departs from the status quo in its ability to learn from a sparsely sampled dataset that nonetheless covers a diverse range
of cluster configurations, from near-equilibrium to highly non-equilibrium, and in its learning strategies, which iteratively
improve the fingerprinting depending on model fidelity. All the developed models are rigorously tested against an
extensive first-principles dataset of energies and forces of cluster configurations as well as several properties of bulk
configurations for 10 different transition metals. Our approach is material-agnostic and provides a methodology to
transfer the learnings from decades of seminal work in molecular simulations to a new generation of
ML-trained potentials and build upon them, accelerating materials discovery and design.
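
The two-stage idea described above (learn first from a physics-based model, then retrain on sparse higher-fidelity data) can be illustrated with a deliberately small sketch. It is not the workflow of this paper: the neural network is swapped for a linear model on Gaussian radial features so that each stage has a closed-form solution, a Lennard-Jones dimer curve stands in for the physics-based potential, and a Morse curve stands in for first-principles data. All function names, parameters, and sample points are hypothetical.

```python
import numpy as np

def features(r, centers, width=0.2):
    """Gaussian radial features of distance r (a hypothetical toy fingerprint)."""
    r = np.asarray(r, dtype=float)
    return np.exp(-((r[:, None] - centers[None, :]) ** 2) / (2.0 * width ** 2))

def lennard_jones(r, eps=1.0, sigma=1.0):
    """Stand-in for a physics-based potential (low-fidelity source)."""
    return 4.0 * eps * ((sigma / r) ** 12 - (sigma / r) ** 6)

def morse(r, depth=1.2, a=2.0, r0=1.1):
    """Stand-in for first-principles energies (high-fidelity target)."""
    return depth * ((1.0 - np.exp(-a * (r - r0))) ** 2 - 1.0)

def fit(X, y, lam, w_prior):
    """Minimise ||X w - y||^2 + lam * ||w - w_prior||^2 in closed form."""
    A = X.T @ X + lam * np.eye(X.shape[1])
    return w_prior + np.linalg.solve(A, X.T @ (y - X @ w_prior))

def mse(X, y, w):
    """Mean squared error of the linear model with weights w."""
    return float(np.mean((X @ w - y) ** 2))

centers = np.linspace(0.9, 2.5, 12)

# Stage 1: pretrain on densely sampled low-fidelity energies.
r_dense = np.linspace(0.95, 2.5, 200)
X_dense = features(r_dense, centers)
w_pre = fit(X_dense, lennard_jones(r_dense), lam=1e-3,
            w_prior=np.zeros(len(centers)))

# Stage 2: retrain on a sparse high-fidelity set, staying close to w_pre.
r_sparse = np.array([1.0, 1.1, 1.3, 1.6, 2.0, 2.4])
X_sparse = features(r_sparse, centers)
y_sparse = morse(r_sparse)
w_fine = fit(X_sparse, y_sparse, lam=1e-2, w_prior=w_pre)

err_before = mse(X_sparse, y_sparse, w_pre)
err_after = mse(X_sparse, y_sparse, w_fine)
print(f"high-fidelity MSE before retraining: {err_before:.4f}")
print(f"high-fidelity MSE after retraining:  {err_after:.4f}")
```

In this sketch the penalty `lam` in the second stage controls how far the retrained weights may move from the pretrained ones: a larger value retains more of the low-fidelity behaviour in regions the sparse set does not cover, while a smaller value fits the high-fidelity points more closely.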