Abstract
We present a data-efficient approach to training graph neural networks (GNNs) on density functional theory (DFT) data for accurate and transferable predictions of energetic and structural properties of refractory solid solution alloys in the niobium-tantalum-vanadium (Nb-Ta-V) chemical space. We first train the GNN model only on DFT data describing the refractory binary alloys niobium-tantalum (Nb-Ta), niobium-vanadium (Nb-V), and tantalum-vanadium (Ta-V) to predict formation enthalpy and root mean squared displacement. Once trained, the GNN predictions are tested on DFT data describing the refractory ternary alloys Nb-Ta-V. Unsurprisingly, direct transfer from binary to ternary alloys is not sufficiently accurate; however, augmenting the training set with only 1% of the available ternary data (uniformly distributed across the entire range of chemical compositions) significantly improves the quality of the GNN predictions. For comparison, we assess transferability in the opposite direction by training GNN models on ternary Nb-Ta-V data and making predictions on the binaries Nb-Ta, Nb-V, and Ta-V; this direction exhibits notably higher predictive errors. The proposed methodology, which favors transferability from lower-component to higher-component alloys, offers an efficient path towards avoiding the curse of dimensionality incurred when collecting DFT data for the discovery and design of multi-component disordered alloys.
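The augmentation step described above (adding about 1% of the ternary data, spread uniformly over chemical compositions) can be illustrated with a minimal sketch. This is one possible reading of "uniformly distributed", not the procedure used in this work: the function name `sample_uniform_over_compositions`, the rounding tolerance `decimals`, and the rule of keeping at least one sample per composition are all assumptions made for illustration.

```python
import numpy as np

def sample_uniform_over_compositions(compositions, fraction=0.01, decimals=3, seed=0):
    """Select roughly `fraction` of the samples from every distinct composition.

    compositions: (n_samples, 3) array of atomic fractions (Nb, Ta, V).
    Returns the sorted indices of the selected samples.
    Hypothetical helper: the grouping and rounding rules are assumptions.
    """
    rng = np.random.default_rng(seed)
    comps = np.round(np.asarray(compositions, dtype=float), decimals)
    # Group samples that share the same (rounded) composition.
    _, group_ids = np.unique(comps, axis=0, return_inverse=True)
    group_ids = group_ids.ravel()
    selected = []
    for g in np.unique(group_ids):
        members = np.flatnonzero(group_ids == g)
        # Keep at least one sample per composition so no region is left out.
        k = max(1, int(round(fraction * members.size)))
        selected.extend(rng.choice(members, size=k, replace=False))
    return np.sort(np.array(selected, dtype=int))

# Toy data (hypothetical): 3 ternary compositions, 200 configurations each.
toy = np.repeat(
    [[0.25, 0.25, 0.50], [0.50, 0.25, 0.25], [0.25, 0.50, 0.25]], 200, axis=0
)
idx = sample_uniform_over_compositions(toy, fraction=0.01)
```

The selected indices would then be appended to the binary training set. Because at least one sample is kept per composition, the effective fraction can exceed 1% when some compositions have few configurations.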