Abstract
Materials informatics uses data-driven approaches for the study and discovery of materials. Features, or descriptors, are crucial components of reliable and accurate machine-learning (ML) models. While general data can be acquired from public and commercial sources, features must be tailored to a specific application. Common featurizers are suitable for generic chemical problems but may not be ideal for solid-state materials. Here, we have assembled the Oliynyk property list for feature generation, which works well on limited datasets (50 to 1,000 training data points) in the solid-state materials domain. We applied Gaussian process regression (GPR) to extrapolate and predict missing values in the element property list. A complete property list allows researchers to use any method that requires an x-block without data gaps, such as support vector machines (SVM). To validate the updated property list and the features generated from it, we solved a classical crystallographic problem: classifying structure types. Similarly to the radius ratio rule (Pauling, 1929) and structure maps (Villars, 1983; Pettifor, 1984), we demonstrate how structure types with 1:3 stoichiometry can be classified using an SVM and an x-block based on the proposed list of properties. We validate the ML model experimentally by synthesizing a novel intermetallic, UCd3.
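To make the idea of generating an x-block row from an elemental property list concrete, the following is a minimal sketch, not the featurizer used in this work. The function name `featurize`, the choice of four statistics per property, and the two properties shown are assumptions made for illustration; the numerical values (atomic numbers and Pauling electronegativities of Cu and Au) are standard reference values, not entries copied from the Oliynyk property list.

```python
# Hypothetical, illustrative featurizer: one row of an x-block for a compound,
# built from per-element properties. Not the authors' implementation.

# Illustrative values only (atomic number, Pauling electronegativity);
# a real property list would contain many more properties and elements.
ELEMENT_PROPERTIES = {
    "Cu": {"atomic_number": 29, "pauling_en": 1.90},
    "Au": {"atomic_number": 79, "pauling_en": 2.54},
}


def featurize(composition):
    """Turn {"element": amount} into a flat dict of composition-based features.

    For every property, compute the stoichiometry-weighted mean, the range,
    the maximum, and the minimum over the constituent elements.
    """
    total = sum(composition.values())
    property_names = next(iter(ELEMENT_PROPERTIES.values())).keys()
    features = {}
    for name in property_names:
        values = [ELEMENT_PROPERTIES[el][name] for el in composition]
        weights = [amount / total for amount in composition.values()]
        features[f"{name}_mean"] = sum(w * v for w, v in zip(weights, values))
        features[f"{name}_range"] = max(values) - min(values)
        features[f"{name}_max"] = max(values)
        features[f"{name}_min"] = min(values)
    return features


# A compound with 1:3 stoichiometry (AuCu3) becomes a single feature row.
row = featurize({"Au": 1, "Cu": 3})
print(row)
```

Because every feature is computed from every constituent element, a single missing property value would leave a gap in the row, which is why a complete property list matters for methods such as SVM that cannot handle missing entries.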
Supplementary materials
Title
Supporting information
Description
A document containing GPR prediction results for six electronegativity scales, diffraction patterns and fits of the samples, and SVM-predicted crystal structures.
Title
Elemental property list
Description
A curated database containing physical properties of the elements with atomic numbers 1 to 86.