Abstract
Finding materials that perform well in a specific application is a major challenge in materials science, especially when the origin of good performance is poorly understood or not easily computable. Trial-and-error exploration is prohibitively expensive because the materials space is vast. A more practical approach is to search for new materials in the proximity of known compounds that possess the desired property. In such an approach, assessing the similarity of materials requires a fingerprint that is relevant to a material's performance. Typically, a material's structure is used as the fingerprint, but structural similarity often does not translate into similarity in properties. Electronic structure fingerprints, e.g., the density of states (DOS) or the electronic band structure, have been proposed as a better alternative; however, the computational cost of calculating them on the scale of 100,000 materials remains too high for rapid exploration. In this work, we developed ProDosNet, a graph convolutional network (GCN) that is trained on orbital-resolved and atom-resolved projected density of states (PDOS) data and can predict the electronic structure of materials at extremely low computational cost. With this model, we generated PDOS fingerprints for all compounds in the Materials Project database and clustered them by the similarity of their orbital-resolved PDOS. We demonstrate that these electronic fingerprints make it possible to find materials with similar electronic properties but drastically different structures for applications in photovoltaics, catalysis, and batteries.
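The idea of comparing materials by their PDOS fingerprints can be illustrated with a minimal sketch. It is not the ProDosNet pipeline: the cosine-similarity metric, the helper names `normalize` and `most_similar`, and the toy Gaussian-peak PDOS curves are all assumptions made for illustration, and a real workflow would use predicted PDOS on a common energy grid.

```python
import numpy as np

def normalize(fingerprints):
    """Scale each row (one material's flattened PDOS) to unit L2 norm."""
    norms = np.linalg.norm(fingerprints, axis=1, keepdims=True)
    return fingerprints / np.clip(norms, 1e-12, None)

def most_similar(fingerprints, query_index, top_k=3):
    """Rank the other materials by cosine similarity to the query (assumed metric)."""
    unit = normalize(np.asarray(fingerprints, dtype=float))
    scores = unit @ unit[query_index]
    scores[query_index] = -np.inf  # exclude the query itself from the ranking
    order = np.argsort(scores)[::-1][:top_k]
    return [(int(i), float(scores[i])) for i in order]

# Toy data (hypothetical): three materials, two orbital channels each,
# sampled on a shared energy grid in eV.
energies = np.linspace(-10.0, 10.0, 201)

def gaussian(center, width=0.5):
    return np.exp(-((energies - center) ** 2) / (2.0 * width ** 2))

pdos = np.array([
    [gaussian(-2.0), gaussian(1.0)],   # material 0
    [gaussian(-1.9), gaussian(1.1)],   # material 1: peaks close to material 0
    [gaussian(3.0), gaussian(-4.0)],   # material 2: peaks far from material 0
])

# Flatten (orbital, energy) into one fingerprint vector per material.
fingerprints = pdos.reshape(len(pdos), -1)
neighbors = most_similar(fingerprints, query_index=0, top_k=2)
print(neighbors)
```

In this toy case material 1 ranks ahead of material 2 because its peaks nearly overlap those of the query, regardless of any structural information. The same similarity matrix could feed a clustering algorithm, although the choice of metric and clustering method here is left open.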