Abstract
Metal-organic frameworks (MOFs) are a diverse class of porous materials composed of inorganic nodes joined by organic linkers. They are under investigation for a wide range of applications, including gas storage and separation, where they have already been commercialized. Given the labor-intensive nature of synthesizing and testing individual MOFs, high-throughput computational screening and machine learning (ML) methods are increasingly viewed as essential for facilitating MOF development. However, the structural fidelity of the “computation-ready” MOF databases used in such studies remains largely unquantified. We introduce MOSAEC, an algorithm that detects chemically invalid structures on the basis of metal oxidation states. MOSAEC was manually validated against ~16k MOF structures from the popular CoRE database and was found to flag erroneous structures with 95% accuracy. Systematic examination of 14 leading experimental and hypothetical MOF databases containing >1.9 million MOFs reveals concerning structural error rates, exceeding 40% in most cases.
Supplementary materials
Title
Details of Validation
Description
• Details of the MOSAEC algorithm and validation are provided.
• The manual validation sets used to check oxidation state accuracy, error sensitivity, and error flag accuracy are included.
• A complete list of structures flagged by MOSAEC as being problematic in each database screened in this work is included.