DigiMOF: A Database of MOF Synthesis Information Generated via Text Mining

Kristian Gubsch; Rosalee Bence; Lawson Glasby; Peyman Z. Moghadam

doi:10.26434/chemrxiv-2022-41t70

Materials Chemistry

18 May 2022, Version 1

Working Paper

This content is a preprint and has not undergone peer review at the time of posting.

Abstract

The vastness of materials space, particularly for metal-organic frameworks (MOFs), makes it difficult to identify promising materials for specific applications efficiently. Although high-throughput computational approaches, including machine learning, have been useful for rapid screening and rational design of MOFs, they tend to neglect descriptors related to synthesis. One way to improve the efficiency of MOF discovery is to mine published MOF papers for the materials informatics knowledge they contain. Here, by adapting the chemistry-aware natural language processing tool ChemDataExtractor (CDE), we generated an open-source database of MOFs focused on their synthetic properties: the DigiMOF database. Using the CDE web scraping package alongside the Cambridge Structural Database (CSD) MOF subset, we automatically downloaded 43,281 unique MOF journal articles, extracted 15,501 unique MOF materials, and text mined over 52,680 associated properties, including synthesis method, solvent, organic linker, metal precursor, and topology. This centralised, structured database reveals the MOF synthetic data embedded within thousands of MOF publications. The DigiMOF database and associated software are publicly available so that other researchers can analyse alternative MOF production pathways further and create additional parsers to search for other desirable properties.
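To give a rough sense of what text mining a synthesis property involves, the following is a minimal sketch in plain Python. It is not the ChemDataExtractor API and not the DigiMOF parsers: the vocabularies, the `SynthesisRecord` class, and the function names are hypothetical and chosen for illustration only. It simply matches a small keyword list against a synthesis paragraph.

```python
import re
from dataclasses import dataclass
from typing import List

# Hypothetical vocabularies for illustration only; not taken from DigiMOF.
SYNTHESIS_METHODS = ["solvothermal", "hydrothermal", "microwave",
                     "mechanochemical", "sonochemical"]
SOLVENTS = ["DMF", "DEF", "ethanol", "methanol", "water", "acetonitrile"]


@dataclass
class SynthesisRecord:
    """Toy container for properties found in one paragraph (assumed schema)."""
    methods: List[str]
    solvents: List[str]


def find_terms(text: str, vocabulary: List[str]) -> List[str]:
    """Return vocabulary terms that occur as whole words in text, ignoring case."""
    found = []
    for term in vocabulary:
        pattern = r"\b" + re.escape(term) + r"\b"
        if re.search(pattern, text, flags=re.IGNORECASE):
            found.append(term)
    return found


def extract_synthesis_record(paragraph: str) -> SynthesisRecord:
    """Build a record by keyword matching; a real parser would be far richer."""
    return SynthesisRecord(
        methods=find_terms(paragraph, SYNTHESIS_METHODS),
        solvents=find_terms(paragraph, SOLVENTS),
    )


if __name__ == "__main__":
    example = ("The MOF was prepared by a solvothermal reaction of zinc nitrate "
               "and terephthalic acid in DMF and ethanol.")
    print(extract_synthesis_record(example))
```

Keyword matching of this kind misses synonyms, abbreviations, and context (for example, a solvent used only for washing), which is the sort of limitation a chemistry-aware NLP toolkit is intended to address.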

Keywords

MOFs

text mining

synthesis data

digital manufacturing

database

Now Published

DigiMOF: A Database of Metal–Organic Framework Synthesis Information Generated via Text Mining

Lawson T. Glasby, Kristian Gubsch, Rosalee Bence, Rama Oktavian, Kesler Isoko, Seyed Mohamad Moosavi, Joan L. Cordiner, Jason C. Cole, Peyman Z. Moghadam

Journal article

Chemistry of Materials, Volume 35, Issue 11

Online publication date: May 18, 2023

Version History

May 18, 2022 Version 1

Metrics

Views: 2,498

Downloads: 944

License

The content is available under CC BY NC 4.0

Author’s competing interest statement

The author(s) have declared they have no conflict of interest with regard to this content

Ethics

The author(s) have declared ethics committee/IRB approval is not relevant to this content
