(Semi-) Automatic Review Process for Common Compound Characterization Data in Organic Synthesis

Yu-Chieh Huang; Pierre Tremouilhac; Stefan Kuhn; Pei-Chi Huang; Chia-Lin Lin; Nils Schlörer; Oskar Taubert; Markus Götz; Nicole Jung; Stefan Bräse

doi:10.26434/chemrxiv-2024-1r9tb

Analytical Chemistry


28 February 2024, Version 1

This is not the most recent version. There is a newer version of this content available.

Working Paper

This content is a preprint and has not undergone peer review at the time of posting.

Abstract

A method for data review in the chemical sciences is described, with a focus on data for the characterization of synthetic molecules. As current procedures for data curation in chemistry rely almost exclusively on manual checking or peer review, a (semi-)automatic procedure for the evaluation of data assigned to molecular structures is proposed and demonstrated. The information usually required for the identification of isolated compounds is used to determine whether the data are complete with respect to the available data types and metadata, whether they are consistent with the proposed structure, and whether they are plausible in comparison to simulated data. Spectrum prediction and automatic signal comparison are applied to NMR evaluation, mass spectrometry data are evaluated by signal extraction, and machine learning is used for IR analysis. The proposed protocol shows how integrating different data analysis tools can help to overcome the challenges of the current, purely manual review and curation of data in synthetic chemistry.
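As a rough illustration of the automatic signal comparison mentioned for NMR evaluation, the following minimal Python sketch pairs predicted chemical shifts with observed ones within a tolerance. The function names `match_shifts` and `match_fraction`, the greedy nearest-neighbour matching, the tolerance value, and the example shifts are all assumptions made for illustration; they are not taken from the preprint and should not be read as the authors' implementation.

```python
from typing import List, Tuple


def match_shifts(
    observed: List[float],
    predicted: List[float],
    tolerance: float = 2.0,
) -> Tuple[List[Tuple[float, float]], List[float]]:
    """Greedily pair each predicted shift (ppm) with the nearest unused observed shift.

    Returns the matched (predicted, observed) pairs and the predicted shifts
    that found no observed partner within `tolerance`.
    """
    remaining = sorted(observed)          # observed signals not yet assigned
    matched: List[Tuple[float, float]] = []
    unmatched: List[float] = []
    for p in sorted(predicted):
        if not remaining:
            unmatched.append(p)
            continue
        nearest = min(remaining, key=lambda o: abs(o - p))
        if abs(nearest - p) <= tolerance:
            matched.append((p, nearest))
            remaining.remove(nearest)     # each observed signal is used at most once
        else:
            unmatched.append(p)
    return matched, unmatched


def match_fraction(
    observed: List[float],
    predicted: List[float],
    tolerance: float = 2.0,
) -> float:
    """Fraction of predicted signals with an observed counterpart (0.0 to 1.0)."""
    if not predicted:
        return 0.0
    matched, _ = match_shifts(observed, predicted, tolerance)
    return len(matched) / len(predicted)


# Hypothetical 13C shifts in ppm, invented for this example.
predicted_shifts = [14.2, 60.5, 171.0]
observed_shifts = [14.1, 60.9, 128.3]
pairs, missing = match_shifts(observed_shifts, predicted_shifts)
score = match_fraction(observed_shifts, predicted_shifts)
```

In this invented example two of the three predicted signals find an observed partner, so a reviewer would be pointed to the unmatched signal rather than having to inspect the whole spectrum. A greedy match is the simplest choice here; an optimal assignment or intensity-aware scoring would be a natural refinement.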

Keywords

data curation

repositories

electronic lab notebooks

chemistry data

analytics

Supplementary materials

Title: Supplemental information Part 1

Description: Supplemental material on technical details and review summary for 110 selected datasets

Version History

Jan 02, 2025 Version 2

Feb 28, 2024 Version 1

Metrics

Views: 1,038

Downloads: 491

License

The content is available under CC BY 4.0

Funding

Deutsche Forschungsgemeinschaft

BR1750/34-1

Author’s competing interest statement

The author(s) have declared they have no conflict of interest with regard to this content

Ethics

The author(s) have declared ethics committee/IRB approval is not relevant to this content
