Open-Source Chromatographic Data Analysis for Reaction Optimization and Screening

05 September 2022, Version 1

Abstract

Automation and digitalization solutions for small molecule synthesis face new challenges in chemical reaction analysis, especially in high-performance liquid chromatography (HPLC). Chromatographic data remains locked in vendors’ hardware and software components, which limits its potential in automated workflows and contradicts the FAIR data principles (findability, accessibility, interoperability, reuse) that enable chemometrics and data science applications. In this work, we present an open-source Python project called MOCCA (Multivariate Online Contextual Chromatographic Analysis) for the analysis of open-format HPLC–DAD (photodiode array detector) raw data. MOCCA provides a comprehensive set of data analysis features, including a peak deconvolution routine that allows automated deconvolution of known signals even when they overlap with signals of unexpected impurities or side products. We highlight the broad applicability of MOCCA in four studies: (i) a simulation study to validate MOCCA’s data analysis features; (ii) a reaction kinetics study on a Knoevenagel condensation reaction demonstrating MOCCA’s peak deconvolution feature; (iii) a closed-loop optimization study for the alkylation of 2-pyridone highlighting MOCCA’s potential to obviate the need for human control during data analysis; (iv) a well plate screening of categorical reaction parameters for a novel palladium-catalyzed cyanation of aryl halides employing O-protected cyanohydrins, in which MOCCA tracks all known and unknown signals. These studies emphasize how MOCCA enables its users to make data-based decisions in synthesis workflows with different degrees of automation by providing actionable analytics. By publishing MOCCA as a Python package together with this work, we envision an open-source community project for chromatographic data analysis with the potential to further advance its scope and capabilities.
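To make the idea of deconvolving overlapping HPLC–DAD signals concrete, the following is a minimal sketch under strong simplifying assumptions. It is not MOCCA's routine and does not use the MOCCA API: it assumes a noise-free bilinear model (data = elution profiles × spectra), assumes that the pure UV spectra of both co-eluting components are known, and uses plain classical least squares from NumPy. All names, axes, and peak parameters are hypothetical and chosen only for illustration.

```python
import numpy as np


def gaussian(x, center, width):
    """Unit-height Gaussian used for synthetic spectra and elution peaks."""
    return np.exp(-0.5 * ((x - center) / width) ** 2)


# Hypothetical axes: retention time (min) and DAD wavelength (nm).
time = np.linspace(0.0, 2.0, 201)
wavelength = np.linspace(200.0, 400.0, 101)

# Assumed-known pure UV spectra, one column per component.
spectra = np.column_stack([
    gaussian(wavelength, 260.0, 20.0),
    gaussian(wavelength, 310.0, 25.0),
])

# Synthetic "true" elution profiles of two strongly overlapping peaks.
profiles = np.column_stack([
    1.0 * gaussian(time, 1.00, 0.08),
    0.4 * gaussian(time, 1.10, 0.08),
])

# Bilinear model: data[i, j] is the absorbance at time i and wavelength j.
data = profiles @ spectra.T

# Classical least squares: solve spectra @ estimated.T = data.T for the
# elution profiles, given the known spectra.
solution, *_ = np.linalg.lstsq(spectra, data.T, rcond=None)
estimated = solution.T

# Peak areas by a simple rectangle rule on the uniform time grid.
dt = time[1] - time[0]
areas = estimated.sum(axis=0) * dt
print("area ratio (component 2 / component 1):", areas[1] / areas[0])
```

In this idealized setting the overlapping profiles are separated because the two spectra differ. The case described in the abstract, where the spectrum of an unexpected impurity is not known in advance, needs a more capable method than this sketch.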

Keywords

high-performance liquid chromatography (HPLC)
Python
peak deconvolution
reaction kinetics
closed-loop optimization
palladium-catalyzed cyanation of aryl halides

Supplementary materials

Title: Supplementary Information
Description: Additional details on all presented case studies, a description of how to extract HPLC–DAD raw data from the control software of major vendors, technical details of MOCCA’s data analysis features, and NMR spectra of O-protected cyanohydrins.

Title: Examples of MOCCA reports in HTML format
Description: MOCCA reports for the data analysis of the well plate screening (cyanation of aryl halides).
