Data-Driven Prediction of Enantioselectivity for the Sharpless Asymmetric Dihydroxylation: Model Development and Experimental Validation

Blake Ocampo; Bilal Altundas; Matthew Bock; Sara Feiz; Scott Denmark

doi:10.26434/chemrxiv-2025-zp7rn

Organic Chemistry

24 April 2025, Version 1

Working Paper

This content is a preprint and has not undergone peer review at the time of posting.

Abstract

The Sharpless asymmetric dihydroxylation remains a key transformation in chemical synthesis, yet its success hides unexpected cases of lower selectivity. A chemoinformatic workflow was developed to allow data-driven analysis of the reaction. A database of 1007 reactions employing AD-mix α and β was curated from the literature, and an alignment-dependent, fragment-based featurization of alkenes was implemented for modeling. This platform converged on machine learning models capable of predicting the magnitude of enantioselectivity for multiple alkene classes, achieving Q²F3 values ≥ 0.8, test r² values ≥ 0.7, and mean absolute errors (MAE) ≤ 0.3 kcal/mol. The features of alkenes contributing to model performance were assessed with SHapley Additive exPlanations (SHAP) analysis to gather insight into factors underlying predictions. Experimental validation demonstrated that the models could achieve meaningful predictions on numerous out-of-sample alkenes.
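Because the abstract reports errors in kcal/mol, it may help to see how an enantiomeric excess maps onto a free-energy difference and how the quoted metrics are commonly computed. The following is a minimal sketch, not the authors' code. It assumes the usual relation ΔΔG‡ = RT ln(er), a reaction temperature of 0 °C (an illustrative default, not a value stated in the abstract), and the common external-validation definition of Q²F3, in which the mean squared test error is scaled by the variance of the training responses. The function names `ee_to_ddg`, `mae`, and `q2_f3` are hypothetical.

```python
import math

R_KCAL = 1.987204e-3  # gas constant in kcal/(mol*K)


def ee_to_ddg(ee, temperature_k=273.15):
    """Convert a fractional ee (0 <= ee < 1) to a free-energy difference in kcal/mol.

    The default temperature (0 degrees C) is an assumption for illustration,
    not a value taken from the preprint.
    """
    if not 0.0 <= ee < 1.0:
        raise ValueError("ee must be a fraction in [0, 1)")
    er = (1.0 + ee) / (1.0 - ee)  # enantiomeric ratio, major/minor
    return R_KCAL * temperature_k * math.log(er)


def mae(y_true, y_pred):
    """Mean absolute error between observed and predicted values."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def q2_f3(y_train, y_test, y_pred):
    """Q2_F3 under the common definition (assumed here):

    1 - (test squared error per test sample)
        / (training sum of squares about the training mean per training sample)
    """
    mean_train = sum(y_train) / len(y_train)
    press = sum((t - p) ** 2 for t, p in zip(y_test, y_pred)) / len(y_test)
    tss = sum((y - mean_train) ** 2 for y in y_train) / len(y_train)
    return 1.0 - press / tss
```

Under these assumptions, `ee_to_ddg(0.90)` evaluates to roughly 1.6 kcal/mol, so an MAE of 0.3 kcal/mol is a small fraction of the energy difference behind a highly selective reaction.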

Version History

Apr 24, 2025 Version 1

License

The content is available under CC BY NC ND 4.0

Funding

National Science Foundation

NSF CHE 2154237

National Science Foundation

NSF CHE 2019897

Author’s competing interest statement

The author(s) have declared they have no conflict of interest with regard to this content

Ethics

The author(s) have declared ethics committee/IRB approval is not relevant to this content
