Learning Advance: Robotics-LLM Guided Hypotheses Generation for the Discovery of Chemical Knowledge

TianZhixi Yin; Ruozhu Feng; Jie Bao; Peiyuan Gao; Yangang Liang; Job Heather; Alan Aspuru-Guzik; Wei Wang

doi:10.26434/chemrxiv-2025-n1b4l

Materials Chemistry

02 April 2025, Version 1

Working Paper

This content is a preprint and has not undergone peer review at the time of posting.

Abstract

We present "Learning Advance", a framework for generating and validating hypotheses to discover chemical knowledge, demonstrated here on optimizing solubility in amphiphile/water systems. The workflow begins with an initial hypothesis: that incorporating common hydrotropic additives, such as sugars or urea, raises solubility limits. To test this hypothesis, we use grid search and Latin hypercube sampling to design experimental combinations of additive weight percentages. High-throughput robotic systems automate the experiments, and a YOLO-based image analysis workflow determines the degree of solubilization. The experimental data are transformed into a chemical feature space to train a Gaussian Process Regression (GPR) model, which drives a Bayesian optimization (BO) algorithm that identifies optimal additive combinations. When BO plateaus, the "Learning Advance" approach passes all accumulated data to AI analysis: we extract correlations between the target property and the chemical features, which enables LLM tools to generate a new hypothesis from the observed data. This hypothesis is then validated experimentally, creating a continuous cycle of discovery. The framework demonstrates how integrating BO with AI-driven hypothesis generation enables breakthroughs beyond conventional optimization limits, establishing a promising approach for advancing scientific knowledge discovery in materials science and chemistry.
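The sampling and optimization steps named in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the additive bounds, the kernel length scale, the noise level, and the `toy_response` function are all assumptions made up for illustration, and the surrogate is fitted directly on weight percentages rather than on the chemical feature space the abstract describes. The sketch draws a Latin hypercube design, fits a small Gaussian process with an RBF kernel, and picks the next candidate by expected improvement, using only the standard library.

```python
import math
import random
from statistics import NormalDist


def latin_hypercube(n_samples, bounds, rng):
    """Each dimension is cut into n_samples strata; every stratum is used once."""
    columns = []
    for low, high in bounds:
        strata = list(range(n_samples))
        rng.shuffle(strata)
        columns.append([low + (high - low) * (s + rng.random()) / n_samples
                        for s in strata])
    return [list(point) for point in zip(*columns)]


def rbf(a, b, length_scale):
    """Squared-exponential kernel with unit prior variance."""
    sq_dist = sum((x - y) ** 2 for x, y in zip(a, b))
    return math.exp(-0.5 * sq_dist / length_scale ** 2)


def cholesky(matrix):
    """Lower-triangular factor L with L L^T = matrix."""
    n = len(matrix)
    lower = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = matrix[i][j] - sum(lower[i][k] * lower[j][k] for k in range(j))
            lower[i][j] = math.sqrt(s) if i == j else s / lower[j][j]
    return lower


def solve_lower(lower, b):
    """Forward substitution for L y = b."""
    y = []
    for i, row in enumerate(lower):
        y.append((b[i] - sum(row[k] * y[k] for k in range(i))) / row[i])
    return y


def solve_upper(lower, y):
    """Back substitution for L^T x = y."""
    n = len(lower)
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(lower[k][i] * x[k] for k in range(i + 1, n))
        x[i] = (y[i] - tail) / lower[i][i]
    return x


def fit_gp(X, y, length_scale=5.0, noise=1e-6):
    """Fit a GP on standardized targets; hyperparameters are assumed, not tuned."""
    y_mean = sum(y) / len(y)
    y_std = math.sqrt(sum((v - y_mean) ** 2 for v in y) / len(y)) or 1.0
    K = [[rbf(a, b, length_scale) + (noise if i == j else 0.0)
          for j, b in enumerate(X)] for i, a in enumerate(X)]
    L = cholesky(K)
    alpha = solve_upper(L, solve_lower(L, [(v - y_mean) / y_std for v in y]))

    def predict(x_new):
        """Posterior mean and standard deviation in the original units."""
        k_star = [rbf(x, x_new, length_scale) for x in X]
        mean = y_mean + y_std * sum(k * a for k, a in zip(k_star, alpha))
        v = solve_lower(L, k_star)
        var = max(1.0 - sum(t * t for t in v), 1e-12)
        return mean, y_std * math.sqrt(var)

    return predict


def expected_improvement(mean, std, best):
    """Expected improvement over the best observed value (maximization)."""
    z = (mean - best) / std
    normal = NormalDist()
    return (mean - best) * normal.cdf(z) + std * normal.pdf(z)


def propose_next(predict, best, bounds, rng, n_candidates=200):
    """Score a fresh Latin hypercube of candidates and return the best by EI."""
    candidates = latin_hypercube(n_candidates, bounds, rng)
    return max(candidates, key=lambda c: expected_improvement(*predict(c), best))


def toy_response(x):
    """Made-up stand-in for a measured solubility score; NOT data from the paper."""
    return -((x[0] - 12.0) ** 2 + (x[1] - 5.0) ** 2)


if __name__ == "__main__":
    rng = random.Random(0)
    bounds = [(0.0, 20.0), (0.0, 10.0)]  # hypothetical wt% ranges for two additives
    design = latin_hypercube(8, bounds, rng)
    scores = [toy_response(p) for p in design]
    predict = fit_gp(design, scores)
    print("next candidate:", propose_next(predict, max(scores), bounds, rng))
```

In a loop, the proposed candidate would be measured, appended to the design, and the surrogate refitted; a plateau could then be declared when the best score stops improving over several rounds, although the abstract does not state the criterion used.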

License

The content is available under CC BY-NC-ND 4.0.

Funding

U.S. Department of Energy, Office of Science, Basic Energy Sciences

Energy Storage Research Alliance

Pacific Northwest National Laboratory

Energy Storage Materials Initiative

Author’s competing interest statement

The author(s) have declared they have no conflict of interest with regard to this content

Ethics

The author(s) have declared ethics committee/IRB approval is not relevant to this content
