Abstract
We evaluate the effectiveness of pre-trained and fine-tuned large language models (LLMs) for predicting the synthesizability of inorganic compounds and for selecting the precursors needed to perform inorganic synthesis. The predictions of fine-tuned LLMs are comparable to—and sometimes better than—those of recent bespoke machine learning models for these tasks, yet require only minimal user expertise, cost, and time to develop. This strategy can therefore serve both as an effective, strong baseline for future machine learning studies of various chemical applications and as a practical tool for experimental chemists.
Supplementary materials
Title
Supporting Information
Description
Description of data preparation. Plots of the distribution of the number of unique reactions and the number of precursors. Description of model construction and training. LLM prompts. Description of evaluation metrics. Tables of model performance for the synthesizability task. Description of methods and results for re-evaluating top-5 predictions using GPT-4, and code for associated statistical tests. Description of PU learning prompt modification experiments and table of results. Histogram of top-10 precursor occurrences. (PDF)