CrysText: A Generative AI Approach for Text-Conditioned Crystal Structure Generation using LLM

10 December 2024, Version 1
This content is a preprint and has not undergone peer review at the time of posting.

Abstract

Generating crystal structures directly from textual descriptions is a significant advance in materials informatics, offering a streamlined pathway from concept to discovery. Integrating generative models into Crystal Structure Prediction (CSP) offers an opportunity to improve both efficiency and innovation. While large language models (LLMs) excel at understanding and generating text, their potential in materials discovery remains largely unexplored. Here, we introduce CrysText, an approach for generating crystal structures from simple text prompts, conditioned on material composition and space group number. Using Llama-3.1-8B fine-tuned with Quantized Low-Rank Adaptation (QLoRA), our approach generates CIF-formatted structures directly from input descriptions, eliminating the need for post-processing while keeping fine-tuning efficient and inference fast. Evaluations on the MP-20 benchmark dataset demonstrate high structure match rates and favorable RMSE values, showing that the framework generates crystal structures that adhere to the specified compositions and crystal symmetries. By additionally conditioning on energy above the hull, we demonstrate the potential of CrysText to generate stable crystal structures. Our work highlights the role of LLMs in text-prompted inverse design for accelerating the discovery of new materials.
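
To make the conditioning described in the abstract concrete, the following is a minimal sketch of how a text prompt carrying composition, space group number, and an optional energy above the hull might be assembled. The function name `build_prompt`, the prompt wording, and the eV/atom unit are illustrative assumptions; the abstract does not specify the actual template used by CrysText.

```python
from typing import Optional


def build_prompt(formula: str, space_group: int,
                 e_above_hull: Optional[float] = None) -> str:
    """Assemble a text prompt from conditioning fields.

    Hypothetical template: the wording is an assumption, not the
    format used by the CrysText authors.
    """
    # Three-dimensional space group numbers run from 1 to 230.
    if not 1 <= space_group <= 230:
        raise ValueError(
            f"space group number must be in 1..230, got {space_group}"
        )
    parts = [
        f"Generate a crystal structure in CIF format for {formula}",
        f"with space group number {space_group}",
    ]
    # Optional stability condition; eV/atom is an assumed unit.
    if e_above_hull is not None:
        parts.append(f"and energy above the hull {e_above_hull:.3f} eV/atom")
    return " ".join(parts) + "."


if __name__ == "__main__":
    print(build_prompt("NaCl", 225))
    print(build_prompt("NaCl", 225, e_above_hull=0.0))
```

A string built this way would be passed to the fine-tuned model as its input description, with the model expected to return the CIF text. Validating the space group number before generation is a cheap guard against prompts that no structure could satisfy.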

Keywords

Crystal Structure Prediction (CSP)
Large Language Models (LLMs)
Quantized Low-Rank Adaptation (QLoRA)
