Generative Design of Functional Metal Complexes Utilizing the Internal Knowledge of Large Language Models

24 October 2024, Version 1
This content is a preprint and has not undergone peer review at the time of posting.

Abstract

The design of functional transition metal complexes (TMCs) is hindered by the combinatorial explosion of the search space spanned by various metals and ligands, necessitating efficient multi-objective optimization strategies. Traditional genetic algorithms (GAs) are frequently employed in this domain, utilizing random mutations and crossovers steered by explicit mathematical objective formulations to navigate the search space. The transfer and sharing of knowledge across different GA optimization tasks, however, remain challenging. Here, we introduce the integration of large language models (LLMs) into the evolutionary optimization framework (LLM-EO) for TMCs. LLM-EO significantly outperforms traditional GAs due to the intrinsic chemical knowledge embedded within LLMs, acquired during their extensive pretraining. Notably, without the need for supervised fine-tuning, LLMs can leverage the entirety of historical data amassed during the optimization processes, demonstrating superior performance compared to LLMs that are limited to the best TMCs identified in the evolutionary cycle. Specifically, LLM-EO identifies eight out of the top 20 TMCs with the largest HOMO-LUMO gaps by interrogating merely 200 candidates within a vast search space of 1.37 million TMCs. Through prompt engineering using natural language, LLM-EO introduces unparalleled flexibility in multi-objective optimizations, especially when guided by seasoned researchers, thereby circumventing the necessity for intricate mathematical formulations. As generative models, LLMs possess the capability to propose novel ligands and TMCs with unique chemical properties by amalgamating both internal knowledge and external chemistry data, thus combining the benefits of efficient optimization and molecular generation.
With the increasing potential of LLMs, both in their capacity as pretrained foundational models and new strategies in post-training inference, we anticipate broad applications of LLM-based evolutionary optimization in the fields of chemistry and materials design.
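
The loop described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the ligand labels, the toy scoring table standing in for a computed HOMO-LUMO gap, the prompt wording, and the `propose` callable (a random stub where an actual LLM call would go) are all assumptions made for illustration. The one feature it does mirror from the abstract is that every previously evaluated candidate, not only the current best, is written into the prompt for the next generation.

```python
import random
from typing import Callable, Dict, List, Tuple

# Hypothetical encoding: a TMC is a tuple of four ligand labels.
Candidate = Tuple[str, ...]

LIGANDS = ["L1", "L2", "L3", "L4", "L5"]  # placeholder ligand pool
TOY_SCORE = {"L1": 0.2, "L2": 0.5, "L3": 0.9, "L4": 1.3, "L5": 1.8}


def toy_gap(candidate: Candidate) -> float:
    """Stand-in for an expensive property evaluation (assumption)."""
    return sum(TOY_SCORE[ligand] for ligand in candidate)


def build_prompt(history: Dict[Candidate, float], n_new: int) -> str:
    """Serialize the full evaluation history, best first, into a text prompt."""
    ranked = sorted(history.items(), key=lambda item: -item[1])
    lines = [f"{','.join(c)} -> {score:.3f}" for c, score in ranked]
    return (
        "Previously evaluated complexes (ligands -> gap):\n"
        + "\n".join(lines)
        + f"\nPropose {n_new} new complexes with a larger gap."
    )


def make_stub_proposer(seed: int) -> Callable[[str, int], List[Candidate]]:
    """Return a proposer that ignores the prompt and samples at random.

    In a real setup this is where the prompt would be sent to an LLM and
    its reply parsed into candidates.
    """
    rng = random.Random(seed)

    def propose(prompt: str, n_new: int) -> List[Candidate]:
        return [tuple(rng.choice(LIGANDS) for _ in range(4)) for _ in range(n_new)]

    return propose


def llm_eo(
    initial: List[Candidate],
    evaluate: Callable[[Candidate], float],
    propose: Callable[[str, int], List[Candidate]],
    n_generations: int,
    n_new: int,
) -> Dict[Candidate, float]:
    """Run the propose/evaluate loop and return every evaluated candidate."""
    history = {c: evaluate(c) for c in initial}
    for _ in range(n_generations):
        prompt = build_prompt(history, n_new)
        for candidate in propose(prompt, n_new):
            if candidate not in history:  # skip repeats to save evaluations
                history[candidate] = evaluate(candidate)
    return history
```

Calling `llm_eo(initial, toy_gap, make_stub_proposer(0), n_generations=3, n_new=4)` evaluates at most `len(initial) + 12` candidates. Because the stub samples at random, it shows only the control flow; any advantage over a genetic algorithm would have to come from the model that replaces it.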

Keywords

Large language models
Evolutionary optimization
Transition metal complex
