Directional Multiobjective Optimization of Metal Complexes at the Billion-Scale with the tmQMg-L Dataset and PL-MOGA Algorithm

Hannes Kneiding; Ainara Nova; David Balcells

doi:10.26434/chemrxiv-2023-k3tf2-v2

Theoretical and Computational Chemistry

25 September 2023, Version 2

Working Paper

This content is a preprint and has not undergone peer review at the time of posting.

Abstract

Transition metal complexes (TMCs) play a key role in several areas of high interest, including medicinal chemistry, renewable energies, and nanoporous materials. The development of TMCs enabling these technologies remains challenged by the need to optimize multiple properties within very large chemical spaces, in which the thirty transition metals can be combined with a virtually infinite number of ligands. In this work, we provide the open tmQMg-L dataset including 30K TMC ligands, which combines large chemical diversity with synthesizability. The charge and metal-coordination mode of the ligands were robustly defined with a novel algorithm based on graph and natural bond orbital theories. The tmQMg-L dataset was leveraged in the automated generation of 1.37M TMCs resulting from all possible combinations between a square planar palladium(II) scaffold and a pool of 50 different ligands. This TMC space was used to benchmark a multiobjective genetic algorithm (MOGA) that optimized two properties over a Pareto front, namely the polarizability (alpha) and the HOMO-LUMO gap (epsilon). The MOGA evolved 130 TMC hits with maximal (alpha, epsilon) values in a way that could be easily rationalized by analyzing the nature of the ligands selected. Instead of the traditional mutation and crossover of fragments within a single ligand, this MOGA implemented full-ligand genetic operations acting on all coordination sites, maximizing chemical diversity. Further, we extended this MOGA with the Pareto-Lighthouse functionality (PL-MOGA), which allows for controlling both the aim and scope of the multiobjective optimization over the Pareto front. In explicit spaces containing billions of TMCs, the PL-MOGA enabled the explainable generation of thousands of novel and highly diverse TMC hits. We believe that the combined use of the tmQMg-L dataset and PL-MOGA algorithm will facilitate the discovery of TMCs with optimal properties within untapped chemical spaces.
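The two ideas named in the abstract, selection over a Pareto front and genetic operations that swap whole ligands rather than ligand fragments, can be illustrated with a minimal Python sketch. This is not the PL-MOGA code: the encoding of a complex as a tuple of ligand identifiers (one per coordination site), the function names, and the uniform crossover scheme are all assumptions made for illustration, and the sketch ignores ligand charge, denticity, and the Pareto-Lighthouse control.

```python
import random
from typing import List, Sequence, Tuple

# Hypothetical encoding: one ligand identifier per coordination site.
Complex = Tuple[str, ...]


def dominates(a: Sequence[float], b: Sequence[float]) -> bool:
    """True if a is at least as good as b in every objective and strictly
    better in at least one (all objectives are maximized)."""
    return all(x >= y for x, y in zip(a, b)) and any(x > y for x, y in zip(a, b))


def pareto_front(points: Sequence[Sequence[float]]) -> List[int]:
    """Return the indices of the non-dominated points, e.g. (alpha, epsilon) pairs."""
    return [
        i
        for i, p in enumerate(points)
        if not any(dominates(q, p) for j, q in enumerate(points) if j != i)
    ]


def mutate(cx: Complex, pool: Sequence[str], rng: random.Random) -> Complex:
    """Full-ligand mutation: replace the entire ligand at one random site
    with a different ligand drawn from the pool."""
    site = rng.randrange(len(cx))
    choices = [lig for lig in pool if lig != cx[site]]
    child = list(cx)
    child[site] = rng.choice(choices)
    return tuple(child)


def crossover(a: Complex, b: Complex, rng: random.Random) -> Complex:
    """Full-ligand uniform crossover: each site inherits the whole ligand
    from one of the two parents."""
    return tuple(rng.choice(pair) for pair in zip(a, b))


if __name__ == "__main__":
    # Made-up (alpha, epsilon) values, for illustration only.
    scores = [(1.0, 5.0), (2.0, 4.0), (1.5, 3.0), (3.0, 1.0), (0.5, 0.5)]
    print(pareto_front(scores))  # [0, 1, 3]

    rng = random.Random(0)
    pool = ["L1", "L2", "L3", "L4", "L5"]  # placeholder ligand identifiers
    parent_a = ("L1", "L2", "L3", "L4")
    parent_b = ("L5", "L5", "L2", "L1")
    print(mutate(parent_a, pool, rng))
    print(crossover(parent_a, parent_b, rng))
```

Because every operation acts on a complete ligand, any child is still a combination of ligands from the pool, which is one way to read the abstract's point about preserving synthesizability while maximizing diversity.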

Keywords

evolutionary learning

multiobjective optimization

transition metal complex

chemical diversity

directional optimization

Supplementary materials

Supporting Information

The Supporting Information provides further details about the tmQMg-L dataset, the 1.37M chemical space, the PL-MOGA algorithm, the estimation of the chemical diversity with average Tanimoto coefficients, the DFT benchmark, repetitions from different random initial populations, additional information on the exploration of the implicit billion spaces, and general computational and chemoinformatics details.

Supplementary weblinks

tmQMg-L dataset: GitHub page of the tmQMg-L dataset

PL-MOGA code: GitHub page of the PL-MOGA code

Now Published

Directional multiobjective optimization of metal complexes at the billion-system scale

Hannes Kneiding, Ainara Nova, David Balcells

Journal article

Nature Computational Science, Volume 4, Issue 4

Online publication date: Mar 29, 2024

Version History

Sep 25, 2023 Version 2

Jun 21, 2023 Version 1

Version Notes

We added the Pareto-Lighthouse (PL) functionality to the original MOGA algorithm. The resulting PL-MOGA code allows for directional multiobjective optimization over the Pareto front, with fine control over both the aim and scope of the optimizer. Further, the PL-MOGA was applied to the exploration of vast chemical spaces containing billions of transition metal complexes.

Metrics

Views: 2,154

Downloads: 1,447

License

The content is available under CC BY NC ND 4.0

Author’s competing interest statement

The author(s) have declared they have no conflict of interest with regard to this content

Ethics

The author(s) have declared ethics committee/IRB approval is not relevant to this content
