Deep Learning Metal Complex Properties with Natural Quantum Graphs

28 June 2022, Version 1
This content is a preprint and has not undergone peer review at the time of posting.

Abstract

Deep graph learning can make a strong contribution to accelerating the discovery of transition metal complexes (TMCs). TMCs will play a key role in the development of urgently needed new technologies, e.g., the production of green hydrogen from renewable sources. Despite recent developments in machine learning for drug discovery and organic chemistry in general, the application of these methods to TMCs remains challenged by the higher complexity of these compounds and by the limited availability of the large and complex data required. In this work, we report a new graph representation for neural networks, the natural quantum graph (NatQG), which is based on natural bond orbitals (NBOs) and their second-order perturbation analysis (SOPA). NBOs and SOPA were combined to build two types of NatQG graphs, undirected (u-NatQG) and directed (d-NatQG), both of which carry quantum electronic structure information in their node, edge, and whole-graph attribute vectors. These graphs were used to develop graph neural networks (GNNs) based on modifications of the gated message passing neural network (MPNN) and the multiplex molecular graph neural network (MXMNet). For a number of molecular properties, including the HOMO-LUMO gap, polarizability, and dipole moment, the NatQG models performed at a level similar to or higher than that of state-of-the-art GNNs based on radius graphs (RGs). Further, we provide the transition metal quantum mechanics graph (tmQMg) dataset, including the quantum geometries, properties, and NatQG graphs of 60k TMCs, which can be used as a benchmark for graph-based machine learning models. We believe that this work makes a significant contribution to the further development of deep graph learning for TMCs.
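
To make the graph representation described in the abstract more concrete, the following is a minimal sketch of a container holding node, edge, and whole-graph attribute vectors, in an undirected and a directed variant. It is an illustration only, not the authors' implementation: the class name `AttributedGraph`, the helper `build_toy_graphs`, and all attribute values are hypothetical, and the actual NBO- and SOPA-derived features are not specified here.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple


@dataclass
class AttributedGraph:
    """Toy graph with node, edge, and whole-graph attribute vectors (hypothetical)."""
    directed: bool
    node_attrs: List[List[float]]   # one attribute vector per node
    graph_attrs: List[float]        # one attribute vector for the whole graph
    edge_attrs: Dict[Tuple[int, int], List[float]] = field(default_factory=dict)

    def add_edge(self, src: int, dst: int, attrs: List[float]) -> None:
        n = len(self.node_attrs)
        if not (0 <= src < n and 0 <= dst < n):
            raise IndexError("edge endpoint is not a known node")
        self.edge_attrs[(src, dst)] = list(attrs)
        if not self.directed:
            # An undirected edge is stored in both directions with the same attributes.
            self.edge_attrs[(dst, src)] = list(attrs)

    def neighbors(self, node: int) -> List[int]:
        """Nodes reachable from `node` along a stored edge."""
        return sorted(dst for (src, dst) in self.edge_attrs if src == node)


def build_toy_graphs() -> Tuple[AttributedGraph, AttributedGraph]:
    # Made-up attribute values; real features would come from a quantum chemistry calculation.
    node_attrs = [[26.0, 0.5], [7.0, -0.3], [6.0, 0.1]]
    graph_attrs = [0.0]
    undirected = AttributedGraph(False, node_attrs, graph_attrs)
    directed = AttributedGraph(True, node_attrs, graph_attrs)
    for graph in (undirected, directed):
        graph.add_edge(1, 0, [1.0])  # assumed direction: node 1 -> node 0
        graph.add_edge(2, 1, [0.5])
    return undirected, directed


u_graph, d_graph = build_toy_graphs()
print(len(u_graph.edge_attrs), len(d_graph.edge_attrs))
```

In this sketch the directed variant keeps only the stated direction of each edge, so a message passing scheme built on it would propagate information one way along that edge, whereas the undirected variant propagates it both ways.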

Supplementary materials

Title: Supporting Information
Description: Additional Figures, Tables, and Texts with further details on the tmQMg dataset and the GNN models based on the NatQG and RG graphs.
