Neural Mulliken Analysis: Molecular Graphs from Density Matrices for QSPR on Raw Quantum-Chemical Data

17 March 2025, Version 3
This content is a preprint and has not undergone peer review at the time of posting.

Abstract

This work introduces molecular graphs derived from the one-electron density matrix, as part of a broader effort to explore whether electronic-structure awareness allows a single model both to generalize better from small data and to learn better molecular encodings. Diagonal blocks of the density matrix serve as atomic node embeddings, while off-diagonal blocks provide embeddings for "link" nodes associated with atom pairs. In a minimal basis, these embeddings have dimensions of only 45 and 81, yet no information is lost and the original density matrix can be fully reconstructed. Blocks of the basis-set overlap matrix are used as edge embeddings to encode structural information and as weights for message aggregation. In addition, the element-wise multiplication performed during aggregation may provide access to electronic charges, analogous to Mulliken population analysis. The concept was evaluated on data from the Solubility Challenge (Llinàs et al., 2008). A graph neural network (GNN) trained on 94 drug-like molecules achieved improved solubility prediction accuracy (RMSE 0.63, R^2 0.79). Combined with existing techniques for predicting electron density from molecular structure, this approach is promising for a range of chemical machine-learning problems.
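The block decomposition and the Mulliken-style sum described in the abstract can be illustrated with a minimal sketch in plain Python. It is not the preprint's implementation and contains no GNN; all function names are hypothetical, and the toy H2 matrices (one basis function per atom, overlap s = 0.66) are illustrative values chosen for the example. The quoted dimensions 45 and 81 are consistent with nine basis functions per atom (a symmetric 9 x 9 block has 45 unique entries, a general one has 81), which is an inference from the numbers rather than something the abstract states.

```python
from itertools import accumulate


def block_offsets(sizes):
    """Index of the first basis function of each atom in the full matrix."""
    return [0] + list(accumulate(sizes))[:-1]


def get_block(M, off, sizes, a, b):
    """Sub-matrix of M coupling basis functions of atom a with those of atom b."""
    rows = range(off[a], off[a] + sizes[a])
    cols = range(off[b], off[b] + sizes[b])
    return [[M[i][j] for j in cols] for i in rows]


def to_graph(P, sizes):
    """Diagonal blocks become atom nodes; off-diagonal blocks (a < b) become link nodes."""
    off = block_offsets(sizes)
    n = len(sizes)
    atoms = {a: get_block(P, off, sizes, a, a) for a in range(n)}
    links = {(a, b): get_block(P, off, sizes, a, b)
             for a in range(n) for b in range(a + 1, n)}
    return atoms, links


def from_graph(atoms, links, sizes):
    """Rebuild the full symmetric matrix from atom and link blocks."""
    off = block_offsets(sizes)
    dim = sum(sizes)
    P = [[0.0] * dim for _ in range(dim)]
    for a, blk in atoms.items():
        for i, row in enumerate(blk):
            for j, v in enumerate(row):
                P[off[a] + i][off[a] + j] = v
    for (a, b), blk in links.items():
        for i, row in enumerate(blk):
            for j, v in enumerate(row):
                P[off[a] + i][off[b] + j] = v
                P[off[b] + j][off[a] + i] = v  # lower triangle by symmetry
    return P


def mulliken_populations(P, S, sizes):
    """Per-atom electron population: sum of the element-wise product P*S
    over the rows that belong to each atom (equals the atom's part of trace(PS))."""
    off = block_offsets(sizes)
    dim = sum(sizes)
    return [sum(P[i][j] * S[i][j]
                for i in range(off[a], off[a] + n) for j in range(dim))
            for a, n in enumerate(sizes)]


def embedding_sizes(n):
    """Unique entries of a symmetric n x n block and of a general n x n block."""
    return n * (n + 1) // 2, n * n


# Toy H2 in a minimal basis: doubly occupied bonding orbital, overlap s (assumed value).
s = 0.66
sizes = [1, 1]
S = [[1.0, s], [s, 1.0]]
P = [[1.0 / (1.0 + s)] * 2 for _ in range(2)]

atoms, links = to_graph(P, sizes)
populations = mulliken_populations(P, S, sizes)
print(populations)          # one population per atom; they sum to the electron count
print(embedding_sizes(9))   # sizes for an assumed nine-function block
```

In this toy case each hydrogen atom receives a population close to one electron, and rebuilding the matrix with from_graph returns the original P, which mirrors the abstract's claim that the graph representation loses no information. In the preprint the overlap blocks enter as edge embeddings and aggregation weights inside the network; here the same element-wise product is simply summed directly.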

Keywords

graph neural networks
solubility
DFT
neural networks
