Deciphering molecular embeddings with centered kernel alignment

Matthias Welsch; Steffen Hirte; Johannes Kirchmair

doi:10.26434/chemrxiv-2024-f8bnk-v2

Theoretical and Computational Chemistry

14 May 2024, Version 2

Working Paper

This content is a preprint and has not undergone peer review at the time of posting.

Abstract

The creation of effective models is of utmost importance in various scientific and engineering domains. However, analyzing such models, especially nonlinear ones, poses significant challenges. In this context, centered kernel alignment (CKA) has emerged as a promising model analysis tool that assesses the similarity between two embeddings. CKA's efficacy depends on the selection of a kernel that adequately captures the underlying properties of the compared models. To adapt CKA to cheminformatics, we examine the properties of the linear and random forest (RF) kernels with respect to multilayer perceptrons (MLPs) and RFs. Furthermore, we demonstrate the utility of CKA in cheminformatics in three case studies in which we (1) investigate why optimizing the radius of circular fingerprints beyond two bonds results in only minor changes in model performance, (2) analyze the dependence between physicochemical properties and the molecular representations induced by graph neural networks (GNNs) that use addition as the readout operation, and (3) compare different graph readout operations in GNNs.
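For readers unfamiliar with CKA, the following is a minimal sketch of the general technique with a linear kernel. It is not the authors' implementation (their source code is listed under Supplementary weblinks). The function names `center_gram`, `cka`, and `linear_cka`, as well as the random toy data, are assumptions made for illustration only. The sketch uses the common formulation in which two Gram matrices over the same samples are double-centered and their Frobenius inner product is normalized by their Frobenius norms.

```python
import numpy as np

def center_gram(K):
    """Double-center a Gram matrix: H K H with H = I - (1/n) * ones."""
    n = K.shape[0]
    H = np.eye(n) - np.ones((n, n)) / n
    return H @ K @ H

def cka(K, L):
    """CKA between two Gram matrices computed over the same n samples."""
    Kc, Lc = center_gram(K), center_gram(L)
    inner = np.sum(Kc * Lc)  # Frobenius inner product of the centered kernels
    # np.linalg.norm defaults to the Frobenius norm for 2-D arrays
    return inner / (np.linalg.norm(Kc) * np.linalg.norm(Lc))

def linear_cka(X, Y):
    """CKA with a linear kernel; rows of X and Y are the same samples."""
    return cka(X @ X.T, Y @ Y.T)

# Hypothetical toy embeddings: 50 samples, 8 and 16 dimensions.
rng = np.random.default_rng(0)
X = rng.normal(size=(50, 8))
Q, _ = np.linalg.qr(rng.normal(size=(8, 8)))  # random orthogonal matrix
Y_rotated = 3.0 * X @ Q                       # rotated and rescaled copy of X
Y_random = rng.normal(size=(50, 16))          # unrelated embedding

score_rotated = linear_cka(X, Y_rotated)
score_random = linear_cka(X, Y_random)
print(score_rotated, score_random)
```

In this toy setup, the rotated and rescaled copy of `X` should score essentially 1, because linear CKA is invariant to orthogonal transformations and isotropic scaling, while the unrelated embedding should score lower. Other kernels, such as an RF kernel, would only change how the Gram matrices passed to `cka` are built.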

Keywords

centered kernel alignment

random forest kernel

machine learning

graph neural networks

molecular representations

representational alignment

Supplementary materials

Supporting Information: The Supporting Information contains further details on model performance (PDF).

Supplementary weblinks

Source code: Source code of the approach presented in this work.

Now Published

Deciphering Molecular Embeddings with Centered Kernel Alignment

Matthias Welsch, Steffen Hirte, Johannes Kirchmair (journal article)

Journal of Chemical Information and Modeling

Online publication date: Sep 25, 2024

Version History

May 14, 2024 Version 2

May 14, 2024 Version 1

Version Notes

Fixed one typo in the funding information

Metrics

Views: 541

Downloads: 252

License

The content is available under CC BY 4.0

Funding

Austrian Federal Ministry of Labour and Economy

Austrian National Foundation for Research, Technology and Development

Christian Doppler Research Association

Boehringer-Ingelheim RCV GmbH & Co KG

BASF SE

Author’s competing interest statement

The author(s) have declared they have no conflict of interest with regard to this content

Ethics

The author(s) have declared ethics committee/IRB approval is not relevant to this content
