A Sample-Centric and Knowledge-Driven Computational Framework for Natural Products Drug Discovery

Arnaud Gaudry; Marco Pagni; Florence Mehl; Sébastien Moretti; Luis-Manuel Quiros-Guerrero; Luca Cappelletti; Adriano Rutz; Marcel Kaiser; Laurence Marcourt; Emerson Ferreira Queiroz; Jean-Robert Ioset; Antonio Grondin; Bruno David; Jean-Luc Wolfender; Pierre-Marie Allard

doi:10.26434/chemrxiv-2023-sljbt-v2

Organic Chemistry

12 December 2023, Version 2

Working Paper

This content is a preprint and has not undergone peer review at the time of posting.

Abstract

Modern natural products (NPs) research relies on untargeted metabolomics based on liquid chromatography coupled with mass spectrometry. Together with cutting-edge processing and computational annotation strategies, such approaches can yield extensive spectral and structural information. However, current processing workflows require feature-alignment steps based on retention time, which hinders the comparison of samples originating from different batches or analyzed using different instrumental setups. In addition, no analytical framework is currently available to efficiently match processed metabolomics data and associated metadata with external resources. To address these limitations, we present a new sample-centric and knowledge-driven framework that allows multi-modal data alignment (e.g., through chemical structures, biological activities, or spectral features) and demonstrate its value in exploring large and chemodiverse natural extract datasets. Here, the experimental data are processed at the sample level, matched with external identifiers when possible, semantically enriched, and integrated into a unified knowledge graph. The use of semantic web technology enables comparison of processed and standardized data, information, and knowledge at the repository scale. We demonstrate the utility of the developed framework, the Experimental Natural Products Knowledge Graph (ENPKG), by leveraging the results obtained from screening 1,600 plant extracts against trypanosomatids to streamline the identification of new antiparasitic compounds. Thanks to its versatility, the proposed approach opens up new ways of exploiting metabolomics data. Semantic web technologies are a fundamental asset, and we anticipate that their adoption will strongly complement current computational metabolomics pipelines and enable the community to advance the description of global chemodiversity and drug discovery projects.
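
The sample-centric idea described in the abstract can be illustrated with a minimal sketch. It is not the ENPKG schema or code: the triples, predicate names (`has_feature`, `annotated_as`, `active_against`), and identifiers below are all hypothetical, and a real deployment would presumably rely on RDF and a query language such as SPARQL rather than Python tuples. The sketch only shows how samples that were never aligned by retention time can still be compared through a shared structure annotation and a shared bioactivity result.

```python
from collections import defaultdict

# Hypothetical (subject, predicate, object) triples; every identifier is
# invented for illustration and does not come from the ENPKG vocabulary.
TRIPLES = [
    ("sample:A", "has_feature", "feature:A1"),
    ("sample:B", "has_feature", "feature:B7"),
    ("sample:C", "has_feature", "feature:C3"),
    ("feature:A1", "annotated_as", "structure:S1"),
    ("feature:B7", "annotated_as", "structure:S1"),
    ("feature:C3", "annotated_as", "structure:S2"),
    ("sample:A", "active_against", "target:T1"),
    ("sample:B", "active_against", "target:T1"),
]


def objects(triples, subject, predicate):
    """Return all objects linked to `subject` through `predicate`."""
    return {o for s, p, o in triples if s == subject and p == predicate}


def structures_by_sample(triples):
    """Map each sample to the structures annotated on its own features."""
    result = defaultdict(set)
    for s, p, o in triples:
        if p == "has_feature":
            result[s] |= objects(triples, o, "annotated_as")
    return dict(result)


def shared_structures_in_actives(triples, target):
    """Find structures annotated in more than one sample active on `target`.

    Samples are joined through structure identifiers, not retention time.
    """
    actives = {s for s, p, o in triples if p == "active_against" and o == target}
    by_sample = structures_by_sample(triples)
    samples_per_structure = defaultdict(set)
    for sample in actives:
        for structure in by_sample.get(sample, set()):
            samples_per_structure[structure].add(sample)
    return {
        structure: sorted(samples)
        for structure, samples in samples_per_structure.items()
        if len(samples) > 1
    }


if __name__ == "__main__":
    print(shared_structures_in_actives(TRIPLES, "target:T1"))
```

Because each sample carries its own features and annotations, adding a sample from another batch or instrument only means adding triples; no global realignment step is needed in this toy model.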

Keywords

natural products

knowledge graphs

drug discovery

computational metabolomics

mass spectrometry

chemodiversity

Supplementary weblinks

The ENPKG homepage

The ENPKG GraphDB instance

The ENPKG GitHub organization

Version History

Dec 12, 2023 Version 2

Jul 03, 2023 Version 1

Version Notes

Integration of a novel dataset of 337 medicinal plants profiled under different experimental conditions. Integration of spectral search links against experimental spectral libraries and public spectral repositories for each feature of the graph.

Metrics

Views: 2,590

Downloads: 1,749

License

The content is available under CC BY 4.0

Funding

swissuniversities

Swiss Open Research Data Grants (CHORD) in Open Science I

Swiss National Science Foundation

CRSII5_189921/1

Author’s competing interest statement

The author(s) have declared they have no conflict of interest with regard to this content

Ethics

The author(s) have declared ethics committee/IRB approval is not relevant to this content
