Exploring the Chemical Space of Antiparasitic Peptides and Discovery of New Promising Leads through a Novel Approach based on Network Science and Similarity Searching

Sebastián Ayala-Ruano; Yovani Marrero-Ponce; Longendri Aguilera-Mendoza; Noel Pérez; Guillermin Agüero-Chapin; Agostinho Antunes; Ana Cristina Aguilar

doi:10.26434/chemrxiv-2021-tgv69-v2

Biological and Medicinal Chemistry



11 February 2022, Version 2

Working Paper


This content is a preprint and has not undergone peer review at the time of posting.

Abstract

Antimicrobial peptides (AMPs) have emerged as promising compounds to treat a wide range of diseases. Their clinical potential lies in the wide range of mechanisms they can use both to kill microbes and to modulate immune responses. However, the vastness of the AMPs’ chemical space (AMPCS), represented by more than 10^65 unique sequences of 50 residues or fewer, has posed a major challenge for the discovery of new promising sequences and for the identification of common structural motifs and even relevant biological functions. Here, we present a new approach based on network science and similarity searching to discover new promising AMPs, specifically antiparasitic peptides (APPs). We exploited the network-based representation of the APPs’ chemical space (APPCS) to retrieve valuable information by using three types of networks: chemical space (CSN), half-space proximal (HSPN), and metadata (METN). In the network analysis, centrality measures were applied to identify the most important and non-redundant nodes/peptides in each network. These central peptides were then used as queries (Qs) in group fusion similarity-based searches against an existing collection of known bioactive compounds, stored in the graph database starPepDB, to discover new potential APPs. The performance of the resulting multi-query similarity-based search models (mQSSMs) was evaluated on five benchmarking data sets of APPs/non-APPs. The predictions of the best mQSSM showed strong to very strong performance, with external Matthews correlation coefficient (MCC) values ranging from 0.834 to 0.965. Outstanding MCC values (> 0.85) were attained on external data sets by the mQSSM with 219 Qs from both the CSN and HSPN networks, using 0.5 as the similarity threshold. The performance of our best mQSSM was then compared with the APP prediction servers AMPDiscover and AMPFun.
The proposed model showed its relevance by outperforming state-of-the-art machine learning models for APP prediction, with statistically significant differences. After applying the best mQSSM and additional filters to the non-APP space, 95 AMPs were repurposed as potential antiparasitic leads. Because of the high sequence diversity of these peptides, different computational approaches were applied to identify relevant motifs for searching and designing new promising APPs. These results support that network-based similarity searches identify APPs with high effectiveness and reliability. The proposed models and pipeline are freely available through the starPep toolbox software at http://mobiosd-hub.com/starpep.
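To make the search-and-evaluate idea in the abstract concrete, here is a minimal sketch of a multi-query similarity search scored with the MCC. It rests on several assumptions that are not taken from the preprint: the MAX rule is used as the group fusion rule (one common choice), Python's `difflib.SequenceMatcher` stands in for the peptide similarity measure actually used by starPep, and the sequences in the demo are made-up toy strings. The function names `similarity`, `predict`, and `mcc` are hypothetical and do not correspond to the starPep toolbox API.

```python
from difflib import SequenceMatcher
from math import sqrt


def similarity(a: str, b: str) -> float:
    """Stand-in similarity in [0, 1]; NOT the measure used by starPep."""
    return SequenceMatcher(None, a, b).ratio()


def max_fusion_score(target: str, queries: list[str]) -> float:
    """Group fusion with the MAX rule (an assumption): keep the best
    similarity between the target and any of the query peptides."""
    return max(similarity(q, target) for q in queries)


def predict(targets: list[str], queries: list[str],
            threshold: float = 0.5) -> list[bool]:
    """Label a target as a hit when its fused score reaches the threshold."""
    return [max_fusion_score(t, queries) >= threshold for t in targets]


def mcc(y_true: list[bool], y_pred: list[bool]) -> float:
    """Matthews correlation coefficient from the confusion-matrix counts."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t and p)
    tn = sum(1 for t, p in pairs if not t and not p)
    fp = sum(1 for t, p in pairs if not t and p)
    fn = sum(1 for t, p in pairs if t and not p)
    denom = sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # Conventional fallback: MCC is reported as 0 when it is undefined.
    return (tp * tn - fp * fn) / denom if denom else 0.0


if __name__ == "__main__":
    # Toy, made-up sequences used only to exercise the functions.
    queries = ["GLFKKLAGKL", "KWKLFKKIGA"]
    targets = ["GLFKKLAGKV", "DDEEDDEEDD", "KWKLFKKIGS", "PPPPGGGGPP"]
    labels = [True, False, True, False]
    predictions = predict(targets, queries, threshold=0.5)
    print(predictions, mcc(labels, predictions))
```

In this sketch the query set plays the role of the central, non-redundant peptides, and the threshold mirrors the 0.5 cutoff mentioned above. Swapping in a different fusion rule or similarity measure only requires changing `max_fusion_score` or `similarity`.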

Keywords

Antimicrobial peptides

Antiparasitic peptides

Network science

Chemical space network

Half-space proximal network

Supplementary materials


The Supporting Information is available free of charge at Zenodo: https://doi.org/10.5281/zenodo.5650160.


Supporting information 1 (SI1) contains FASTA files. Supporting information 2 (SI2) contains an MS Word file with tables. Supporting information 3 (SI3) has GraphML files. Supporting information 4 (SI4) is an Excel file. Supporting information 5 (SI5) has FASTA files. Supporting information 6 (SI6) contains Excel files. Supporting information 7 (SI7) has 3 kinds of files: SI7-A contains 3 folders with the original results for each mQSSM generated, as well as an Excel file with statistical parameters. SI7-B is an Excel file with the performance parameters of the best 21 mQSSMs proposed, as well as the ranking of these models. Finally, SI7-C and SI7-D are PDF files with the results of multiple comparisons among our mQSSMs and with literature algorithms, respectively. Supporting information 8 (SI8) contains FASTA files. Supporting information 9 (SI9) contains a PowerPoint file.


Supplementary weblinks


The Supporting Information is available free of charge at Zenodo


Supporting information 1 (SI1) contains FASTA files. Supporting information 2 (SI2) contains an MS Word file with tables of parameters from the similarity threshold analysis of the CSN and HSPN of APPs. SI3 has GraphML files. SI4 is an Excel file with normalized centrality measures of the CSN and HSPN. SI5 has FASTA files of the most central and non-redundant APPs. SI6 contains Excel files with the results of the multi-query similarity searching models. SI7 has 3 kinds of files: SI7-A contains 3 folders with the original results for each mQSSM generated. SI7-B is an Excel file with the performance parameters of the best 21 mQSSMs proposed. Finally, SI7-C and SI7-D are PDF files with the results of multiple comparisons among our mQSSMs and with literature algorithms, respectively. SI8 contains FASTA files of the 95 lead compounds. SI9 contains a PowerPoint file with alignments and sequence logos of the 95 lead compounds.



Version History

Feb 11, 2022 Version 2

Nov 08, 2021 Version 1

Version Notes

Only minor changes, adjustments to improve the English language, ENJOY!

Metrics

Views: 1,742

Downloads: 722

License

The content is available under CC BY-NC-ND 4.0.


Author’s competing interest statement

The author(s) have declared that they have no conflict of interest with regard to this content.

Ethics

The author(s) have declared that ethics committee/IRB approval is not relevant to this content.
