Abstract
Their desirable pharmacological properties and broad range of therapeutic activities have made peptides promising drug candidates compared with small organic molecules and antibody drugs. Nevertheless, toxic effects such as hemolysis have hampered their development. A reliable computational tool for predicting peptide hemolytic toxicity would therefore be highly useful before synthesis and experimental evaluation. Currently, four web servers predict hemolytic activity using Machine Learning (ML) algorithms; however, they have limitations, such as the need for a reliable negative set and a limited application domain. We therefore developed a robust model based on a novel theoretical approach that combines network science with a multi-query similarity searching (MQSS) method. A total of 1152 initial models were constructed from 144 scaffolds generated in a previous report. These models were evaluated on external datasets, and the best ones were fused and improved. Our best MQSS model, I1, outperformed all state-of-the-art ML-based models and was used to characterize the prevalence of hemolytic toxicity among therapeutic peptides. Based on our model’s estimation, the number of hemolytic peptides might be 3.9-fold higher than reported.
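As a rough illustration of the multi-query similarity searching idea, the sketch below flags a query peptide as hemolytic when its best similarity to any reference peptide in a scaffold reaches a cutoff. It is a minimal sketch under stated assumptions, not the authors' implementation: the `similarity` function uses Python's `difflib` as a stand-in for the alignment-based scoring, the names `similarity` and `mqss_predict` are hypothetical, and the sequences and cutoff value are illustrative placeholders rather than data from this study.

```python
from difflib import SequenceMatcher


def similarity(a: str, b: str) -> float:
    """Stand-in similarity score in [0, 1].

    Assumption: this is NOT the alignment scoring used in the paper;
    it only makes the sketch runnable with the standard library.
    """
    return SequenceMatcher(None, a, b).ratio()


def mqss_predict(query: str, scaffold: list[str], cutoff: float) -> bool:
    """Return True if `query` is at least `cutoff`-similar to any scaffold peptide.

    `scaffold` plays the role of the multiple queries (reference hemolytic
    peptides); `cutoff` plays the role of the similarity cutoff parameter.
    """
    best = max(similarity(query, reference) for reference in scaffold)
    return best >= cutoff


# Illustrative placeholder sequences, not taken from the study's datasets.
scaffold = ["KLAKLAKKLAKLAK", "GLFDIVKKVVGALG"]

print(mqss_predict("KLAKLAKKLAKLAK", scaffold, cutoff=0.8))  # identical to a reference
print(mqss_predict("DDDDDD", scaffold, cutoff=0.8))          # shares no residues with either
```

Because a prediction of this kind depends only on similarity to known positives, it needs no negative training set, which is the limitation of the ML-based servers noted above.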
Supplementary materials
Title
Supporting Information
Description
SM1 – Datasets used in this study for validating prediction models. SM2 – Scaffolds used to build MQSS models. SM3 – Parameters of each MQSS model (scaffold, alignment type and cutoff r). SM4 – Performance evaluation of the models on all benchmark datasets. SM5 – Datasets containing only non-hemolytic peptides, useful for exploring and selecting non-toxic therapeutic peptides. SM6 – Additional information on the parameter exploration in the initial models. SM7 – Friedman test results.