Amino Acid Composition drives Peptide Aggregation: Predicting Aggregation for Improved Synthesis

12 February 2025, Version 1
This content is a preprint and has not undergone peer review at the time of posting.

Abstract

Peptide aggregation is a long-standing challenge in chemical peptide synthesis, limiting its efficiency and reliability. Although data-driven methods have improved our understanding of many sequence-based phenomena, no comprehensive approach addresses the so-called “non-random difficult couplings” (generally linked to aggregation) that occur during solid-phase peptide synthesis. Here, we combine existing peptide synthesis datasets with newly acquired experimental data to build a predictive model that deciphers the role of individual amino acids in triggering aggregation. First, we identified and experimentally validated that amino acid composition is a stronger predictor of aggregation than sequence-based patterns. This finding enabled the development of a composition vector representation, which reveals the aggregation propensities of individual amino acids. Using an ensemble of trained models, we predict the aggregation properties of peptides and recommend optimized synthesis conditions. By elucidating the influence of each amino acid, this method has the potential to accelerate synthesis optimization using existing data, offering a robust framework for understanding and controlling peptide aggregation.
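
The abstract does not specify how the composition vector is constructed. As a minimal sketch, assuming the common definition (the fraction of each of the 20 canonical amino acids in a peptide), such a representation could look like the following. The function name `composition_vector` and the alphabetical residue ordering are illustrative assumptions, not the authors' implementation.

```python
from collections import Counter

# The 20 canonical amino acids in one-letter code.
# Alphabetical ordering is an assumption made for this sketch.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def composition_vector(sequence: str) -> list[float]:
    """Return the fraction of each canonical amino acid in `sequence`.

    Hypothetical helper: the preprint's exact featurization may differ.
    """
    seq = sequence.upper()
    if not seq:
        raise ValueError("sequence must not be empty")
    unknown = set(seq) - set(AMINO_ACIDS)
    if unknown:
        raise ValueError(f"non-canonical residues: {sorted(unknown)}")
    counts = Counter(seq)  # missing residues count as 0
    return [counts[aa] / len(seq) for aa in AMINO_ACIDS]


# Two peptides with the same residues in a different order
# map to the same vector, because residue order is discarded.
same = composition_vector("AAGG") == composition_vector("GAGA")
```

Because the vector ignores residue order, any model trained on it can only attribute aggregation to which amino acids are present and in what proportion, which matches the composition-dependent view described above.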

Keywords

Solid-Phase Peptide Synthesis
Difficult Sequences
Aggregation
Synthesis Optimization
Flow Chemistry
Machine Learning

Supplementary materials

Supplementary Material: Supplementary material including dataset statistics, computational analysis, experimental procedures, and analytical data.
