Abstract
A recent report shows that, with a suitably designed buffer solution, proteins can be unfolded and translocated through a nanopore unidirectionally and uniformly, with residues exiting the pore in sequence order at a roughly constant rate of about one residue per µs (Nature Biotechnology 41, 1130–1139, 2023). The present work shows in theory that, by sampling the pore exclusion volume signal (a proxy for the measured blockade current) at a low frequency of 10–20 kHz and digitizing the sampled signal at a volume precision of 70 Å³, a substantial majority of the proteins in a proteome can be identified and counted without labeling. Computations on the full set of sequences in the human proteome (UniProt ID UP000005640_9606) show that ~70% of the proteins can be identified; the result generally holds even when post-translational modifications (PTMs) are present. The identification rate can be increased to better than 95% with modified algorithms; with an array of 100 pores, ~10⁹ proteins can be identified and counted in about 1.5 hours. This is a minimalist, non-destructive, label-free, single-molecule approach based on unmodified nanopores; it is a potential alternative to mass spectrometry that overcomes many of the latter's limitations. In principle it can work with whole proteins in mixtures over the full dynamic range of a proteome, without purification or separation, proteolytic degradation, or enzymes for translocation control.
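The sampling and throughput figures quoted in the abstract can be cross-checked with simple arithmetic. The sketch below is illustrative only and is not the paper's computational model. The function names are hypothetical, the mean protein length of 540 residues is an assumed round value chosen for illustration, and idle time between successive proteins in a pore is neglected.

```python
# Back-of-envelope check of the sampling and throughput figures.
# Assumptions (not from the source): function names, the mean protein
# length of 540 residues, and zero idle time between proteins.

RESIDUE_RATE_PER_S = 1e6  # about one residue per microsecond (from the abstract)


def residues_per_sample(sampling_hz, residue_rate_per_s=RESIDUE_RATE_PER_S):
    """Number of residues that exit the pore between two consecutive samples."""
    return residue_rate_per_s / sampling_hz


def run_time_hours(n_proteins, n_pores, mean_length,
                   residue_rate_per_s=RESIDUE_RATE_PER_S):
    """Hours needed when all pores work in parallel without gaps."""
    proteins_per_pore = n_proteins / n_pores
    seconds = proteins_per_pore * mean_length / residue_rate_per_s
    return seconds / 3600.0


if __name__ == "__main__":
    for f_hz in (10e3, 20e3):
        print(f"{f_hz / 1e3:.0f} kHz: {residues_per_sample(f_hz):.0f} residues per sample")
    hours = run_time_hours(n_proteins=1e9, n_pores=100, mean_length=540)
    print(f"1e9 proteins on 100 pores: {hours:.2f} hours")
```

Under these assumptions each sample spans 50 to 100 residues, and the run time comes out at about 1.5 hours, consistent with the figure quoted above.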
Supplementary materials
Title
Protein id file for human proteome
Description
Contains IDs for 20538 proteins in the human proteome (UniProt ID UP000005640_9606), obtained with the computational model
Title
Protein sequences in human proteome
Description
Reduced file of all 20598 sequences in the human proteome (UniProt ID UP000005640_9606)