A Rapid Multivariate Analysis Approach to Explore Differential Spatial Protein Profiles in Tissue

Kavya Sharman; Nathan Heath Patterson; Andy Weiss; Elizabeth K Neumann; Emma R Guiberson; Daniel Ryan; Danielle B Gutierrez; Jeffrey M Spraggins; Raf Van de Plas; Eric P Skaar; Richard M Caprioli

doi:10.26434/chemrxiv-2022-d2fct

Theoretical and Computational Chemistry

06 April 2022, Version 1

Working Paper

This content is a preprint and has not undergone peer review at the time of posting.

Abstract

We have developed a multivariate approach for rapid exploration of differential protein profiles acquired from distinct tissue regions. Spatially targeted proteomics is a technology for analyzing the proteome of specific cell types and functional regions within tissue. While spatial context is often essential to understanding biological processes, interpreting complex protein profiles (e.g., of key tissue subregions) can be challenging because of the high-dimensional nature of the data. To address this challenge, we developed a multivariate approach to explore such data and applied it to a published spatially targeted proteomics dataset collected from Staphylococcus aureus-infected murine kidney at 4 and 10 days post-infection. The multivariate analysis process rapidly filters complex biological data to determine the most relevant species among hundreds to thousands of measured molecules, avoiding the more traditional univariate, targeted approach of tracking individual proteins. We employed principal component analysis (PCA) for dimensionality reduction and for grouping correlated and anticorrelated proteins among regions and timepoints previously measured by mass spectrometry through micro-liquid extraction surface analysis (microLESA). Subsequently, k-means clustering of the PCA-processed data was used to group samples in an unsupervised manner. Interpretation of the resulting cluster centers revealed a subset of the detected proteins that differentiates among spatial regions of infection across the two timepoints. These proteins are involved in the glycolysis and TCA cycle metabolic pathways, calcium-dependent processes, and cytoskeletal organization. Gene ontology analysis of the protein subsets in each cluster uncovered patterns in the dataset related to tissue damage and repair as well as calcium-related defense mechanisms during staphylococcal infection. By applying this analysis in an infectious disease case study, we observed differential proteomic changes across abscess regions over time, reflecting the dynamic nature of host-pathogen interactions.
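The workflow summarized in the abstract (PCA for dimensionality reduction, k-means clustering of the PCA scores, and interpretation of the cluster centers) can be illustrated with a minimal Python sketch. It is not the authors' notebook code. The synthetic data matrix, the standardization step, the number of components, the range of k values, the use of scikit-learn, and the helper name `explore_profiles` are all assumptions made for illustration. The supplementary materials mention silhouette score determination, so the sketch picks k by silhouette score, but the exact procedure used in the paper may differ. Missing-value handling (zero-filled versus non-imputed data) is not addressed here.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.decomposition import PCA
from sklearn.metrics import silhouette_score
from sklearn.preprocessing import StandardScaler


def explore_profiles(X, n_components=3, k_range=range(2, 7), seed=0):
    """Hypothetical helper: PCA, then k-means with k chosen by silhouette.

    X is a (samples x proteins) intensity matrix. Returns the fitted PCA
    and k-means models, the chosen k, its silhouette score, and the
    cluster centers mapped back to standardized protein space.
    """
    # Assumption: proteins are standardized before PCA.
    X_std = StandardScaler().fit_transform(X)
    pca = PCA(n_components=n_components, random_state=seed)
    scores = pca.fit_transform(X_std)

    # Try each candidate k and keep the one with the best silhouette score.
    best = None
    for k in k_range:
        km = KMeans(n_clusters=k, n_init=10, random_state=seed).fit(scores)
        sil = silhouette_score(scores, km.labels_)
        if best is None or sil > best[0]:
            best = (sil, k, km)
    sil, k, km = best

    # Project cluster centers from PCA space back onto the protein axes
    # so that each center can be read as a protein profile.
    centers = pca.inverse_transform(km.cluster_centers_)
    return pca, km, k, sil, centers


# Synthetic stand-in for a samples x proteins matrix (not real data):
# three sample groups, each with its own block of 10 elevated proteins.
rng = np.random.default_rng(0)
n_per_group, n_proteins = 8, 50
X = rng.normal(size=(3 * n_per_group, n_proteins))
for g in range(3):
    rows = slice(g * n_per_group, (g + 1) * n_per_group)
    cols = slice(g * 10, (g + 1) * 10)
    X[rows, cols] += 4.0

pca, km, k, sil, centers = explore_profiles(X)

# For each cluster, list the five proteins (column indices) with the
# largest absolute weight in the back-projected cluster center.
top_proteins = np.argsort(-np.abs(centers), axis=1)[:, :5]
print("chosen k:", k, "silhouette:", round(float(sil), 3))
print("top protein indices per cluster:")
print(top_proteins)
```

In a sketch like this, the rows of `top_proteins` play the role of the protein subsets that would then be passed to a gene ontology tool. The ranking by absolute weight is one simple choice among several and is not taken from the paper.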

Keywords

Staphylococcus aureus

mass spectrometry

bioinformatics

abscess formation

host-pathogen interface

microLESA

spatially targeted proteomics

proteomics

computational proteomics

Supplementary materials

Spatial Proteomics Supplementary Information: Supplemental figures describing missing values per dataset, silhouette score determination, and results obtained using the full zero-filled dataset.

Spatial Proteomics Code for Zero-Filled Dataset: Jupyter Notebook of code analyzing the zero-filled dataset.

Spatial Proteomics Code for Dataset with No Imputation: Jupyter Notebook of code analyzing the dataset with no imputation.

Supplementary weblinks

Spatial Proteomics Code: GitHub repository containing code for analysis of both the non-imputed and zero-filled datasets, comma-separated values (CSV) files with protein IDs per protein class for both datasets, and a text file containing protein group data.

License

The content is available under CC BY NC ND 4.0

Funding

National Institute of Allergy and Infectious Diseases: R01AI138581

National Institute of Allergy and Infectious Diseases: R01AI145992

National Institute of Allergy and Infectious Diseases: R01AI069233

National Institute of Allergy and Infectious Diseases: R01AI073843

National Institute of Diabetes and Digestive and Kidney Diseases: U54DK120058

National Eye Institute: U54EY032442

National Institute of Environmental Health Sciences: 1F32AI157215

National Institute of Environmental Health Sciences: T32ES007028

Author’s competing interest statement

The author(s) have declared they have no conflict of interest with regard to this content

Ethics

The author(s) have declared ethics committee/IRB approval is not relevant to this content
