Computational Proteomics

The broad application of proteomics across biological and medical fields, together with the spread of high-throughput platforms, has led to rapidly growing volumes of proteomics data. Computational proteomics is the data science concerned with identifying and quantifying proteins from these large datasets and with the biological interpretation of their concentration changes, posttranslational modifications, interactions, and subcellular localizations. It is a highly multidisciplinary endeavor that attracts scientists from many fields and draws on disciplines such as statistics, machine learning, efficient scientific programming, and network and time series analysis.

Data Sources

MS-based Experimental Data

Mass spectrometry (MS) is one of the main technologies in proteomics. Its use keeps growing because of its increasing precision and because the analysis pipeline can be automated, which yields large-scale, high-throughput experimental data.

UniProt

UniProt (Universal Protein Resource) is a comprehensive, high-quality, and freely accessible resource of protein sequences and functional information.

PhosphoSitePlus

PhosphoSitePlus provides comprehensive information and tools for the study of protein post-translational modifications (PTMs).

Signor Database

SIGNOR, the SIGnaling Network Open Resource, organizes and stores, in a structured format, signaling information published in the scientific literature.

Phospho.ELM Database

Phospho.ELM contains 8,718 substrate proteins from different species, covering 3,370 tyrosine, 31,754 serine, and 7,449 threonine phosphorylation instances.

Computational Methods

Identification and quantification of peptides, proteins, and PTMs

Mass spectrometric feature detection
Peptide identification
Protein inference and control of false discovery rates
Quantification of peptides and proteins
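
As an illustration of false discovery rate control, the following is a minimal sketch of the widely used target-decoy approach, not the procedure of any particular search engine. It assumes that each peptide-spectrum match (PSM) carries a score where higher is better and a flag saying whether it matched a decoy sequence; the function names and the 1% threshold are hypothetical choices made for this example.

```python
def q_values(psms):
    """Estimate q-values for PSMs given as (score, is_decoy) tuples.

    Assumes a higher score means a better match. Returns
    (score, is_decoy, q_value) tuples sorted by descending score.
    """
    ranked = sorted(psms, key=lambda p: p[0], reverse=True)

    # FDR at each score threshold: decoys accepted / targets accepted.
    fdrs = []
    targets = decoys = 0
    for _, is_decoy in ranked:
        if is_decoy:
            decoys += 1
        else:
            targets += 1
        fdrs.append(decoys / max(targets, 1))

    # q-value: the smallest FDR at this threshold or any more permissive one.
    q = []
    running_min = float("inf")
    for fdr in reversed(fdrs):
        running_min = min(running_min, fdr)
        q.append(running_min)
    q.reverse()

    return [(score, is_decoy, qv) for (score, is_decoy), qv in zip(ranked, q)]


def accepted_targets(psms, threshold=0.01):
    """Return the scores of target PSMs whose q-value is within the threshold."""
    return [s for s, is_decoy, qv in q_values(psms) if not is_decoy and qv <= threshold]
```

The same idea can be applied at the protein level after protein inference, using protein-level scores instead of PSM scores.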

Downstream data analysis

Exploratory statistics
T-test Analysis
ANOVA Analysis
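
For a per-protein comparison of two conditions, a t-test can be sketched as follows. This example uses Welch's unequal-variance variant, which is an assumption made here rather than a choice stated above, and it expects log2-transformed intensities with at least two replicates per group. It returns only the test statistic and degrees of freedom; in practice a p-value would come from a statistics library (for example `scipy.stats.ttest_ind` with `equal_var=False`), followed by a multiple-testing correction across proteins.

```python
import math
from statistics import mean, variance


def welch_t(group_a, group_b):
    """Return (t statistic, Welch-Satterthwaite degrees of freedom).

    Raises ZeroDivisionError if both groups have zero variance.
    """
    # Squared standard error of each group mean.
    se_a = variance(group_a) / len(group_a)
    se_b = variance(group_b) / len(group_b)

    t = (mean(group_a) - mean(group_b)) / math.sqrt(se_a + se_b)
    df = (se_a + se_b) ** 2 / (
        se_a ** 2 / (len(group_a) - 1) + se_b ** 2 / (len(group_b) - 1)
    )
    return t, df


# Hypothetical log2 intensities of one protein in two conditions.
control = [20.1, 20.3, 19.9]
treated = [22.0, 22.4, 21.8]
t_stat, dof = welch_t(control, treated)
log2_fold_change = mean(treated) - mean(control)
```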

Posttranslational modifications analysis
Suitable normalization and filtering
Hierarchical clustering
Principal Component Analysis (PCA)
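
Normalization and filtering usually come before clustering or PCA. The sketch below shows one common combination: log2 transformation, removal of proteins with too few quantified values, and median centering of each sample. The data layout (rows are proteins, columns are samples, 0 or None marks a missing value) and the function names are assumptions made for this example.

```python
import math
from statistics import median


def log2_transform(matrix):
    """Log2-transform intensities; 0 or None is treated as missing."""
    return [[math.log2(v) if v else None for v in row] for row in matrix]


def filter_valid(matrix, min_valid):
    """Keep proteins (rows) quantified in at least min_valid samples."""
    return [row for row in matrix if sum(v is not None for v in row) >= min_valid]


def median_center(matrix):
    """Subtract each sample's (column's) median from its values.

    Raises statistics.StatisticsError if a column has no valid values.
    """
    n_cols = len(matrix[0])
    medians = [
        median([row[j] for row in matrix if row[j] is not None])
        for j in range(n_cols)
    ]
    return [
        [None if v is None else v - medians[j] for j, v in enumerate(row)]
        for row in matrix
    ]


# Hypothetical raw intensities: 4 proteins x 3 samples.
raw = [[8, 16, 32], [4, 0, 16], [2, 4, None], [0, 0, 1]]
normalized = median_center(filter_valid(log2_transform(raw), min_valid=2))
```

Median centering assumes that most proteins do not change between samples; other normalization schemes may suit a given experiment better.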

Network analysis
Metabolite Set Enrichment Analysis (MSEA)
Protein–protein interaction (PPI) networks analysis
Two-sample t-test

Machine learning
Multivariate feature selection
Cross-validation
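
Cross-validation can be sketched as a generator of train/test index splits. This is a plain k-fold scheme without stratification, and the function name and the fixed seed are hypothetical choices made for this example. When feature selection is part of the model, it should be repeated inside each training fold so that the test samples do not influence which features are chosen.

```python
import random


def k_fold_indices(n_samples, k, seed=0):
    """Yield (train_indices, test_indices) for each of k folds."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)

    # Deal the shuffled indices into k folds of near-equal size.
    folds = [indices[i::k] for i in range(k)]

    for i in range(k):
        test = folds[i]
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield train, test


# Example: 10 samples, 3 folds.
for train_idx, test_idx in k_fold_indices(10, 3):
    # Select features and fit the model on train_idx only,
    # then evaluate on test_idx.
    pass
```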

Multiomics data integration
Multivariate analysis
iBAQ method
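
The iBAQ (intensity-based absolute quantification) value of a protein is commonly described as the sum of its peptide intensities divided by the number of theoretically observable peptides from an in silico digest. The sketch below assumes tryptic digestion (cleavage after K or R, but not before P), no missed cleavages, and a peptide length window of 6 to 30 residues; these parameters and the function names are assumptions for illustration and may differ from the settings of a given software package.

```python
def tryptic_peptides(sequence):
    """In silico tryptic digest: cleave after K or R unless followed by P."""
    peptides, start = [], 0
    for i, aa in enumerate(sequence):
        next_aa = sequence[i + 1] if i + 1 < len(sequence) else ""
        if aa in "KR" and next_aa != "P":
            peptides.append(sequence[start:i + 1])
            start = i + 1
    if start < len(sequence):
        peptides.append(sequence[start:])  # C-terminal peptide
    return peptides


def ibaq(peptide_intensities, sequence, min_len=6, max_len=30):
    """Summed peptide intensity divided by the count of observable peptides."""
    observable = [
        p for p in tryptic_peptides(sequence) if min_len <= len(p) <= max_len
    ]
    if not observable:
        raise ValueError("no theoretically observable peptides for this sequence")
    return sum(peptide_intensities) / len(observable)


# Hypothetical protein sequence and measured peptide intensities.
value = ibaq([100.0, 300.0], "MAAAAAKGGGGGGRTK")
```

Dividing by the number of observable peptides corrects for the fact that longer proteins produce more peptides, which makes the resulting values more comparable across proteins within a sample.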

Workflow for Downstream Proteomics Analysis

Computational proteomics has matured substantially. Platforms for the identification and quantification of proteins can now analyze data reliably and automatically. CD ComputaBio offers comprehensive computational proteomics services, including network biology and cross-omics data analysis. We also provide multiple resources for academic research and preclinical work on identifying a suitable disease target and its corresponding hit. Contact us for more service details.

Reference

Sinitcyn, Pavel; Rudolph, Jan Daniel; Cox, Jürgen. Computational Methods for Understanding Mass Spectrometry–Based Shotgun Proteomics Data. Annual Review of Biomedical Data Science. 2018.

* For Research Use Only.