Significance Analysis of Microarrays (SAM) is a non-parametric, permutation-based method designed specifically for microarray data analysis (Tusher et al., 2001). It estimates the empirical false discovery rate (FDR) by randomly permuting the class labels. The permutations generate a null distribution, because shuffling the labels is assumed to remove all biological effects. SAM therefore provides a means to control the number of false positives at various thresholds when many genes are assayed simultaneously on an array. The SAM package can handle both paired and unpaired data. It runs on top of the R statistical environment and has an Excel interface provided through an Excel plug-in.

  • SAM performs a set of gene-specific t-tests to identify statistically significant genes. It calculates a statistic d_j for each gene j, which measures the strength of the relationship between gene expression and the response variable.
  • CD ComputaBio assigns a score to each gene based on the change in gene expression relative to the standard deviation of the repeated measurements for that gene. Genes with scores above a chosen threshold are considered potentially significant.
  • CD ComputaBio can adjust the threshold to identify smaller or larger gene sets, and calculates the FDR for each set.
  • Because the data may not follow a normal distribution, CD ComputaBio uses non-parametric statistics in this analysis. The permutation-based analysis accounts for correlations between genes and avoids parametric assumptions about the distribution of each gene. This is an advantage over other techniques (such as ANOVA and Bonferroni correction) that assume equal variance across genes or independence between genes.
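
The scoring and permutation steps described above can be sketched in a few lines of Python. This is a simplified illustration for two unpaired classes, not the SAM package itself: it applies a fixed cut-off to |d_j| instead of SAM's delta rule on ordered statistics, and it omits the estimate of the proportion of truly null genes. The function names, the default `s0` value, and the number of permutations are assumptions made for this example.

```python
import numpy as np

def d_statistic(x, labels, s0=0.1):
    """Relative difference d_j = (mean_1 - mean_0) / (s_j + s0) for each gene (row of x).

    s0 is a small positive constant that stabilises genes with tiny variance;
    the default used here is an arbitrary illustrative value.
    """
    x = np.asarray(x, dtype=float)
    labels = np.asarray(labels)
    a = x[:, labels == 0]
    b = x[:, labels == 1]
    n0, n1 = a.shape[1], b.shape[1]
    diff = b.mean(axis=1) - a.mean(axis=1)
    # Pooled sum of squared deviations within the two classes
    ss = ((a - a.mean(axis=1, keepdims=True)) ** 2).sum(axis=1) \
       + ((b - b.mean(axis=1, keepdims=True)) ** 2).sum(axis=1)
    s = np.sqrt((1.0 / n0 + 1.0 / n1) * ss / (n0 + n1 - 2))
    return diff / (s + s0)

def permutation_fdr(x, labels, threshold, n_perm=200, s0=0.1, seed=0):
    """Estimate the FDR for genes with |d_j| >= threshold by permuting class labels."""
    rng = np.random.default_rng(seed)
    labels = np.asarray(labels)
    d_obs = d_statistic(x, labels, s0)
    n_called = int((np.abs(d_obs) >= threshold).sum())
    false_counts = []
    for _ in range(n_perm):
        # Shuffling the labels is assumed to remove any real class effect
        d_null = d_statistic(x, rng.permutation(labels), s0)
        false_counts.append(int((np.abs(d_null) >= threshold).sum()))
    expected_false = float(np.median(false_counts))
    fdr = min(1.0, expected_false / n_called) if n_called > 0 else 0.0
    return n_called, fdr
```

Calling `permutation_fdr` with several thresholds gives one (gene count, FDR) pair per threshold, which mirrors how a stricter or looser cut-off trades the size of the gene set against the expected share of false positives.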

Project name: SAM service
Sample requirements:

The data should be put in an Excel spreadsheet. The first row of the spreadsheet has information about the response measurement; all remaining rows have gene expression data, one row per gene.

The columns represent the different experimental samples.

  • The first line of the file contains the response measurements, one per column, starting at column 3.
  • The remaining lines contain gene expression measurements, one line per gene, in the format described below.

Column 1: This should contain the gene name, for the user's reference.
Column 2: This should contain the gene ID, for the user's reference.
Remaining columns: These should contain the expression measurements as numbers. Missing expression measurements should be left blank or entered as non-numeric values.

For sequencing data, the values are counts and hence must be non-negative.
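
As an illustration of the layout described above, the following Python sketch reads a sheet that has been exported to CSV. The CSV export step, the function name `parse_sam_rows`, and the sample values are assumptions made for this example and are not part of the SAM package. Blank or non-numeric cells are treated as missing values.

```python
import csv
import io
import math

def parse_sam_rows(rows):
    """Split rows in the SAM input layout into response labels, gene names, gene IDs and values."""
    rows = list(rows)
    header, body = rows[0], rows[1:]
    response = header[2:]              # response measurements start at column 3
    names, ids, matrix = [], [], []
    for row in body:
        names.append(row[0])           # column 1: gene name
        ids.append(row[1])             # column 2: gene ID
        values = []
        for cell in row[2:]:
            try:
                values.append(float(cell))
            except (ValueError, TypeError):
                values.append(math.nan)  # blank or non-numeric cell means missing
        matrix.append(values)
    return response, names, ids, matrix

# Hypothetical four-sample example: two samples in class 1, two in class 2
sample_csv = (
    ",,1,1,2,2\n"
    "GeneA,ID001,5.1,4.8,7.9,8.3\n"
    "GeneB,ID002,2.0,,2.2,NA\n"
)
response, names, ids, matrix = parse_sam_rows(csv.reader(io.StringIO(sample_csv)))
```

For sequencing data, a further check that every parsed value is non-negative could be added before analysis.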

Screening cycle: Decided according to your needs.
Services included: We provide you with raw data and analysis services.
Price: Available on inquiry.

CD ComputaBio's SAM service can significantly increase the hit rate of lead compounds and reduce the cost of later experimental screening. The SAM service is a personalized, customized and innovative scientific research service. Each project needs to be evaluated before the corresponding analysis plan and price can be determined. If you would like to know more about service prices or technical details, please feel free to contact us.

