# Microarray analysis techniques

Microarray analysis techniques are used in interpreting the data generated from experiments on DNA (gene chip analysis), RNA, and protein microarrays, which allow researchers to investigate the expression state of a large number of genes (in many cases, an organism's entire genome) in a single experiment.

Significance analysis of microarrays (SAM) is a statistical technique, established in 2001 by Virginia Tusher, Robert Tibshirani and Gilbert Chu, for determining whether changes in gene expression are statistically significant.
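As a rough illustration of the kind of statistic SAM is built on, the sketch below computes a per-gene "relative difference": the difference in group means divided by a gene-specific scatter plus a small positive constant. The function name, the default value of `s0`, and the toy data are assumptions made for illustration. The published method also chooses the constant from the data and uses permutations of the samples to judge significance, both of which are omitted here.

```python
import math


def sam_relative_difference(group_a, group_b, s0=0.1):
    """Relative difference d = (mean_b - mean_a) / (s + s0) for one gene.

    group_a, group_b: expression values of the gene in the two conditions.
    s0: small positive constant (illustrative default, not a recommended value).
    """
    n_a, n_b = len(group_a), len(group_b)
    mean_a = sum(group_a) / n_a
    mean_b = sum(group_b) / n_b
    # Pooled sum of squared deviations across both groups.
    ss = sum((x - mean_a) ** 2 for x in group_a)
    ss += sum((x - mean_b) ** 2 for x in group_b)
    # Gene-specific scatter: standard error of the difference in means.
    s = math.sqrt((1 / n_a + 1 / n_b) * ss / (n_a + n_b - 2))
    return (mean_b - mean_a) / (s + s0)


# Toy data for a single gene measured three times in each condition.
control = [1.0, 2.0, 3.0]
treated = [4.0, 5.0, 6.0]
d = sam_relative_difference(control, treated)
```

Adding `s0` to the denominator keeps genes with very small variance from producing extreme scores, which is the main difference from an ordinary t-statistic.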

Tibshirani is also known for the lasso method, which proposed the use of L1 penalization in regression and related problems.

Most microarray manufacturers, such as Affymetrix and Agilent, provide commercial data analysis software alongside their microarray products.

Dye normalization for two color arrays is often achieved by local regression.
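A minimal sketch of this idea follows, assuming that each spot already has a log ratio of the two channels (`m`) and an average log intensity (`a`). It fits a local linear regression of `m` on `a` with tricube weights and subtracts the fitted trend. The function names and the `frac` default are illustrative assumptions, and the robustness iterations of a full LOWESS implementation are left out.

```python
import math


def local_linear_fit(x, y, frac=0.5):
    """Locally weighted linear regression of y on x, evaluated at each x."""
    n = len(x)
    k = min(n, max(2, math.ceil(frac * n)))  # neighbours used per local fit
    fitted = []
    for x0 in x:
        # Bandwidth: distance from x0 to its k-th nearest point.
        dists = sorted(abs(xi - x0) for xi in x)
        h = dists[k - 1] or 1e-12
        # Tricube weights: 1 at x0, falling to 0 at distance h.
        w = [(1 - min(abs(xi - x0) / h, 1.0) ** 3) ** 3 for xi in x]
        sw = sum(w)
        mx = sum(wi * xi for wi, xi in zip(w, x)) / sw
        my = sum(wi * yi for wi, yi in zip(w, y)) / sw
        sxx = sum(wi * (xi - mx) ** 2 for wi, xi in zip(w, x))
        sxy = sum(wi * (xi - mx) * (yi - my) for wi, xi, yi in zip(w, x, y))
        slope = sxy / sxx if sxx > 0 else 0.0
        fitted.append(my + slope * (x0 - mx))
    return fitted


def normalize_dye_bias(m, a, frac=0.5):
    """Subtract the intensity-dependent trend of m (log ratio) on a."""
    trend = local_linear_fit(a, m, frac)
    return [mi - ti for mi, ti in zip(m, trend)]


# Toy example: a log ratio that drifts linearly with intensity.
a = [float(i) for i in range(10)]
m = [0.5 * ai + 1.0 for ai in a]
m_normalized = normalize_dye_bias(m, a)
```

After normalization the log ratios of the toy data sit at approximately zero, because the whole trend was dye bias by construction.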

A common way to evaluate how well an array has been normalized is to draw an MA plot of the data, which shows the log ratio of the two channels (M) against their average log intensity (A).
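The coordinates of an MA plot can be computed directly from the two channel intensities. The sketch below assumes strictly positive, background-corrected intensities; the function name and the toy values are illustrative.

```python
import math


def ma_values(red, green):
    """Return (M, A) for paired red and green intensities.

    M is the log2 ratio of the two channels.
    A is the mean of the two log2 intensities.
    """
    m = [math.log2(r) - math.log2(g) for r, g in zip(red, green)]
    a = [0.5 * (math.log2(r) + math.log2(g)) for r, g in zip(red, green)]
    return m, a


# Two toy spots: one with a four-fold difference, one with none.
m_vals, a_vals = ma_values([8.0, 4.0], [2.0, 4.0])
```

On an array where most genes are unchanged, the points of the plot are expected to scatter around M = 0 across the whole range of A. A curved or tilted cloud suggests intensity-dependent bias.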

Robust Multi-array Average (RMA) is a normalization approach for Affymetrix arrays that does not use the mismatch probes, but still must summarize the perfect-match probes of each probe set through median polish.
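Median polish itself is a general technique that can be sketched in a few lines. The example below assumes a small matrix of log2 perfect-match intensities for one probe set, with rows as probes and columns as arrays. It alternately removes row and column medians until the residuals stabilize. The function name, the iteration limit, and the toy matrix are assumptions, and the background correction and quantile normalization steps that come before summarization are not shown.

```python
from statistics import median


def median_polish(matrix, max_iter=10, tol=1e-6):
    """Decompose matrix[i][j] into overall + row_eff[i] + col_eff[j] + resid."""
    rows, cols = len(matrix), len(matrix[0])
    resid = [list(row) for row in matrix]
    overall = 0.0
    row_eff = [0.0] * rows
    col_eff = [0.0] * cols
    for _ in range(max_iter):
        change = 0.0
        # Sweep out row medians (probe effects).
        for i in range(rows):
            med = median(resid[i])
            row_eff[i] += med
            resid[i] = [v - med for v in resid[i]]
            change += abs(med)
        med = median(col_eff)
        overall += med
        col_eff = [c - med for c in col_eff]
        # Sweep out column medians (array effects).
        for j in range(cols):
            med = median(resid[i][j] for i in range(rows))
            col_eff[j] += med
            for i in range(rows):
                resid[i][j] -= med
            change += abs(med)
        med = median(row_eff)
        overall += med
        row_eff = [r - med for r in row_eff]
        if change < tol:
            break
    return overall, row_eff, col_eff, resid


# Toy probe set: 3 probes (rows) measured on 3 arrays (columns).
log_pm = [[2.0, 4.0, 6.0],
          [3.0, 5.0, 7.0],
          [4.0, 6.0, 8.0]]
overall, probe_eff, array_eff, resid = median_polish(log_pm)
# One summarized expression value per array.
expression = [overall + c for c in array_eff]
```

Because medians are used instead of means, a single outlying probe has little influence on the summarized value of an array.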

Hierarchical clustering and k-means clustering are widely used techniques in microarray analysis.

Hierarchical clustering is a statistical method for finding relatively homogeneous clusters.

Initially, a distance matrix containing all the pairwise distances between the genes is calculated.

Pearson’s correlation and Spearman’s correlation are often used as the basis for dissimilarity estimates (for example, one minus the correlation), but other measures, like Manhattan distance or Euclidean distance, can also be applied.
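The two steps described above, building a pairwise distance matrix and then merging the closest groups, can be sketched as follows. The example assumes that dissimilarity is taken as one minus Pearson's correlation and that clusters are merged by average linkage. Both are illustrative choices, as are the function names and the toy profiles.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient of two equally long profiles."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def correlation_distance_matrix(profiles):
    """Symmetric matrix of 1 - r for every pair of gene profiles."""
    n = len(profiles)
    dist = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            d = 1.0 - pearson(profiles[i], profiles[j])
            dist[i][j] = dist[j][i] = d
    return dist


def average_linkage(dist, n_clusters):
    """Merge the two closest clusters until n_clusters remain."""
    clusters = [[i] for i in range(len(dist))]
    while len(clusters) > n_clusters:
        best = None
        for a in range(len(clusters)):
            for b in range(a + 1, len(clusters)):
                # Average pairwise distance between members of a and b.
                total = sum(dist[i][j] for i in clusters[a] for j in clusters[b])
                d = total / (len(clusters[a]) * len(clusters[b]))
                if best is None or d < best[0]:
                    best = (d, a, b)
        _, a, b = best
        clusters[a] = clusters[a] + clusters[b]
        del clusters[b]
    return clusters


# Toy expression profiles: genes 0 and 1 rise, genes 2 and 3 fall.
profiles = [[1.0, 2.0, 3.0], [2.0, 4.0, 6.5], [3.0, 2.0, 1.0], [6.0, 4.0, 2.5]]
dist = correlation_distance_matrix(profiles)
clusters = average_linkage(dist, n_clusters=2)
```

A correlation-based distance groups genes by the shape of their profile rather than by absolute expression level, which is why genes 1 and 3 join their partners despite the larger magnitudes.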

In k-means clustering, grouping is done by minimizing the sum of the squared distances between each data point and the centroid of its cluster.
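A minimal sketch of this objective, using the standard alternating scheme of assigning points to the nearest centroid and then recomputing centroids, is shown below. Taking the first `k` points as initial centroids is an assumption made to keep the example deterministic; practical implementations usually use randomized initialization and several restarts. All names and data are illustrative.

```python
def sq_dist(p, q):
    """Squared Euclidean distance between two points."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def kmeans(points, k, n_iter=100):
    """Return (labels, centroids) after alternating assign/update steps."""
    centroids = [list(p) for p in points[:k]]  # assumed initialization
    labels = None
    for _ in range(n_iter):
        # Assignment step: nearest centroid for every point.
        new_labels = [
            min(range(k), key=lambda c: sq_dist(p, centroids[c]))
            for p in points
        ]
        # Update step: each centroid becomes the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, new_labels) if lab == c]
            if members:
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
        if new_labels == labels:
            break
        labels = new_labels
    return labels, centroids


def within_cluster_ss(points, labels, centroids):
    """Sum of squared distances of points to their assigned centroid."""
    return sum(sq_dist(p, centroids[lab]) for p, lab in zip(points, labels))


# Toy data: two well-separated pairs of points.
points = [[0.0, 0.0], [10.0, 10.0], [0.0, 1.0], [10.0, 11.0]]
labels, centroids = kmeans(points, k=2)
wcss = within_cluster_ss(points, labels, centroids)
```

Neither step can increase the within-cluster sum of squares, so the procedure converges, although only to a local minimum that depends on the starting centroids.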

The k-means clustering algorithm and some of its variants (including k-medoids) have been shown to produce good results for gene expression data (at least better than hierarchical clustering methods).

Non-commercial tools such as FunRich, GenMAPP and Moksiskaan also aid in organizing and visualizing gene network data procured from one or several microarray experiments.

A wide variety of microarray analysis tools are available through Bioconductor, a project built on the R programming language.

Specialized software tools for statistical analysis to determine the extent of over- or under-expression of a gene in a microarray experiment relative to a reference state have also been developed to aid in identifying genes or gene sets associated with particular phenotypes.

One such method of analysis, known as Gene Set Enrichment Analysis (GSEA), uses a Kolmogorov-Smirnov-style statistic to identify groups of genes that are regulated together.
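The running-sum idea behind a Kolmogorov-Smirnov-style enrichment statistic can be sketched as follows. The code walks down a list of genes ranked by their association with the phenotype, steps up when a gene belongs to the set and down otherwise, and reports the largest deviation from zero. The function name, the weighting exponent `p`, and the toy ranking are assumptions for illustration; significance testing and multiple-testing correction are not shown.

```python
def enrichment_score(ranked_genes, scores, gene_set, p=1.0):
    """Maximum deviation from zero of a weighted running sum.

    ranked_genes: gene identifiers, ordered from most positively to most
                  negatively associated with the phenotype.
    scores: ranking metric for each gene, in the same order.
    gene_set: collection of gene identifiers to test.
    p: exponent applied to |score| for hits; p=0 gives equal step sizes.
    """
    hits = [g in gene_set for g in ranked_genes]
    n_miss = len(ranked_genes) - sum(hits)
    hit_weight = sum(abs(s) ** p for s, h in zip(scores, hits) if h)
    if n_miss == 0 or hit_weight == 0:
        raise ValueError("gene set must contain some, but not all, ranked genes")
    running, best = 0.0, 0.0
    for s, h in zip(scores, hits):
        if h:
            running += abs(s) ** p / hit_weight  # step up for a set member
        else:
            running -= 1.0 / n_miss  # step down for a non-member
        if abs(running) > abs(best):
            best = running
    return best


# Toy ranking of four genes with their ranking metric.
genes = ["g1", "g2", "g3", "g4"]
metric = [2.0, 1.0, -1.0, -2.0]
es_top = enrichment_score(genes, metric, {"g1", "g2"})
es_bottom = enrichment_score(genes, metric, {"g3", "g4"})
```

A set concentrated at the top of the ranking gives a positive score and a set concentrated at the bottom gives a negative one, so the sign indicates the direction of coordinated regulation.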