Microarrays

Microarray data represents the expression levels of up to tens of thousands of genes across a number of repeated experiments and measures the presence and abundance of gene expression in tissue. Using microarrays, the complete genome of an organism can be monitored simultaneously in response to a given treatment. For example, up to 40% of human genes can be expressed at one time.

Microarray data is extracted from scanned images, and the extracted intensities are normalized to obtain expression values. The resulting data set is a matrix with rows representing genes and columns representing different samples. Analysis of the data includes statistical tests for detecting differential expression (e.g. comparing different cell types), clustering algorithms to find structures and patterns, and classification methods to predict a biological or clinical outcome (e.g. disease vs. normal).
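To make the matrix layout and the idea of a differential expression test concrete, here is a minimal sketch in Python. It ranks genes by Welch's t-statistic between two groups of samples. The gene names, the expression values, and the column groupings are all hypothetical, and Welch's statistic is only one of several tests that could be used; this is not the method of any particular tool.

```python
from math import sqrt
from statistics import mean, variance


def welch_t(group_a, group_b):
    """Welch's t-statistic for two lists of expression values."""
    na, nb = len(group_a), len(group_b)
    se = sqrt(variance(group_a) / na + variance(group_b) / nb)
    return (mean(group_a) - mean(group_b)) / se


def rank_genes(genes, expression, cols_a, cols_b):
    """Return (gene, t) pairs sorted by decreasing |t|."""
    scores = []
    for name, row in zip(genes, expression):
        t = welch_t([row[j] for j in cols_a], [row[j] for j in cols_b])
        scores.append((name, t))
    return sorted(scores, key=lambda s: abs(s[1]), reverse=True)


# Hypothetical toy matrix: rows are genes, columns are samples.
genes = ["geneA", "geneB", "geneC"]
expression = [
    [5.1, 4.9, 5.0, 8.2, 7.9, 8.1],  # higher in the second group
    [3.0, 3.2, 2.9, 3.1, 3.0, 3.1],  # roughly constant
    [6.5, 7.0, 6.8, 4.1, 4.4, 3.9],  # lower in the second group
]
normal_cols = [0, 1, 2]   # assumed indices of the "normal" samples
disease_cols = [3, 4, 5]  # assumed indices of the "disease" samples

ranked = rank_genes(genes, expression, normal_cols, disease_cols)
for name, t in ranked:
    print(f"{name}\t{t:.2f}")
```

With real data, thousands of such tests are run at once, so the resulting statistics would normally be followed by a multiple-testing correction.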

From a statistical point of view, the analysis of microarrays is challenging mainly because of the high dimensionality of the data. The number of observed genes is typically many times larger than the number of samples.

Below we illustrate the hierarchical clustering method for microarrays using different linkage functions. To begin, load a gct (Gene Cluster Text file format) file from a local folder. A sample gct file can be downloaded here. Another sample file is leuGMP.gct. More data can be found at the Cancer Program Data Sets site of the Broad Institute.
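For readers who want to inspect a gct file outside the applet, the following is a minimal sketch of a reader in Python. It assumes the common version 1.2 layout: a version line, a line with the numbers of rows and columns, a tab-separated header beginning with Name and Description, and then one row per gene. The function name `read_gct` and the sample content are hypothetical, and the sketch does not handle missing values.

```python
import io


def read_gct(handle):
    """Parse a GCT stream into (names, descriptions, samples, matrix)."""
    version = handle.readline().strip()  # e.g. "#1.2"
    if not version.startswith("#1"):
        raise ValueError("unexpected version line: " + version)
    n_rows, n_cols = (int(x) for x in handle.readline().split()[:2])
    header = handle.readline().rstrip("\r\n").split("\t")
    samples = header[2:]  # columns after Name and Description
    names, descriptions, matrix = [], [], []
    for line in handle:
        if not line.strip():
            continue  # skip blank lines
        fields = line.rstrip("\r\n").split("\t")
        names.append(fields[0])
        descriptions.append(fields[1])
        matrix.append([float(v) for v in fields[2:]])
    if len(matrix) != n_rows or len(samples) != n_cols:
        raise ValueError("dimensions do not match the header line")
    return names, descriptions, samples, matrix


# Hypothetical two-gene, three-sample file content.
sample_text = (
    "#1.2\n"
    "2\t3\n"
    "Name\tDescription\tS1\tS2\tS3\n"
    "geneA\thypothetical kinase\t1.0\t2.0\t3.0\n"
    "geneB\thypothetical transporter\t4.5\t5.5\t6.5\n"
)
names, descriptions, samples, matrix = read_gct(io.StringIO(sample_text))
```

To read a file from disk, pass an open file object instead, for example `with open("leuGMP.gct") as f: read_gct(f)`.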

Next we show a snapshot of the applet in action.


1. Click Open to load a *.gct file.

2. Choose a Distance function and a Linkage method.

3. Click Cluster to do hierarchical clustering by rows and columns.

4. Move the mouse over the gray bar on the left to view detailed gene and sample information.

5. You may search the gene names and descriptions for a string. Enter a search word in the box to the right of the Search button and then click the button. Rows that contain the entered word will be highlighted in yellow.
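The clustering performed in steps 2 and 3 can be sketched in a few lines of Python. This is a naive agglomerative implementation for illustration only, not the applet's code. The two distance functions and the three linkage names (single, complete, average) are assumptions about typical choices, and the data are hypothetical.

```python
from math import sqrt


def euclidean(u, v):
    return sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def correlation_distance(u, v):
    """1 - Pearson correlation; assumes neither vector is constant."""
    mu, mv = sum(u) / len(u), sum(v) / len(v)
    cov = sum((a - mu) * (b - mv) for a, b in zip(u, v))
    su = sqrt(sum((a - mu) ** 2 for a in u))
    sv = sqrt(sum((b - mv) ** 2 for b in v))
    return 1.0 - cov / (su * sv)


# Each linkage reduces the list of pairwise distances between two clusters.
LINKAGES = {
    "single": min,
    "complete": max,
    "average": lambda ds: sum(ds) / len(ds),
}


def hierarchical_cluster(rows, distance=euclidean, linkage="average"):
    """Return the merge history as (members_a, members_b, distance) tuples.

    Members are tuples of original row indices.
    """
    combine = LINKAGES[linkage]
    clusters = [(i,) for i in range(len(rows))]
    merges = []
    while len(clusters) > 1:
        best = None
        # Find the closest pair of clusters under the chosen linkage.
        for x in range(len(clusters)):
            for y in range(x + 1, len(clusters)):
                ds = [distance(rows[i], rows[j])
                      for i in clusters[x] for j in clusters[y]]
                d = combine(ds)
                if best is None or d < best[0]:
                    best = (d, x, y)
        d, x, y = best
        merges.append((clusters[x], clusters[y], d))
        merged = clusters[x] + clusters[y]
        clusters = [c for k, c in enumerate(clusters) if k not in (x, y)]
        clusters.append(merged)
    return merges


# Hypothetical expression matrix: 4 genes (rows) by 3 samples (columns).
rows = [
    [1.0, 2.0, 3.0],
    [1.1, 2.1, 2.9],
    [9.0, 8.0, 7.0],
    [9.2, 7.9, 7.1],
]
row_merges = hierarchical_cluster(rows, euclidean, "average")

# Clustering by columns uses the transposed matrix.
columns = [list(c) for c in zip(*rows)]
col_merges = hierarchical_cluster(columns, euclidean, "complete")
```

The merge history is what a dendrogram draws: each entry joins two clusters at the height given by the distance. The naive search recomputes all pairwise distances at every step, which is fine for a toy matrix but too slow for thousands of genes.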




References

[1]   Eisen, M., Spellman, P., Brown, P., and Botstein, D., Cluster analysis and display of genome-wide expression patterns, 1998, Proc. Natl. Acad. Sci., Genetics, Vol. 95, pp. 14863-14868.

[2]   Tamayo, P., Ramaswamy, Cancer Genomics and Molecular Pattern Recognition, 2002, BROAD Institute.

