Analyzing microarray data using cluster analysis

Cited by: 113

Authors:

Shannon, W

Culverhouse, R

Duncan, J

Affiliations:

[1] Washington Univ, Sch Med, Dept Med, St Louis, MO 63110 USA

[2] Washington Univ, Sch Med, Div Biostat, St Louis, MO 63110 USA

Source:

PHARMACOGENOMICS | 2003, Vol. 4, No. 1

Keywords:

consensus methods; distance calculations; heat maps; hierarchical clustering; k-means clustering; Mantel statistics; microarrays; unsupervised learning;

DOI:

10.1517/phgs.4.1.41.22581

Chinese Library Classification (CLC) number:

R9 [Pharmacy];

Discipline classification code:

1007 ;

Abstract:

As pharmacogenetics researchers gather more detailed and complex data on gene polymorphisms that affect drug-metabolizing enzymes, drug target receptors and drug transporters, they will need access to advanced statistical tools to mine that data. These tools include approaches from classical biostatistics, such as logistic regression or linear discriminant analysis, and supervised learning methods from computer science, such as support vector machines and artificial neural networks. In this review, we present an overview of another class of models, cluster analysis, which will likely be less familiar to pharmacogenetics researchers. Cluster analysis is used to analyze data that is not known a priori to contain any specific subgroups. The goal is to use the data itself to identify meaningful or informative subgroups. Specifically, we will focus on demonstrating the use of distance-based methods of hierarchical clustering to analyze gene expression data.
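
As a rough illustration of what distance-based hierarchical clustering involves, the following minimal sketch merges expression profiles bottom-up. It is not taken from the review: the choice of Euclidean distance and average linkage, the function names, and the toy expression values are all assumptions made for this example.

```python
import math
from itertools import combinations


def euclidean(a, b):
    """Euclidean distance between two expression profiles (assumed metric)."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def average_linkage(c1, c2, profiles):
    """Mean pairwise distance between the members of two clusters."""
    total = sum(euclidean(profiles[i], profiles[j]) for i in c1 for j in c2)
    return total / (len(c1) * len(c2))


def hierarchical_clusters(profiles, n_clusters):
    """Agglomerative clustering: start with one cluster per profile and
    repeatedly merge the two closest clusters until n_clusters remain."""
    clusters = [[i] for i in range(len(profiles))]
    while len(clusters) > n_clusters:
        # Find the pair of clusters with the smallest average-linkage distance.
        a, b = min(
            combinations(range(len(clusters)), 2),
            key=lambda p: average_linkage(clusters[p[0]], clusters[p[1]], profiles),
        )
        # a < b always holds, so deleting b does not shift index a.
        clusters[a] = clusters[a] + clusters[b]
        del clusters[b]
    return [sorted(c) for c in clusters]


# Hypothetical toy data: rows are samples, columns are gene expression levels.
expression = [
    [2.1, 2.0, 0.1, 0.2],
    [1.9, 2.2, 0.0, 0.1],
    [0.1, 0.2, 2.0, 2.1],
    [0.0, 0.1, 2.2, 1.9],
]

groups = hierarchical_clusters(expression, 2)
print(groups)
```

No subgroup labels are supplied here; the grouping comes only from the pairwise distances, which is the sense in which the method is unsupervised. Other distance measures or linkage rules can produce different groupings on the same data.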

Pages: 41-52

Page count: 12
