Analyzing microarray data using cluster analysis

Citations: 113
Authors
Shannon, W
Culverhouse, R
Duncan, J
Affiliations
[1] Washington Univ, Sch Med, Dept Med, St Louis, MO 63110 USA
[2] Washington Univ, Sch Med, Div Biostat, St Louis, MO 63110 USA
Keywords
consensus methods; distance calculations; heat maps; hierarchical clustering; k-means clustering; Mantel statistics; microarrays; unsupervised learning
DOI
10.1517/phgs.4.1.41.22581
Chinese Library Classification
R9 [Pharmacy]
Discipline classification code
1007
Abstract
As pharmacogenetics researchers gather more detailed and complex data on gene polymorphisms that affect drug-metabolizing enzymes, drug target receptors and drug transporters, they will need access to advanced statistical tools to mine those data. These tools include approaches from classical biostatistics, such as logistic regression or linear discriminant analysis, and supervised learning methods from computer science, such as support vector machines and artificial neural networks. In this review, we present an overview of another class of models, cluster analysis, which will likely be less familiar to pharmacogenetics researchers. Cluster analysis is used to analyze data that are not known a priori to contain any specific subgroups. The goal is to use the data themselves to identify meaningful or informative subgroups. Specifically, we will focus on demonstrating the use of distance-based methods of hierarchical clustering to analyze gene expression data.
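To make the idea of distance-based hierarchical clustering concrete, the following is a minimal sketch and not the authors' method. It assumes Euclidean distance and average linkage, and the `profiles` values are made-up toy numbers standing in for expression measurements; the function name `average_linkage` is hypothetical.

```python
from itertools import combinations
from math import dist  # Euclidean distance, Python 3.8+


def average_linkage(points, n_clusters):
    """Agglomerative clustering: start with one cluster per point and
    repeatedly merge the two closest clusters until n_clusters remain."""
    clusters = [[i] for i in range(len(points))]

    def cluster_distance(a, b):
        # Average pairwise Euclidean distance between members of a and b.
        total = sum(dist(points[i], points[j]) for i in a for j in b)
        return total / (len(a) * len(b))

    while len(clusters) > n_clusters:
        # Find the pair of clusters with the smallest average distance.
        x, y = min(
            combinations(range(len(clusters)), 2),
            key=lambda p: cluster_distance(clusters[p[0]], clusters[p[1]]),
        )
        merged = clusters[x] + clusters[y]
        clusters = [c for k, c in enumerate(clusters) if k not in (x, y)]
        clusters.append(merged)
    return [sorted(c) for c in clusters]


# Toy data (assumption): five samples, three measurements each.
profiles = [
    (0.1, 0.2, 0.1),
    (0.2, 0.1, 0.0),
    (2.1, 1.9, 2.0),
    (1.9, 2.2, 2.1),
    (0.0, 0.1, 0.2),
]
groups = sorted(average_linkage(profiles, 2))
print(groups)
```

No subgroup labels are supplied to the procedure; the grouping comes only from the pairwise distances, which is the sense in which the data themselves identify the subgroups. Other distance measures or linkage rules could be substituted and may give different groupings.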
Pages: 41-52
Page count: 12