Statistical estimation of cluster boundaries in gene expressions profile data

被引:38
作者
Horimoto, K
Toh, H
机构
[1] Saga Med Sch, Lab Math, Saga 8498501, Japan
[2] Biomol Engn Res Inst, Dept Bioinformat, Suita, Osaka 5650874, Japan
关键词
D O I
10.1093/bioinformatics/17.12.1143
中图分类号
Q5 [生物化学];
学科分类号
071010 ; 081704 ;
摘要
Motivation: Gene expression profile data are rapidly accumulating due to advances in microarray techniques. The abundant data are analyzed by clustering procedures to extract the useful information about the genes inherent in the data. In the clustering analyses, the systematic determination of the boundaries of gene clusters, instead of by visual inspection and biological knowledge, still remains challenging. Results: We propose a statistical procedure to estimate the number of clusters in the hierarchical clustering of the expression profiles. Following the hierarchical clustering, the statistical property of the profiles at the node in the dendrogram is evaluated by a statistics-based value: the variance inflation factor in the multiple regression analysis. The evaluation leads to an automatic determination of the cluster boundaries without any additional analyses and any biological knowledge of the measured genes. The performance of the present procedure is demonstrated on the profiles of 2467 yeast genes, with very promising results.
引用
收藏
页码:1143 / 1151
页数:9
相关论文
共 25 条
[1]   Broad patterns of gene expression revealed by clustering analysis of tumor and normal colon tissues probed by oligonucleotide arrays [J].
Alon, U ;
Barkai, N ;
Notterman, DA ;
Gish, K ;
Ybarra, S ;
Mack, D ;
Levine, AJ .
PROCEEDINGS OF THE NATIONAL ACADEMY OF SCIENCES OF THE UNITED STATES OF AMERICA, 1999, 96 (12) :6745-6750
[2]  
[Anonymous], 2000, Introduction to Graphical Modelling
[3]  
[Anonymous], 1998, REGRESSION ANAL
[4]   Clustering gene expression patterns [J].
Ben-Dor, A ;
Shamir, R ;
Yakhini, Z .
JOURNAL OF COMPUTATIONAL BIOLOGY, 1999, 6 (3-4) :281-297
[5]  
Chatterjee S., 1977, REGRESSION ANAL EXAM
[6]   Genetic and physical maps of Saccharomyces cerevisiae [J].
Cherry, JM ;
Ball, C ;
Weng, S ;
Juvik, G ;
Schmidt, R ;
Adler, C ;
Dunn, B ;
Dwight, S ;
Riles, L ;
Mortimer, RK ;
Botstein, D .
NATURE, 1997, 387 (6632) :67-73
[7]   Exploring the metabolic and genetic control of gene expression on a genomic scale [J].
DeRisi, JL ;
Iyer, VR ;
Brown, PO .
SCIENCE, 1997, 278 (5338) :680-686
[8]   Cluster analysis and display of genome-wide expression patterns [J].
Eisen, MB ;
Spellman, PT ;
Brown, PO ;
Botstein, D .
PROCEEDINGS OF THE NATIONAL ACADEMY OF SCIENCES OF THE UNITED STATES OF AMERICA, 1998, 95 (25) :14863-14868
[9]  
Eisen MB, 1999, METHOD ENZYMOL, V303, P179
[10]  
GORDON AD, 1981, CLASSIFICATION