Analysis of temporal gene expression profiles: clustering by simulated annealing and determining the optimal number of clusters

Cited by: 121
Authors
Lukashin, AV [1]
Fuchs, R [1]
Affiliation
[1] Biogen Inc, Cambridge, MA 02142 USA
Keywords
DOI
10.1093/bioinformatics/17.5.405
Chinese Library Classification
Q5 [Biochemistry];
Discipline classification code
071010; 081704;
Abstract
Motivation: Cluster analysis of genome-wide expression data from DNA microarray hybridization studies has proved to be a useful tool for identifying biologically relevant groupings of genes and samples. In the present paper, we focus on several important issues related to clustering algorithms that have not yet been fully studied. Results: We describe a simple and robust algorithm for clustering temporal gene expression profiles that is based on the simulated annealing procedure. In general, this algorithm is guaranteed to eventually find the globally optimal distribution of genes over clusters. We introduce an iterative scheme that quantitatively evaluates the optimal number of clusters for each specific data set. The scheme is based on standard approaches used in regular statistical tests. The basic idea is to search for the optimal number of clusters while simultaneously optimizing the distribution of genes over clusters. The efficiency of the proposed algorithm has been evaluated by means of a reverse engineering experiment, that is, a situation in which the correct distribution of genes over clusters is known a priori. This statistically rigorous test has shown that our algorithm places more than 90% of genes into the correct clusters. Finally, the algorithm has been tested on real gene expression data (expression changes during the yeast cell cycle) for which the fundamental patterns of gene expression and the assignment of genes to clusters are well understood from numerous previous studies.
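To make the idea of clustering by simulated annealing concrete, the following is a minimal Python sketch for a fixed number of clusters. It is not the authors' implementation: the abstract does not specify the cost function, the move set, or the cooling schedule, so the within-cluster sum of squared distances, the single-gene reassignment move, the geometric cooling, and all names and parameter values (`within_cluster_cost`, `anneal_clusters`, `t_start`, `t_end`, `cooling`, `moves_per_temp`) are assumptions chosen for illustration. The iterative search for the optimal number of clusters described in the abstract is not shown.

```python
import math
import random


def within_cluster_cost(profiles, labels, k):
    """Assumed cost: sum of squared distances of each profile to its cluster mean."""
    dim = len(profiles[0])
    sums = [[0.0] * dim for _ in range(k)]
    counts = [0] * k
    for profile, c in zip(profiles, labels):
        counts[c] += 1
        for j, value in enumerate(profile):
            sums[c][j] += value
    cost = 0.0
    for profile, c in zip(profiles, labels):
        for j, value in enumerate(profile):
            cost += (value - sums[c][j] / counts[c]) ** 2
    return cost


def anneal_clusters(profiles, k, t_start=1.0, t_end=1e-3, cooling=0.95,
                    moves_per_temp=200, seed=0):
    """Assign each profile to one of k clusters by simulated annealing.

    Hypothetical schedule: geometric cooling from t_start down to t_end.
    Returns the best assignment seen and its cost.
    """
    rng = random.Random(seed)
    labels = [rng.randrange(k) for _ in profiles]
    cost = within_cluster_cost(profiles, labels, k)
    best_labels, best_cost = labels[:], cost
    t = t_start
    while t > t_end:
        for _ in range(moves_per_temp):
            # Propose moving one randomly chosen gene to another cluster.
            i = rng.randrange(len(profiles))
            old, new = labels[i], rng.randrange(k)
            if new == old:
                continue
            labels[i] = new
            new_cost = within_cluster_cost(profiles, labels, k)
            delta = new_cost - cost
            # Metropolis rule: always accept improvements, and accept
            # worsening moves with probability exp(-delta / t).
            if delta <= 0 or rng.random() < math.exp(-delta / t):
                cost = new_cost
                if cost < best_cost:
                    best_labels, best_cost = labels[:], cost
            else:
                labels[i] = old  # reject the move
        t *= cooling
    return best_labels, best_cost


if __name__ == "__main__":
    # Toy data: two well-separated groups of short "expression profiles".
    toy = [[0.0, 0.0], [0.1, 0.0], [0.0, 0.1],
           [5.0, 5.0], [5.1, 5.0], [5.0, 5.1]]
    print(anneal_clusters(toy, k=2))
```

The sketch recomputes the full cost after every proposed move, which keeps it short but is slow for large data sets; an incremental cost update would be the natural refinement.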
Pages: 405-414
Page count: 10