Comparative analysis of clustering methods for gene expression time course data

Cited by: 42
Authors
Costa, IG
de Carvalho, FDT
de Souto, MCP
Affiliations
[1] Univ Fed Rio Grande do Norte, Dept Informat & Matemat Aplicada, BR-59072970 Natal, RN, Brazil
[2] Univ Fed Pernambuco, Ctr Informat, Recife, PE, Brazil
Keywords
clustering methods; gene expression time series; unsupervised cross-validation; cluster validation
DOI
10.1590/S1415-47572004000400025
Chinese Library Classification
Q5 [Biochemistry]; Q7 [Molecular Biology]
Discipline classification code
071010; 081704
Abstract
This work presents a data-driven comparative study of clustering methods used to analyze gene expression time courses (time series). Five clustering methods from the gene expression literature are compared: agglomerative hierarchical clustering, CLICK, dynamical clustering, k-means, and self-organizing maps. The methods are evaluated with a k-fold cross-validation procedure adapted to unsupervised methods. Accuracy is assessed by comparing the partitions obtained in these experiments with gene annotation, such as protein function and series classification.
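The evaluation idea in the abstract can be illustrated with a minimal sketch. It is not the authors' procedure: the abstract does not say how held-out genes are assigned to clusters or which agreement index is used, so this sketch assumes k-means as the clustering method, nearest-centroid assignment for the test fold, and the adjusted Rand index for comparison with annotation. All function names and the toy profiles are hypothetical.

```python
import random
from collections import Counter
from math import comb


def nearest(p, centroids):
    """Index of the centroid closest to profile p (squared Euclidean distance)."""
    return min(
        range(len(centroids)),
        key=lambda i: sum((a - b) ** 2 for a, b in zip(p, centroids[i])),
    )


def kmeans(points, k, iters=50, seed=0):
    """Plain Lloyd's k-means; returns the list of centroids."""
    rng = random.Random(seed)
    centroids = [list(p) for p in rng.sample(points, k)]
    for _ in range(iters):
        groups = [[] for _ in range(k)]
        for p in points:
            groups[nearest(p, centroids)].append(p)
        # An empty cluster keeps its previous centroid.
        new = [
            [sum(col) / len(g) for col in zip(*g)] if g else centroids[i]
            for i, g in enumerate(groups)
        ]
        if new == centroids:
            break
        centroids = new
    return centroids


def adjusted_rand(labels_a, labels_b):
    """Adjusted Rand index between two labelings of the same items."""
    n = len(labels_a)
    sum_ij = sum(comb(c, 2) for c in Counter(zip(labels_a, labels_b)).values())
    sum_a = sum(comb(c, 2) for c in Counter(labels_a).values())
    sum_b = sum(comb(c, 2) for c in Counter(labels_b).values())
    expected = sum_a * sum_b / comb(n, 2)
    max_index = (sum_a + sum_b) / 2
    if max_index == expected:  # degenerate case, e.g. one cluster on both sides
        return 1.0
    return (sum_ij - expected) / (max_index - expected)


def unsupervised_cv(points, annotation, k_clusters, n_folds=3, seed=0):
    """Assumed scheme: cluster the training folds, assign held-out genes to the
    nearest centroid, and score that partition against the annotation."""
    idx = list(range(len(points)))
    random.Random(seed).shuffle(idx)
    folds = [idx[f::n_folds] for f in range(n_folds)]
    scores = []
    for f, test in enumerate(folds):
        train = [i for g, fold in enumerate(folds) if g != f for i in fold]
        centroids = kmeans([points[i] for i in train], k_clusters, seed=seed)
        predicted = [nearest(points[i], centroids) for i in test]
        scores.append(adjusted_rand(predicted, [annotation[i] for i in test]))
    return sum(scores) / len(scores)


# Synthetic toy data (not from the paper): two well-separated groups of
# three-point expression profiles, with a made-up binary annotation.
profiles = [[0.1 * i + t for t in range(3)] for i in range(6)]
profiles += [[10 + 0.1 * i + t for t in range(3)] for i in range(6)]
labels = [0] * 6 + [1] * 6
score = unsupervised_cv(profiles, labels, k_clusters=2)
```

Swapping `kmeans` for another method only requires a way to assign held-out profiles to the clusters found on the training folds; how to do that for hierarchical clustering or CLICK is a design choice this sketch does not address.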
Pages: 623-631
Number of pages: 9