An extensive comparison of recent classification tools applied to microarray data

Cited by: 285
Authors
Lee, JW
Lee, JB
Park, M
Song, SH
Affiliations
[1] Korea Univ, Dept Stat, Seoul 136701, South Korea
[2] Eulji Med Coll, Dept Pre Med, Taejon 301832, South Korea
Funding
National Research Foundation of Singapore;
Keywords
microarray; classification; feature selection;
DOI
10.1016/j.csda.2004.03.017
Chinese Library Classification number
TP39 [Computer applications];
Discipline classification codes
081203 ; 0835 ;
Abstract
Since most classification articles have applied a single technique to a single gene expression dataset, it is crucial to assess the performance of each method through a comprehensive comparative study. We evaluate by extensive comparison study extending Dudoit et al. (J. Amer. Statist. Assoc. 97 (2002) 77) the performance of recently developed classification methods in microarray experiment, and provide the guidelines for finding the most appropriate classification tools in various situations. We extend their comparison in three directions: more classification methods (21 methods), more datasets (7 datasets) and more gene selection techniques (3 methods). Our comparison study shows several interesting facts and provides the biologists and the biostatisticians some insights into the classification tools in microarray data analysis. This study also shows that the more sophisticated classifiers give better performances than classical methods such as kNN, DLDA, DQDA and the choice of gene selection method has much effect on the performance of the classification methods, and thus the classification methods should be considered together with the gene selection criteria. (C) 2004 Elsevier B.V. All rights reserved.
Pages: 869 - 885
Page count: 17
Related papers
34 in total
  • [1] Distinct types of diffuse large B-cell lymphoma identified by gene expression profiling
    Alizadeh, AA
    Eisen, MB
    Davis, RE
    Ma, C
    Lossos, IS
    Rosenwald, A
    Boldrick, JG
    Sabet, H
    Tran, T
    Yu, X
    Powell, JI
    Yang, LM
    Marti, GE
    Moore, T
    Hudson, J
    Lu, LS
    Lewis, DB
    Tibshirani, R
    Sherlock, G
    Chan, WC
    Greiner, TC
    Weisenburger, DD
    Armitage, JO
    Warnke, R
    Levy, R
    Wilson, W
    Grever, MR
    Byrd, JC
    Botstein, D
    Brown, PO
    Staudt, LM
    [J]. NATURE, 2000, 403 (6769) : 503 - 511
  • [2] Broad patterns of gene expression revealed by clustering analysis of tumor and normal colon tissues probed by oligonucleotide arrays
    Alon, U
    Barkai, N
    Notterman, DA
    Gish, K
    Ybarra, S
    Mack, D
    Levine, AJ
    [J]. PROCEEDINGS OF THE NATIONAL ACADEMY OF SCIENCES OF THE UNITED STATES OF AMERICA, 1999, 96 (12) : 6745 - 6750
  • [3] Selection bias in gene extraction on the basis of microarray gene-expression data
    Ambroise, C
    McLachlan, GJ
    [J]. PROCEEDINGS OF THE NATIONAL ACADEMY OF SCIENCES OF THE UNITED STATES OF AMERICA, 2002, 99 (10) : 6562 - 6566
  • [4] Random forests
    Breiman, L
    [J]. MACHINE LEARNING, 2001, 45 (01) : 5 - 32
  • [5] Breiman L, 1998, ANN STAT, V26, P801
  • [6] Breiman L., 1998, CLASSIFICATION REGRE
  • [7] Exploring the new world of the genome with DNA microarrays
    Brown, PO
    Botstein, D
    [J]. NATURE GENETICS, 1999, 21 (Suppl 1) : 33 - 37
  • [8] LIBSVM: A Library for Support Vector Machines
    Chang, Chih-Chung
    Lin, Chih-Jen
    [J]. ACM TRANSACTIONS ON INTELLIGENT SYSTEMS AND TECHNOLOGY, 2011, 2 (03)
  • [9] Boosting for tumor classification with gene expression data
    Dettling, M
    Bühlmann, P
    [J]. BIOINFORMATICS, 2003, 19 (09) : 1061 - 1069
  • [10] DING B, 2003, CLASSIFICATION USING