GEMS: A system for automated cancer diagnosis and biomarker discovery from microarray gene expression data

Cited by: 136
Authors
Statnikov, A [1]
Tsamardinos, I [1]
Dosbayev, Y [1]
Aliferis, CF [1]
Affiliations
[1] Vanderbilt Univ, Dept Biomed Informat, Discovery Syst Lab, Nashville, TN 37232 USA
Keywords
gene expression microarray analysis; decision support systems; neoplasms; diagnosis, computer-assisted; artificial intelligence
DOI
10.1016/j.ijmedinf.2005.05.002
Chinese Library Classification number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
The success of treatment of patients with cancer depends on establishing an accurate diagnosis. To this end, we have built a system called GEMS (gene expression model selector) for the automated development and evaluation of high-quality cancer diagnostic models and biomarker discovery from microarray gene expression data. To determine the best-performing diagnostic methodologies in this domain and equip the system with them, we first conducted a comprehensive evaluation of classification algorithms using 11 cancer microarray datasets. In this paper we present a preliminary evaluation of the system with five new datasets. The performance of the models produced automatically by GEMS is comparable to or better than the results obtained by human analysts. Additionally, we performed a cross-dataset evaluation of the system. This involved using one dataset to build a diagnostic model and to estimate its future performance, then applying this model to a different dataset and evaluating its performance there. We found that models produced by GEMS indeed perform well on independent samples and, furthermore, that the cross-validation performance estimates output by the system closely approximate the error obtained in the independent validation. GEMS is freely available for download for non-commercial use from http://www.gems-system.org. (C) 2005 Elsevier Ireland Ltd. All rights reserved.
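The cross-dataset evaluation described in the abstract can be illustrated with a minimal sketch. This is not the GEMS implementation: the nearest-centroid classifier, the synthetic "expression" data, and all function names (`make_dataset`, `fit_centroids`, `cross_val_accuracy`, and so on) are assumptions made purely for illustration. The sketch estimates performance by cross-validation on one dataset, then checks that estimate against a second, independent dataset.

```python
import random

def make_dataset(n_per_class, n_genes, shift, seed):
    """Hypothetical synthetic data: class 1 is shifted on the first 5 genes."""
    rng = random.Random(seed)
    X, y = [], []
    for label in (0, 1):
        for _ in range(n_per_class):
            row = [rng.gauss(0.0, 1.0) for _ in range(n_genes)]
            if label == 1:
                for g in range(5):
                    row[g] += shift
            X.append(row)
            y.append(label)
    return X, y

def fit_centroids(X, y):
    """Stand-in classifier (an assumption): one mean profile per class."""
    centroids = {}
    for label in set(y):
        rows = [x for x, t in zip(X, y) if t == label]
        centroids[label] = [sum(col) / len(rows) for col in zip(*rows)]
    return centroids

def predict(centroids, x):
    """Assign the class whose centroid is closest in squared Euclidean distance."""
    def sqdist(c):
        return sum((a - b) ** 2 for a, b in zip(x, c))
    return min(centroids, key=lambda label: sqdist(centroids[label]))

def accuracy(centroids, X, y):
    hits = sum(predict(centroids, x) == t for x, t in zip(X, y))
    return hits / len(y)

def cross_val_accuracy(X, y, k=5, seed=0):
    """k-fold cross-validation: each sample is predicted by a model that never saw it."""
    idx = list(range(len(y)))
    random.Random(seed).shuffle(idx)
    folds = [idx[i::k] for i in range(k)]
    correct = 0
    for fold in folds:
        held_out = set(fold)
        train = [i for i in idx if i not in held_out]
        model = fit_centroids([X[i] for i in train], [y[i] for i in train])
        correct += sum(predict(model, X[i]) == y[i] for i in fold)
    return correct / len(y)

# Dataset A: build the model and estimate its future performance.
X_a, y_a = make_dataset(n_per_class=30, n_genes=50, shift=2.0, seed=1)
cv_estimate = cross_val_accuracy(X_a, y_a)

# Dataset B: independent samples, used only for validation.
X_b, y_b = make_dataset(n_per_class=30, n_genes=50, shift=2.0, seed=2)
final_model = fit_centroids(X_a, y_a)
independent_accuracy = accuracy(final_model, X_b, y_b)

print(f"cross-validation estimate on A: {cv_estimate:.3f}")
print(f"accuracy on independent B:      {independent_accuracy:.3f}")
```

If the cross-validation estimate is trustworthy, the two printed numbers should be close. Here both datasets come from the same synthetic distribution, which is an idealization; real datasets from different laboratories would typically differ more.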
Pages: 491-503
Number of pages: 13