Applications of support vector machines to cancer classification with microarray data

Cited by: 126
Authors
Chu, F [1]
Wang, LP [1]
Affiliation
[1] Nanyang Technol Univ, Sch Elect & Elect Engn, Singapore 639798, Singapore
Keywords
cancer classification; gene expression data; microarray; support vector machine
DOI
10.1142/S0129065705000396
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Subject classification codes
081104; 0812; 0835; 1405
Abstract
Microarray gene expression data usually have a large number of dimensions, e.g., over ten thousand genes, and a small number of samples, e.g., a few tens of patients. In this paper, we use the support vector machine (SVM) for cancer classification with microarray data. Dimensionality reduction methods, such as principal components analysis (PCA), class-separability measure, Fisher ratio, and t-test, are used for gene selection. A voting scheme is then employed to do multi-group classification by k(k - 1) binary SVMs. We are able to obtain the same classification accuracy but with much fewer features compared to other published results.
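As an illustration of the gene selection step mentioned in the abstract, the sketch below ranks genes by a Fisher ratio for a two-class problem. It assumes one commonly used form of the ratio (squared difference of class means divided by the sum of class variances); the paper's exact definition may differ. The function names `fisher_ratio` and `rank_genes` and the toy data are hypothetical and are not taken from the paper.

```python
from statistics import mean, pvariance


def fisher_ratio(values_a, values_b):
    """Fisher ratio of one gene (assumed form):
    (mean_a - mean_b)^2 / (var_a + var_b)."""
    diff = mean(values_a) - mean(values_b)
    denom = pvariance(values_a) + pvariance(values_b)
    if denom == 0:
        # Assumption: a gene with zero within-class variance either
        # separates the classes perfectly or carries no information.
        return float("inf") if diff != 0 else 0.0
    return diff ** 2 / denom


def rank_genes(samples, labels, top_n):
    """Return the indices of the top_n genes, highest Fisher ratio first.

    samples: list of expression vectors, one per patient
    labels:  list of class labels (0 or 1), one per patient
    """
    class_a = [s for s, y in zip(samples, labels) if y == 0]
    class_b = [s for s, y in zip(samples, labels) if y == 1]
    n_genes = len(samples[0])
    scores = [
        fisher_ratio([s[j] for s in class_a], [s[j] for s in class_b])
        for j in range(n_genes)
    ]
    order = sorted(range(n_genes), key=lambda j: scores[j], reverse=True)
    return order[:top_n]


# Toy data: 4 patients, 3 genes. Gene 0 differs strongly between classes.
samples = [
    [1.0, 5.0, 0.2],
    [1.2, 4.0, 0.1],
    [3.0, 5.5, 0.3],
    [3.1, 4.2, 0.2],
]
labels = [0, 0, 1, 1]
selected = rank_genes(samples, labels, top_n=1)
```

The selected gene indices would then be used to reduce each expression vector before training a classifier such as an SVM.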
Pages: 475-484
Page count: 10