Robust Feature Selection for Microarray Data Based on Multicriterion Fusion

Cited by: 107
Authors
Yang, Feng [1]
Mao, K. Z. [1]
Affiliations
[1] Nanyang Technol Univ, Div Control & Instrumentat, Sch Elect & Elect Engn, Coll Engn,Biomed Elect Lab, Singapore 639798, Singapore
Keywords
Feature selection; multicriterion fusion; recursive feature elimination; robustness; classification; GENE SELECTION; CLASSIFICATION; CANCER; PREDICTION; VALIDATION; STABILITY; STRATEGY;
DOI
10.1109/TCBB.2010.103
Chinese Library Classification number
Q5 [Biochemistry];
Discipline classification code
071010 ; 081704 ;
Abstract
Feature selection often aims to select a compact feature subset to build a pattern classifier with reduced complexity, so as to achieve improved classification performance. From the perspective of pattern analysis, producing a stable or robust solution is also a desired property of a feature selection algorithm. However, the issue of robustness is often overlooked in feature selection. In this study, we analyze the robustness issue in feature selection for high-dimensional, small-sample gene-expression data, and propose to improve the robustness of feature selection algorithms by using multiple feature selection evaluation criteria. Based on this idea, a multicriterion fusion-based recursive feature elimination (MCF-RFE) algorithm is developed with the goal of improving both classification performance and stability of feature selection results. Experimental studies on five gene-expression data sets show that the MCF-RFE algorithm outperforms the commonly used benchmark feature selection algorithm SVM-RFE.
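
The abstract does not state which evaluation criteria MCF-RFE combines or how their outputs are fused, so the following is only an illustrative sketch of the general idea: backward feature elimination driven by a rank-sum fusion of several criteria. The two criteria (`fisher_score`, `abs_correlation`), the rank-sum fusion rule, and all function names are assumptions made for this example and are not taken from the paper. Both criteria are univariate stand-ins, so their scores do not change between rounds; in SVM-RFE the criterion is recomputed from a retrained classifier at each round.

```python
from statistics import mean, pvariance


def fisher_score(col, y):
    """Assumed criterion 1: squared class-mean gap over summed class variances."""
    a = [v for v, label in zip(col, y) if label == 0]
    b = [v for v, label in zip(col, y) if label == 1]
    return (mean(a) - mean(b)) ** 2 / (pvariance(a) + pvariance(b) + 1e-12)


def abs_correlation(col, y):
    """Assumed criterion 2: absolute Pearson correlation with the labels."""
    mx, my = mean(col), mean(y)
    cov = sum((v - mx) * (t - my) for v, t in zip(col, y))
    sx = sum((v - mx) ** 2 for v in col) ** 0.5
    sy = sum((t - my) ** 2 for t in y) ** 0.5
    return abs(cov) / (sx * sy + 1e-12)


def ranks(scores):
    """Map scores to ranks, where rank 0 is the lowest (least relevant) score."""
    order = sorted(range(len(scores)), key=lambda i: scores[i])
    result = [0] * len(scores)
    for rank, i in enumerate(order):
        result[i] = rank
    return result


def fused_rfe(X, y, n_keep, criteria=(fisher_score, abs_correlation)):
    """Drop one feature per round: the one with the lowest summed rank."""
    remaining = list(range(len(X[0])))
    while len(remaining) > n_keep:
        cols = [[row[j] for row in X] for j in remaining]
        per_criterion = [ranks([c(col, y) for col in cols]) for c in criteria]
        fused = [sum(r[i] for r in per_criterion) for i in range(len(remaining))]
        worst = min(range(len(remaining)), key=lambda i: fused[i])
        del remaining[worst]
    return remaining


# Toy data (hypothetical): 4 samples, 3 features, binary labels.
y = [0, 0, 1, 1]
X = [
    [0.0, 1.0, 0.2],
    [0.1, 0.0, 0.4],
    [1.0, 1.0, 0.3],
    [0.9, 0.0, 0.6],
]
print(fused_rfe(X, y, n_keep=1))
```

Summing ranks rather than raw scores keeps criteria with different scales comparable, which is one simple way to let several criteria vote on which feature to eliminate.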
Pages: 1080-1092
Number of pages: 13