Feature selection based on mutual information: Criteria of max-dependency, max-relevance, and min-redundancy

Cited by: 8467
Authors
Peng, HC
Long, FH
Ding, C
Affiliations
[1] Univ Calif Berkeley, Lawrence Berkeley Natl Lab, Berkeley, CA 94720 USA
[2] Lawrence Berkeley Lab, Computat Res Div, Berkeley, CA 94720 USA
Keywords
feature selection; mutual information; minimal redundancy; maximal relevance; maximal dependency; classification
DOI
10.1109/TPAMI.2005.159
Chinese Library Classification
TP18 [Artificial Intelligence Theory]
Discipline Codes
081104 [Pattern Recognition and Intelligent Systems]; 0812 [Computer Science and Technology]; 0835 [Software Engineering]; 1405 [Intelligent Science and Technology]
Abstract
Feature selection is an important problem for pattern classification systems. We study how to select good features according to the maximal statistical dependency criterion based on mutual information. Because of the difficulty in directly implementing the maximal dependency condition, we first derive an equivalent form, called minimal-redundancy-maximal-relevance criterion (mRMR), for first-order incremental feature selection. Then, we present a two-stage feature selection algorithm by combining mRMR and other more sophisticated feature selectors (e.g., wrappers). This allows us to select a compact set of superior features at very low cost. We perform extensive experimental comparison of our algorithm and other methods using three different classifiers (naive Bayes, support vector machine, and linear discriminant analysis) and four different data sets (handwritten digits, arrhythmia, NCI cancer cell lines, and lymphoma tissues). The results confirm that mRMR leads to promising improvement on feature selection and classification accuracy.
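The first-order incremental selection the abstract describes can be sketched as follows. This is a minimal illustration, not the authors' implementation: it uses a simple plug-in mutual-information estimator for discrete features and the difference form of the mRMR score (relevance minus mean redundancy); the function names are my own.

```python
import numpy as np

def mutual_info(x, y):
    """Plug-in estimate of I(x; y) in nats for two discrete 1-D arrays."""
    mi = 0.0
    for a in np.unique(x):
        for b in np.unique(y):
            pxy = np.mean((x == a) & (y == b))  # joint probability estimate
            if pxy > 0:
                px = np.mean(x == a)
                py = np.mean(y == b)
                mi += pxy * np.log(pxy / (px * py))
    return mi

def mrmr(X, y, k):
    """Greedy first-order mRMR: start with the most relevant feature, then
    repeatedly add the candidate maximizing relevance I(f; y) minus its
    mean redundancy with the already-selected features."""
    n_features = X.shape[1]
    relevance = [mutual_info(X[:, j], y) for j in range(n_features)]
    selected = [int(np.argmax(relevance))]
    while len(selected) < k:
        best, best_score = None, -np.inf
        for j in range(n_features):
            if j in selected:
                continue
            redundancy = np.mean([mutual_info(X[:, j], X[:, s])
                                  for s in selected])
            score = relevance[j] - redundancy
            if score > best_score:
                best, best_score = j, score
        selected.append(best)
    return selected
```

For continuous features the paper's setting requires discretization or a density-based MI estimate before this greedy loop applies; the sketch also omits the second-stage wrapper refinement mentioned in the abstract.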
Pages: 1226-1238 (13 pages)
Cited References
31 records
[1]
Distinct types of diffuse large B-cell lymphoma identified by gene expression profiling [J].
Alizadeh, AA ;
Eisen, MB ;
Davis, RE ;
Ma, C ;
Lossos, IS ;
Rosenwald, A ;
Boldrick, JG ;
Sabet, H ;
Tran, T ;
Yu, X ;
Powell, JI ;
Yang, LM ;
Marti, GE ;
Moore, T ;
Hudson, J ;
Lu, LS ;
Lewis, DB ;
Tibshirani, R ;
Sherlock, G ;
Chan, WC ;
Greiner, TC ;
Weisenburger, DD ;
Armitage, JO ;
Warnke, R ;
Levy, R ;
Wilson, W ;
Grever, MR ;
Byrd, JC ;
Botstein, D ;
Brown, PO ;
Staudt, LM .
NATURE, 2000, 403 (6769) :503-511
[2]
[Anonymous], P AAAI FALL S REL
[3]
A tutorial on Support Vector Machines for pattern recognition [J].
Burges, CJC .
DATA MINING AND KNOWLEDGE DISCOVERY, 1998, 2 (02) :121-167
[4]
Cover T. M., 2005, ELEM INF THEORY, DOI 10.1002/047174882X
[5]
The best two independent measurements are not the two best [J].
COVER, TM .
IEEE TRANSACTIONS ON SYSTEMS MAN AND CYBERNETICS, 1974, SMC-4 (01) :116-117
[6]
Minimum redundancy feature selection from microarray gene expression data [J].
Ding, C ;
Peng, HC .
PROCEEDINGS OF THE 2003 IEEE BIOINFORMATICS CONFERENCE, 2003, :523-528
[7]
Duin RPW, 2000, LECT NOTES COMPUT SC, V1857, P16
[8]
Hadley SW, 1996, P ANN M AM ASS PHYS
[9]
A Bayesian morphometry algorithm [J].
Herskovits, EH ;
Peng, HC ;
Davatzikos, C .
IEEE TRANSACTIONS ON MEDICAL IMAGING, 2004, 23 (06) :723-737
[10]
A comparison of methods for multiclass support vector machines [J].
Hsu, CW ;
Lin, CJ .
IEEE TRANSACTIONS ON NEURAL NETWORKS, 2002, 13 (02) :415-425