A novel feature selection method based on normalized mutual information

Cited by: 137
Authors
La The Vinh [1]
Lee, Sungyoung [1]
Park, Young-Tack [2, 3]
d'Auriol, Brian J. [1]
Affiliations
[1] Kyung Hee Univ, Dept Comp Engn, Seoul, South Korea
[2] Soongsil Univ, Sch IT, Seoul, South Korea
[3] Soongsil Univ, Sch Comp, Seoul, South Korea
Keywords
Feature selection; Mutual information; Minimal redundancy; Maximal relevance; Subset selection
DOI
10.1007/s10489-011-0315-y
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
In this paper, a novel feature selection method based on the normalization of the well-known mutual information measurement is presented. Our method is derived from an existing approach, the max-relevance and min-redundancy (mRMR) approach. We, however, propose to normalize the mutual information used in the method so that the domination of the relevance or of the redundancy can be eliminated. We borrow some commonly used recognition models including Support Vector Machine (SVM), k-Nearest-Neighbor (kNN), and Linear Discriminant Analysis (LDA) to compare our algorithm with the original (mRMR) and a recently improved version of the mRMR, the Normalized Mutual Information Feature Selection (NMIFS) algorithm. To avoid data-specific statements, we conduct our classification experiments using various datasets from the UCI machine learning repository. The results confirm that our feature selection method is more robust than the others with regard to classification accuracy.
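The selection idea described in the abstract can be sketched as a greedy loop that scores each candidate feature by its relevance to the class label minus its average redundancy with the features already chosen, with both terms normalized to a common scale. The abstract does not state the normalization, so this sketch assumes one common choice, dividing the mutual information by the smaller of the two entropies; that choice, the function names, and the restriction to discrete features are assumptions for illustration and are not taken from the paper.

```python
import math
from collections import Counter


def entropy(values):
    """Shannon entropy (in bits) of a sequence of discrete values."""
    n = len(values)
    return -sum((c / n) * math.log2(c / n) for c in Counter(values).values())


def mutual_information(xs, ys):
    """I(X;Y) = H(X) + H(Y) - H(X,Y) for two discrete sequences."""
    return entropy(xs) + entropy(ys) - entropy(list(zip(xs, ys)))


def normalized_mi(xs, ys):
    """Mutual information divided by min(H(X), H(Y)).

    ASSUMPTION: this normalization is one common choice; the abstract
    does not say which normalization the authors use.
    """
    denom = min(entropy(xs), entropy(ys))
    return mutual_information(xs, ys) / denom if denom > 0 else 0.0


def select_features(columns, labels, k):
    """Greedily pick k column indices by normalized relevance minus redundancy."""
    selected = []
    remaining = list(range(len(columns)))

    def score(j):
        relevance = normalized_mi(columns[j], labels)
        if not selected:
            return relevance
        # Average normalized redundancy with the already selected features.
        redundancy = sum(
            normalized_mi(columns[j], columns[s]) for s in selected
        ) / len(selected)
        return relevance - redundancy

    while remaining and len(selected) < k:
        best = max(remaining, key=score)
        selected.append(best)
        remaining.remove(best)
    return selected
```

Because both terms lie roughly in the range 0 to 1 after normalization, neither relevance nor redundancy dominates the score. In this sketch, an exact duplicate of an already selected feature receives a redundancy of about 1 and is passed over in favor of a less relevant but complementary feature.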
Pages: 100-120
Page count: 21