MIFS-ND: A mutual information-based feature selection method

Cited by: 289
Authors
Hoque, N. [1]
Bhattacharyya, D. K. [1]
Kalita, J. K. [2]
Affiliations
[1] Tezpur Univ, Dept Comp Sci & Engn, Tezpur 784028, Assam, India
[2] Univ Colorado, Dept Comp Sci, Colorado Springs, CO 80933 USA
Keywords
Features; Mutual information; Relevance; Classification; Algorithm
DOI
10.1016/j.eswa.2014.04.019
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Feature selection chooses a subset of relevant features for effective classification of data. In high-dimensional data classification, the performance of a classifier often depends on the feature subset used. In this paper, we introduce a greedy feature selection method based on mutual information. The method combines feature-feature mutual information and feature-class mutual information to find an optimal subset of features that minimizes redundancy among the selected features and maximizes their relevance to the class. The effectiveness of the selected feature subset is evaluated using multiple classifiers on multiple datasets. Compared with several competing feature selection techniques, our method performs significantly well, in terms of both classification accuracy and execution time, on twelve real-life datasets of varied dimensionality and number of instances. (C) 2014 Elsevier Ltd. All rights reserved.
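
The greedy idea described in the abstract can be illustrated with a minimal sketch. This is not the authors' MIFS-ND algorithm: the scoring rule used here (feature-class mutual information minus the mean feature-feature mutual information with the features already selected) is an assumption chosen for simplicity, and the function names `mutual_information` and `greedy_select`, as well as the toy data, are hypothetical.

```python
from collections import Counter
from math import log2


def mutual_information(xs, ys):
    """Empirical mutual information (in bits) of two discrete sequences."""
    n = len(xs)
    px, py, pxy = Counter(xs), Counter(ys), Counter(zip(xs, ys))
    return sum(
        (c / n) * log2(c * n / (px[x] * py[y]))
        for (x, y), c in pxy.items()
    )


def greedy_select(columns, labels, k):
    """Greedily pick up to k feature indices.

    Assumed score (not taken from the paper): relevance to the class
    minus mean redundancy with the features selected so far.
    """
    relevance = [mutual_information(col, labels) for col in columns]
    selected = []
    remaining = list(range(len(columns)))

    def score(j):
        if not selected:
            return relevance[j]
        redundancy = sum(
            mutual_information(columns[j], columns[s]) for s in selected
        ) / len(selected)
        return relevance[j] - redundancy

    while remaining and len(selected) < k:
        best = max(remaining, key=score)  # ties go to the lowest index
        selected.append(best)
        remaining.remove(best)
    return selected


# Toy data (made up): the class is determined by two independent bits a and b.
a = [0, 0, 1, 1, 0, 0, 1, 1]
b = [0, 1, 0, 1, 0, 1, 0, 1]
labels = [2 * ai + bi for ai, bi in zip(a, b)]
features = [a, list(a), b]  # feature 1 is an exact duplicate of feature 0

chosen = greedy_select(features, labels, 2)
print(chosen)  # [0, 2]
```

In this toy case all three features are equally relevant to the class, but the duplicate (index 1) is skipped because it is fully redundant with the first pick, so the independent feature (index 2) is selected instead.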
Pages: 6371-6385
Page count: 15