Estimating optimal feature subsets using efficient estimation of high-dimensional mutual information

Cited by: 120
Authors
Chow, TWS [1]
Huang, D [1]
Affiliations
[1] City Univ Hong Kong, Hong Kong, Hong Kong, Peoples R China
Source
IEEE TRANSACTIONS ON NEURAL NETWORKS | 2005, Vol. 16, No. 1
Keywords
feature selection; Parzen window estimator; quadratic mutual information (QMI); supervised data compression
DOI
10.1109/TNN.2004.841414
Chinese Library Classification
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
A novel feature selection method using the concept of mutual information (MI) is proposed in this paper. In all MI-based feature selection methods, effective and efficient estimation of high-dimensional MI is crucial. In this paper, a pruned Parzen window estimator and the quadratic mutual information (QMI) are combined to address this problem. The results show that the proposed approach can estimate the MI in an effective and efficient way. With this contribution, a novel feature selection method is developed to identify the salient features one by one. Also, the appropriate feature subsets for classification can be reliably estimated. The proposed methodology is thoroughly tested in four different classification applications in which the number of features ranged from fewer than 10 to over 15,000. The presented results are very promising and corroborate the contribution of the proposed feature selection methodology.
Pages: 213-224
Page count: 12
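
The abstract names two ingredients, a Parzen window density estimate and quadratic mutual information (QMI), and a procedure that picks salient features one by one. The sketch below illustrates how these can fit together; it is not the authors' implementation. It assumes the Euclidean-distance form of QMI with an isotropic Gaussian kernel, omits the pruning step mentioned in the abstract, and uses hypothetical names (`quadratic_mi`, `greedy_select`, `sigma`) chosen only for illustration.

```python
import numpy as np


def quadratic_mi(X, y, sigma=1.0):
    """Euclidean-distance QMI between features X (n, d) and class labels y (n,).

    Assumption: densities are Parzen estimates with an isotropic Gaussian
    kernel of variance sigma**2. No pruning of samples is performed.
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y)
    n, d = X.shape

    # Pairwise squared Euclidean distances between all samples.
    sq = np.sum((X[:, None, :] - X[None, :, :]) ** 2, axis=-1)

    # The integral of a product of two Gaussian kernels (variance sigma^2 each)
    # is a Gaussian of variance 2 * sigma^2 evaluated at the sample difference.
    var = 2.0 * sigma ** 2
    K = np.exp(-sq / (2.0 * var)) / (2.0 * np.pi * var) ** (d / 2.0)

    classes, counts = np.unique(y, return_counts=True)
    priors = counts / n

    v_joint = 0.0  # sum over classes of the integral of p(x, c)^2
    v_cross = 0.0  # sum over classes of P(c) * integral of p(x, c) p(x)
    for c, p in zip(classes, priors):
        mask = y == c
        v_joint += K[np.ix_(mask, mask)].sum()
        v_cross += p * K[mask, :].sum()
    # sum over classes of P(c)^2 * integral of p(x)^2
    v_marg = np.sum(priors ** 2) * K.sum()

    return (v_joint - 2.0 * v_cross + v_marg) / n ** 2


def greedy_select(X, y, k, sigma=1.0):
    """Forward selection: at each step add the feature that maximises QMI."""
    X = np.asarray(X, dtype=float)
    selected = []
    remaining = list(range(X.shape[1]))
    for _ in range(k):
        scores = [quadratic_mi(X[:, selected + [j]], y, sigma) for j in remaining]
        best = remaining[int(np.argmax(scores))]
        selected.append(best)
        remaining.remove(best)
    return selected
```

This version builds the full n-by-n kernel matrix, so its cost grows quadratically with the number of samples; reducing that cost is the kind of problem the pruned estimator in the abstract is meant to address.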