Input feature selection for classification problems

Cited by: 664
Authors
Kwak, N [1]
Choi, CH
Affiliations
[1] Seoul Natl Univ, Sch Elect Engn, ERC ACI, Seoul 151744, South Korea
[2] Seoul Natl Univ, Sch Elect Engn, ASRI, Seoul 151744, South Korea
Source
IEEE TRANSACTIONS ON NEURAL NETWORKS | 2002, Vol. 13, No. 1
Keywords
feature selection; mutual information; neural networks (NNs); orthogonal array; Taguchi method;
DOI
10.1109/72.977291
CLC number
TP18 [Artificial Intelligence Theory];
Discipline codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Feature selection plays an important role in classifying systems such as neural networks (NNs). The attributes describing a dataset may be relevant, irrelevant, or redundant, and since a dataset can be huge, reducing the number of attributes by selecting only the relevant ones is desirable; in doing so, higher performance with lower computational effort can be expected. In this paper, we propose two feature selection algorithms. The limitation of the mutual information feature selector (MIFS) is analyzed, and a method to overcome it is studied. One of the proposed algorithms makes more careful use of the mutual information between input attributes and output classes than MIFS does. We demonstrate that the proposed method can match the performance of the ideal greedy selection algorithm when information is distributed uniformly, at nearly the same computational load as MIFS. In addition, another feature selection algorithm, based on the Taguchi method, is proposed to address the question of how to identify good features with as few experiments as possible. The proposed algorithms are applied to several classification problems and compared with MIFS. The two algorithms can be combined to complement each other's limitations; the combined algorithm performed well in several experiments and should prove a useful method for selecting features in classification problems.
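The MIFS criterion analyzed in the abstract (due to Battiti) greedily selects each feature by its mutual information with the output class, penalized by its mutual information with the features already selected. A minimal Python sketch for discrete-valued features follows; the function names, the example feature set, and the default beta are illustrative assumptions, not taken from the paper.

```python
# Sketch of MIFS-style greedy feature selection (assumed interpretation
# of the criterion I(f;C) - beta * sum_{s in S} I(f;s), discrete data only).
import math
from collections import Counter

def mutual_information(x, y):
    """I(X;Y) in bits for two equal-length sequences of discrete values."""
    n = len(x)
    px, py, pxy = Counter(x), Counter(y), Counter(zip(x, y))
    return sum((c / n) * math.log2((c / n) / ((px[a] / n) * (py[b] / n)))
               for (a, b), c in pxy.items())

def mifs(features, labels, k, beta=0.5):
    """Greedily pick k feature names maximizing relevance minus
    beta-weighted redundancy with the already-selected features."""
    remaining = dict(features)  # name -> column of discrete values
    selected = []
    while remaining and len(selected) < k:
        def score(name):
            relevance = mutual_information(remaining[name], labels)
            redundancy = sum(mutual_information(remaining[name], features[s])
                             for s in selected)
            return relevance - beta * redundancy
        best = max(remaining, key=score)
        selected.append(best)
        del remaining[best]
    return selected

# Illustrative dataset: f1 determines the class, f2 is a redundant copy
# (the inverse of f1), and f3 is irrelevant to the class.
labels = [0, 0, 1, 1, 0, 0, 1, 1]
feats = {"f1": labels[:],
         "f2": [1 - v for v in labels],
         "f3": [0, 1, 0, 1, 0, 1, 0, 1]}
print(mifs(feats, labels, 2, beta=2.0))
```

With a large beta the redundancy penalty dominates, so after picking `f1` the selector skips the redundant `f2` in favor of `f3`; with a small beta it would take `f2` instead. This sensitivity to beta is one facet of the MIFS limitation the paper analyzes.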
Pages: 143-159
Page count: 17