Using rough sets with heuristics for feature selection

Cited by: 241
Authors
Zhong, N
Dong, J
Ohsuga, S
Affiliations
[1] Maebashi Inst Technol, Dept Informat Engn, Maebashi, Gumma 3710816, Japan
[2] Waseda Univ, Dept Informat & Comp Sci, Shinjuku Ku, Tokyo 169, Japan
Keywords
feature selection; rough sets; heuristics; inductive learning; knowledge discovery in databases
DOI
10.1023/A:1011219601502
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Practical machine learning algorithms are known to lose prediction accuracy when faced with many features (also called attributes) that are not necessary for rule discovery. To cope with this problem, many methods for selecting a subset of features have been proposed. Two typical ones are the filter approach, which selects a feature subset in a preprocessing step, and the wrapper approach, which searches the space of possible feature subsets for an optimal one, using the induction algorithm itself as part of the evaluation function. The filter approach is faster, but it selects features blindly, without considering the performance of induction. The wrapper approach can find optimal feature subsets, but its time and space complexity make it hard to use. In this paper, we propose an algorithm that uses rough set theory with greedy heuristics for feature selection. Feature selection proceeds as in the filter approach, but the evaluation criterion is tied to the performance of induction: we select the features that do not damage the performance of induction.
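To make the idea of greedy, rough-set-based selection concrete, here is a minimal sketch. It is not the authors' algorithm, whose specific heuristics the abstract does not describe; it is a generic forward search that uses the standard rough-set dependency degree (the fraction of objects in the positive region) as its criterion. The function names, the dictionary-based table layout, and the toy data are all assumptions made for illustration.

```python
from collections import defaultdict


def dependency(rows, attrs, decision):
    """Rough-set dependency degree: fraction of rows in the positive region.

    A row is in the positive region when every row that shares its values
    on `attrs` also shares its decision value.
    """
    classes = defaultdict(list)
    for row in rows:
        key = tuple(row[a] for a in attrs)
        classes[key].append(row[decision])
    consistent = sum(len(ds) for ds in classes.values() if len(set(ds)) == 1)
    return consistent / len(rows)


def greedy_reduct(rows, condition_attrs, decision):
    """Greedy forward selection (illustrative, hypothetical helper).

    Repeatedly adds the attribute that most increases the dependency degree
    until the selected subset classifies as consistently as the full set.
    """
    target = dependency(rows, condition_attrs, decision)
    selected = []
    remaining = list(condition_attrs)
    while remaining and dependency(rows, selected, decision) < target:
        best = max(
            remaining,
            key=lambda a: dependency(rows, selected + [a], decision),
        )
        selected.append(best)
        remaining.remove(best)
    return selected


# Toy decision table (invented): attribute "b" alone determines "d".
rows = [
    {"a": 0, "b": 0, "c": 0, "d": "no"},
    {"a": 0, "b": 1, "c": 0, "d": "yes"},
    {"a": 1, "b": 0, "c": 1, "d": "no"},
    {"a": 1, "b": 1, "c": 1, "d": "yes"},
]
selected = greedy_reduct(rows, ["a", "b", "c"], "d")
print(selected)
```

On this toy table the search stops after picking "b", because that single attribute already yields the same dependency degree as all three attributes together. Like a filter, the sketch never runs an induction algorithm, yet its stopping rule is tied to how consistently the retained features can classify the data.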
Pages: 199-214
Number of pages: 16