Toward integrating feature selection algorithms for classification and clustering

Cited by: 1789
Authors
Liu, H [1]
Yu, L [1]
Affiliation
[1] Arizona State Univ, Dept Comp Sci & Engn, Tempe, AZ 85287 USA
Funding
National Science Foundation (USA)
Keywords
feature selection; classification; clustering; categorizing framework; unifying platform; real-world applications;
DOI
10.1109/TKDE.2005.66
Chinese Library Classification
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
This paper introduces concepts and algorithms of feature selection, surveys existing feature selection algorithms for classification and clustering, groups and compares different algorithms with a categorizing framework based on search strategies, evaluation criteria, and data mining tasks, reveals unattempted combinations, and provides guidelines in selecting feature selection algorithms. With the categorizing framework, we continue our efforts toward building an integrated system for intelligent feature selection. A unifying platform is proposed as an intermediate step. An illustrative example is presented to show how existing feature selection algorithms can be integrated into a meta algorithm that can take advantage of individual algorithms. An added advantage of doing so is to help a user employ a suitable algorithm without knowing details of each algorithm. Some real-world applications are included to demonstrate the use of feature selection in data mining. We conclude this work by identifying trends and challenges of feature selection research and development.
Pages: 491-502
Page count: 12