Wrappers for feature subset selection

Cited by: 5538

Authors:

Kohavi, R

John, GH

Affiliations:

[1] Silicon Graph Inc, Data Min & Visualizat, Mountain View, CA 94043 USA

[2] Epiphany Mkt Software, Mountain View, CA 94043 USA

Source:

ARTIFICIAL INTELLIGENCE | 1997 / Vol. 97 / Issue 1-2

Keywords:

classification; feature selection; wrapper; filter;

DOI:

10.1016/S0004-3702(97)00043-X

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory];

Discipline classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

In the feature subset selection problem, a learning algorithm is faced with the problem of selecting a relevant subset of features upon which to focus its attention, while ignoring the rest. To achieve the best possible performance with a particular learning algorithm on a particular training set, a feature subset selection method should consider how the algorithm and the training set interact. We explore the relation between optimal feature subset selection and relevance. Our wrapper method searches for an optimal feature subset tailored to a particular algorithm and a domain. We study the strengths and weaknesses of the wrapper approach and show a series of improved designs. We compare the wrapper approach to induction without feature subset selection and to Relief, a filter approach to feature subset selection. Significant improvement in accuracy is achieved for some datasets for the two families of induction algorithms used: decision trees and Naive-Bayes. (C) 1997 Elsevier Science B.V.
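The wrapper idea described in the abstract, scoring candidate feature subsets by running the learning algorithm itself, can be illustrated with a minimal sketch. Everything below is an assumption made for illustration and is not taken from the paper: the search is plain greedy forward selection, the learner is a 1-nearest-neighbour classifier rather than the decision trees or Naive-Bayes named in the abstract, the estimate is leave-one-out accuracy, and the names `loo_accuracy` and `forward_wrapper` and the toy data are hypothetical.

```python
def loo_accuracy(X, y, features):
    """Leave-one-out accuracy of a 1-NN classifier that sees only `features`."""
    if not features:
        # Crude baseline for the empty subset: frequency of the majority class.
        majority = max(set(y), key=y.count)
        return sum(1 for label in y if label == majority) / len(y)
    correct = 0
    for i in range(len(X)):
        best_dist, best_label = None, None
        for j in range(len(X)):
            if i == j:
                continue  # hold out sample i
            dist = sum((X[i][f] - X[j][f]) ** 2 for f in features)
            if best_dist is None or dist < best_dist:
                best_dist, best_label = dist, y[j]
        correct += best_label == y[i]
    return correct / len(X)


def forward_wrapper(X, y, evaluate=loo_accuracy):
    """Greedy forward selection: add the feature that most improves the
    learner's estimated accuracy, and stop when no feature improves it."""
    n_features = len(X[0])
    selected = []
    best_score = evaluate(X, y, selected)
    while True:
        candidates = [f for f in range(n_features) if f not in selected]
        scored = [(evaluate(X, y, selected + [f]), f) for f in candidates]
        if not scored:
            break
        score, feature = max(scored)
        if score <= best_score:
            break  # no strict improvement: keep the current subset
        selected.append(feature)
        best_score = score
    return selected, best_score


# Hypothetical toy data: feature 0 separates the classes,
# features 1 and 2 carry no class information.
y = [i % 2 for i in range(20)]
X = [[y[i] * 10 + (i // 2) * 0.1, float(i // 2), 0.0] for i in range(20)]

selected, score = forward_wrapper(X, y)
print(selected, score)  # prints: [0] 1.0
```

Because the subset is scored with the same learner that will later be used, a different `evaluate` function (that is, a different learning algorithm) may lead to a different selected subset on the same data.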

Pages: 273-324

Page count: 52
