A Review and Empirical Evaluation of Feature Weighting Methods for a Class of Lazy Learning Algorithms

Cited by: 443
Authors
Dietrich Wettschereck
David W. Aha
Takao Mohri
Affiliations
[1] GMD (German National Research Center for Information Technology), Schloß Birlinghoven
[2] Navy Center for Applied Research in Artificial Intelligence, Naval Research Laboratory
[3] Hidehiko Tanaka Lab., Department of Electrical Engineering, The University of Tokyo
Keywords
lazy learning; k-nearest neighbor; feature weights; comparison
DOI
10.1023/A:1006593614256
Abstract
Many lazy learning algorithms are derivatives of the k-nearest neighbor (k-NN) classifier, which uses a distance function to generate predictions from stored instances. Several studies have shown that k-NN's performance is highly sensitive to the definition of its distance function. Many k-NN variants have been proposed to reduce this sensitivity by parameterizing the distance function with feature weights. However, these variants have been neither categorized nor empirically compared. This paper reviews a class of weight-setting methods for lazy learning algorithms. We introduce a framework for distinguishing these methods and empirically compare them. We observed four trends in our experiments and conducted further studies to highlight them. Our results suggest that methods that use performance feedback to assign weight settings demonstrate three advantages over other methods: they require less pre-processing, perform better in the presence of interacting features, and generally require less training data to learn good settings. We also found that continuous weighting methods tend to outperform feature selection algorithms on tasks where some features are useful but less important than others.
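The abstract's core idea, parameterizing a k-NN distance function with per-feature weights, can be illustrated with a minimal sketch. This is a generic weighted-Euclidean k-NN, not any of the specific weight-setting methods the paper surveys; the toy data and weight values are hypothetical.

```python
import math
from collections import Counter

def weighted_distance(x, y, weights):
    """Weighted Euclidean distance: each feature's squared difference is
    scaled by its weight before summing, as in feature-weighted k-NN."""
    return math.sqrt(sum(w * (a - b) ** 2 for w, a, b in zip(weights, x, y)))

def knn_predict(query, instances, labels, weights, k=3):
    """Classify `query` by majority vote among the k stored instances
    closest to it under the weighted distance."""
    ranked = sorted(range(len(instances)),
                    key=lambda i: weighted_distance(query, instances[i], weights))
    votes = Counter(labels[i] for i in ranked[:k])
    return votes.most_common(1)[0][0]

# Toy data: feature 0 separates the classes; feature 1 is irrelevant noise.
instances = [(0.0, 0.9), (0.1, 0.1), (0.9, 0.8), (1.0, 0.2)]
labels = ["a", "a", "b", "b"]

# Down-weighting the irrelevant feature lets the informative one dominate.
print(knn_predict((0.85, 0.95), instances, labels, weights=(1.0, 0.1), k=3))  # prints b
```

Setting a weight to 0 recovers feature selection as a special case, which is why the paper treats selection algorithms as one end of the continuous-weighting spectrum.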
Pages: 273–314
Page count: 41