Nearest neighbor classification is one of the simplest and most popular methods for statistical pattern recognition. A major issue in k-nearest neighbor classification is how to choose an optimal value of the neighborhood parameter k. In practice, this value is usually selected by cross-validation. However, the ideal value of k in a classification problem depends not only on the entire data set but also on the specific observation to be classified. Instead of using a single value of k, this paper studies the results of a finite sequence of classifiers indexed by k. Along with the usual posterior probability estimates, a new measure, called the Bayesian measure of strength, is proposed and investigated as a quantification of the evidence in favor of the different classes. The results of these classifiers and their corresponding estimated misclassification probabilities are visually displayed using shaded strips. These plots provide an effective visualization of the evidence favoring each class when a given data point is to be classified. We also propose a simple weighted averaging technique that aggregates the results of different nearest neighbor classifiers to arrive at the final decision. Based on the analysis of several benchmark data sets, the proposed method is found to perform better than any single value of k.
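The following is a minimal sketch of the aggregation idea described above, not the paper's exact weighting rule: k-NN classifiers are fit for a range of k, each classifier's class-posterior estimates are weighted by its cross-validated accuracy (a stand-in for the estimated misclassification probability), and the weighted average is used for the final decision. It assumes scikit-learn and the Iris data purely for illustration.

```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn.model_selection import cross_val_score, train_test_split
from sklearn.neighbors import KNeighborsClassifier

X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

k_values = range(1, 21)
weights, probas = [], []
for k in k_values:
    clf = KNeighborsClassifier(n_neighbors=k)
    # Cross-validated accuracy on the training set: lower estimated
    # misclassification probability gives this value of k a larger weight.
    acc = cross_val_score(clf, X_train, y_train, cv=5).mean()
    clf.fit(X_train, y_train)
    weights.append(acc)
    probas.append(clf.predict_proba(X_test))  # per-k posterior estimates

weights = np.asarray(weights)
weights /= weights.sum()
# Weighted average of the per-k posterior estimates, then argmax per point.
avg_proba = np.tensordot(weights, np.stack(probas), axes=1)
y_pred = avg_proba.argmax(axis=1)
print("aggregated accuracy:", (y_pred == y_test).mean())
```

The per-k posterior estimates gathered in the loop are also the quantities one would display, for a single query point, as the shaded strips described in the abstract.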