From outliers to prototypes: Ordering data

Cited by: 64
Authors
Harmeling, Stefan
Dornhege, Guido
Tax, David
Meinecke, Frank
Mueller, Klaus-Robert
Affiliations
[1] Fraunhofer FIRST IDA, D-12489 Berlin, Germany
[2] Univ Potsdam, Dept Comp Sci, D-14482 Potsdam, Germany
[3] Delft Univ Technol, Informat & Commun Theory Grp, NL-2600 GA Delft, Netherlands
Keywords
outlier detection; novelty detection; ordering; noisy dimensionality reduction; clustering; nearest neighbors
DOI
10.1016/j.neucom.2005.05.015
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
We propose simple and fast methods based on nearest neighbors that order objects from high-dimensional data sets from typical points to untypical points. On the one hand, we show that these easy-to-compute orderings allow us to detect outliers (i.e. very untypical points) with a performance comparable to or better than other often much more sophisticated methods. On the other hand, we show how to use these orderings to detect prototypes (very typical points) which facilitate exploratory data analysis algorithms such as noisy nonlinear dimensionality reduction and clustering. Comprehensive experiments demonstrate the validity of our approach. (c) 2005 Elsevier B.V. All rights reserved.
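The abstract does not spell out the scoring functions, so the following is only a minimal sketch of the general idea, not the authors' method. It assumes one common nearest-neighbor score, the mean Euclidean distance from each point to its k nearest neighbors, and sorts points by it: small scores are treated as typical (prototype candidates), large scores as untypical (outlier candidates). The function name `knn_typicality_order` and the parameter `k` are hypothetical.

```python
import numpy as np


def knn_typicality_order(X, k=5):
    """Order points from most typical to least typical.

    Assumption (not taken from the paper): the score of a point is the
    mean Euclidean distance to its k nearest neighbors.
    Returns (order, scores); order[0] is the most typical point.
    """
    X = np.asarray(X, dtype=float)
    n = X.shape[0]
    if not 1 <= k < n:
        raise ValueError("k must satisfy 1 <= k < number of points")

    # Dense pairwise Euclidean distances, O(n^2) memory; fine for a sketch.
    diff = X[:, None, :] - X[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=-1))

    # A point must not count as its own neighbor.
    np.fill_diagonal(dist, np.inf)

    # Mean distance to the k closest other points.
    nearest = np.sort(dist, axis=1)[:, :k]
    scores = nearest.mean(axis=1)

    # Ascending score: typical points first, outlier candidates last.
    order = np.argsort(scores, kind="stable")
    return order, scores


if __name__ == "__main__":
    # Four corners of a unit square, its center, and one distant point.
    X = [[0, 0], [0, 1], [1, 0], [1, 1], [0.5, 0.5], [10, 10]]
    order, scores = knn_typicality_order(X, k=2)
    print("most typical index:", order[0])
    print("least typical index:", order[-1])
```

In this toy data set the center of the square ends up first in the ordering and the distant point last. For large data sets the dense distance matrix would need to be replaced by a neighbor-search structure.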
Pages: 1608-1618
Page count: 11