Cluster analysis: a further approach based on density estimation

Cited by: 73
Authors
Cuevas, A
Febrero, M
Fraiman, R
Affiliations
[1] Univ Autonoma Madrid, Dept Math, E-28049 Madrid, Spain
[2] Univ Santiago de Compostela, Dept Stat, Santiago De Compostela 15771, Spain
[3] Univ San Andres, Dept Math, Buenos Aires, DF, Argentina
Keywords
cluster algorithms; density estimates; smoothed bootstrap; level set estimation
DOI
10.1016/S0167-9473(00)00052-9
Chinese Library Classification
TP39 [Computer applications]
Discipline classification codes
081203; 0835
Abstract
A cluster methodology, motivated via density estimation, is proposed. It is based on the idea of estimating the population clusters, which, following Hartigan (1975), are defined as the connected parts of the "substantial" support of the underlying density. The empirical clusters are defined by analogy in terms of the substantial support of a convolution (kernel-type) density estimator. The sample observations are grouped into data clusters according to the empirical cluster to which they belong. An algorithm to implement the method, based on resampling ideas, is proposed. It allows the number of clusters either to be chosen automatically or to be supplied as an input. Some theoretical and practical aspects are briefly discussed and a simulation study is given. The results show a good performance of our method, in terms of efficiency and robustness, when compared with two classical cluster algorithms: k-means and single linkage. Finally, a real-data example is discussed. © 2001 Elsevier Science B.V. All rights reserved.
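The central idea of the abstract, clusters as connected parts of the region where a kernel density estimate is "substantial", can be illustrated with a minimal sketch. This is not the authors' resampling-based algorithm: the function names, the Gaussian kernel, the bandwidth `h`, the density threshold `level`, and the linking distance `eps` are all assumptions chosen for illustration, and connectedness of the level set is only approximated by linking retained sample points that lie within `eps` of each other.

```python
import math


def kernel_density(points, h):
    """Return a function x -> 2-D Gaussian kernel density estimate at x."""
    norm = 1.0 / (len(points) * 2.0 * math.pi * h * h)

    def density(x):
        total = 0.0
        for p in points:
            d2 = (x[0] - p[0]) ** 2 + (x[1] - p[1]) ** 2
            total += math.exp(-d2 / (2.0 * h * h))
        return norm * total

    return density


def density_clusters(points, h, level, eps):
    """Label each point with a cluster id, or -1 if its density is below `level`.

    Points whose estimated density reaches `level` are kept (a stand-in for
    the "substantial" support); kept points closer than `eps` are linked, and
    each linked group is reported as one cluster.
    """
    density = kernel_density(points, h)
    kept = [i for i, p in enumerate(points) if density(p) >= level]
    labels = [-1] * len(points)
    next_label = 0
    for start in kept:
        if labels[start] != -1:
            continue  # already reached from an earlier starting point
        labels[start] = next_label
        stack = [start]
        while stack:  # depth-first search over the linked kept points
            i = stack.pop()
            for j in kept:
                if labels[j] != -1:
                    continue
                dx = points[i][0] - points[j][0]
                dy = points[i][1] - points[j][1]
                if math.hypot(dx, dy) <= eps:
                    labels[j] = next_label
                    stack.append(j)
        next_label += 1
    return labels


# Illustrative data (hypothetical): two 5x5 grids of points far apart,
# plus one isolated point that should fall below the density threshold.
grid_a = [(0.5 * i, 0.5 * j) for i in range(5) for j in range(5)]
grid_b = [(10.0 + 0.5 * i, 10.0 + 0.5 * j) for i in range(5) for j in range(5)]
sample = grid_a + grid_b + [(5.0, 5.0)]

labels = density_clusters(sample, h=0.5, level=0.02, eps=0.75)
n_clusters = len({label for label in labels if label != -1})
```

In this toy setting the two grids come out as separate clusters and the isolated point is left unassigned. In the method described above, the number of clusters can instead be chosen automatically or supplied as an input; the fixed `level` and `eps` here are only a simplification.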
Pages: 441-459
Page count: 19