The importance of the scales in heterogeneous robust clustering

Times cited: 11
Authors
Garcia-Escudero, L. A. [1]
Gordaliza, A. [1]
Affiliations
[1] Univ Valladolid, Dept Estadist & IO, E-47005 Valladolid, Spain
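Keywords
robustness; cluster analysis; minimum covariance determinant; trimming; scale parameters
DOI
10.1016/j.csda.2006.06.014
Chinese Library Classification number
TP39 [Computer applications]
Discipline classification code
081203 [Computer application technology]; 0835 [Software engineering]
Abstract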
The estimation of the scales plays an important role in the derivation of robust clustering techniques. This holds especially for some recently proposed methods that include so-called "concentration" steps in their implementation. A new robust clustering approach, the SSC method, is introduced to deal with scales, sizes, and contamination. The method starts from a trimming level high enough to remove all the outlying observations. An iterative process is then carried out in which special attention is paid to the proper estimation of the groups' scales. The estimation of the contamination level is also considered. © 2006 Elsevier B.V. All rights reserved.
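The abstract refers to trimming and "concentration" steps without giving details. As a rough illustration only, the sketch below implements a generic trimmed k-means loop: assign each point to its nearest center, discard the fraction alpha of points farthest from their center, and recompute the centers from the retained points. It is not the SSC method. The function name, its parameters, and the use of plain Euclidean distance are assumptions made for this example, and the sketch does not estimate group scales, which the abstract identifies as important.

```python
import numpy as np

def trimmed_kmeans(X, k, alpha, init_centers=None, n_iter=100, seed=0):
    """Generic trimmed k-means (illustrative; NOT the SSC method).

    All names and defaults here are hypothetical choices for this sketch.
    Returns (centers, labels), where label -1 marks a trimmed point.
    """
    X = np.asarray(X, dtype=float)
    n = X.shape[0]
    n_keep = int(np.floor(n * (1.0 - alpha)))  # points that survive trimming

    if init_centers is None:
        rng = np.random.default_rng(seed)
        centers = X[rng.choice(n, size=k, replace=False)].copy()
    else:
        centers = np.asarray(init_centers, dtype=float).copy()

    def assign(current):
        # Squared Euclidean distance from every point to every center.
        d2 = ((X[:, None, :] - current[None, :, :]) ** 2).sum(axis=2)
        labels = d2.argmin(axis=1)
        # Keep the n_keep points closest to their nearest center.
        keep = np.argsort(d2.min(axis=1), kind="stable")[:n_keep]
        return labels, keep

    for _ in range(n_iter):
        # Concentration step: trim, then refit centers on retained points.
        labels, keep = assign(centers)
        new_centers = centers.copy()
        for j in range(k):
            members = keep[labels[keep] == j]
            if members.size:  # leave an empty cluster's center unchanged
                new_centers[j] = X[members].mean(axis=0)
        converged = np.allclose(new_centers, centers)
        centers = new_centers
        if converged:
            break

    labels, keep = assign(centers)
    out = np.full(n, -1)
    out[keep] = labels[keep]
    return centers, out
```

Because every cluster is treated as spherical with a common scale, a loop like this can misassign or over-trim points when group scales differ, which is the kind of situation the abstract is concerned with.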
Pages: 4403-4412
Page count: 10