Robust clusterwise linear regression through trimming

Cited by: 57
Authors
Garcia-Escudero, L. A. [1]
Gordaliza, A. [1]
Mayo-Iscar, A. [1]
San Martin, R. [1]
Affiliation
[1] Univ Valladolid, Fac Ciencias, Dept Estadist & Invest Operativa, E-47005 Valladolid, Spain
Keywords
ALGORITHM; ESTIMATOR; MIXTURES; OUTLIERS;
DOI
10.1016/j.csda.2009.07.002
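Chinese Library Classification number
TP39 [Computer applications];
Discipline classification code
080201 [Mechanical manufacturing and automation];
Abstract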
The presence of clusters in a data set is sometimes due to relations among the measured variables that vary depending on some hidden factors. In these cases, observations can be grouped in a natural way around linear and nonlinear structures and, thus, the problem of robust clustering around affine subspaces has recently been tackled through the minimization of a trimmed sum of orthogonal residuals. This "orthogonal approach" implies that no variable is privileged to play the role of response variable or output. However, there are problems where one variable clearly needs to be explained in terms of the others, and the use of vertical residuals from classical linear regression seems more advisable. The so-called TCLUST methodology is extended to perform robust clusterwise linear regression, and a feasible algorithm for its practical implementation is proposed. The algorithm includes a "second trimming" step aimed at diminishing the effect of leverage points. (C) 2009 Elsevier B.V. All rights reserved.
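The idea described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' algorithm: it only alternates between assigning each observation to the regression line with the smallest squared vertical residual, trimming the proportion `alpha` of observations with the largest residuals, and refitting each group by least squares. It omits the mixture weights and scatter constraints that the TCLUST methodology involves. The function names, the parameters `alpha`, `alpha2`, `n_starts`, `n_iter`, and in particular the `second_trim` rule (discarding the points farthest from the group's coordinate-wise median in the explanatory variables) are assumptions made for illustration only.

```python
import numpy as np

def fit_line(X, y):
    """Ordinary least squares with intercept; returns [intercept, slopes...]."""
    A = np.column_stack([np.ones(len(X)), X])
    beta, *_ = np.linalg.lstsq(A, y, rcond=None)
    return beta

def second_trim(Xg, alpha2):
    """Hypothetical leverage screen (an assumption, not the paper's rule):
    keep the rows of Xg closest to the coordinate-wise median."""
    dist = np.linalg.norm(Xg - np.median(Xg, axis=0), axis=1)
    n_keep = len(Xg) - int(np.floor(alpha2 * len(Xg)))
    return np.argsort(dist)[:n_keep]

def assign(A, y, betas, n_keep):
    """Assign each point to its best-fitting line, then trim the worst fits.
    Trimmed points receive the label -1."""
    res2 = (y[:, None] - A @ betas.T) ** 2          # squared vertical residuals, shape (n, k)
    nearest = res2.argmin(axis=1)
    best_res2 = res2[np.arange(len(y)), nearest]
    kept = np.argsort(best_res2)[:n_keep]
    labels = np.full(len(y), -1)
    labels[kept] = nearest[kept]
    return labels, best_res2[kept].sum()

def trimmed_clusterwise_regression(X, y, k=2, alpha=0.1, alpha2=0.05,
                                   n_starts=30, n_iter=20, seed=0):
    rng = np.random.default_rng(seed)
    n, p = X.shape
    A = np.column_stack([np.ones(n), X])
    n_keep = n - int(np.floor(alpha * n))
    best_obj, best_betas, best_labels = np.inf, None, None
    for _start in range(n_starts):
        # Random start: each group's line is fitted on p + 1 random points.
        betas = np.array([
            fit_line(X[s], y[s])
            for s in (rng.choice(n, p + 1, replace=False) for _g in range(k))
        ])
        for _it in range(n_iter):
            labels, _obj = assign(A, y, betas, n_keep)
            for g in range(k):
                idx = np.flatnonzero(labels == g)
                if len(idx) > 0:
                    idx = idx[second_trim(X[idx], alpha2)]
                if len(idx) >= p + 1:               # refit only if enough points remain
                    betas[g] = fit_line(X[idx], y[idx])
        labels, obj = assign(A, y, betas, n_keep)
        if obj < best_obj:
            best_obj, best_betas, best_labels = obj, betas.copy(), labels
    return best_betas, best_labels, best_obj
```

Calling `trimmed_clusterwise_regression(X, y, k=2, alpha=0.1)` on an `(n, p)` array `X` and a response vector `y` returns the fitted coefficients per group, a label vector in which `-1` marks trimmed observations, and the trimmed sum of squared residuals. Multiple random starts are used because alternating schemes of this kind can stop at local optima.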
Pages: 3057-3069
Number of pages: 13
Related papers
27 in total
[1]
[Anonymous], 1990, Classical and modern regression with applications
[2]
ASYMPTOTIC-BEHAVIOR OF CLASSIFICATION MAXIMUM LIKELIHOOD ESTIMATES [J].
BRYANT, P ;
WILLIAMSON, JA .
BIOMETRIKA, 1978, 65 (02) :273-281
[3]
A CLASSIFICATION EM ALGORITHM FOR CLUSTERING AND 2 STOCHASTIC VERSIONS [J].
CELEUX, G ;
GOVAERT, G .
COMPUTATIONAL STATISTICS & DATA ANALYSIS, 1992, 14 (03) :315-332
[4]
Multiple model regression estimation [J].
Cherkassky, V ;
Ma, YQ .
IEEE TRANSACTIONS ON NEURAL NETWORKS, 2005, 16 (04) :785-798
[5]
Cuesta-Albertos JA, 1997, ANN STAT, V25, P553
[6]
[7]
A robust method for cluster analysis [J].
Gallegos, MT ;
Ritter, G .
ANNALS OF STATISTICS, 2005, 33 (01) :347-380
[8]
Gallegos MT, 2002, CLASSIFICATION CLUST, P247
[9]
Robust linear clustering [J].
Garcia-Escudero, L. A. ;
Gordaliza, A. ;
San Martin, R. ;
Van Aelst, S. ;
Zamar, R. .
JOURNAL OF THE ROYAL STATISTICAL SOCIETY SERIES B-STATISTICAL METHODOLOGY, 2009, 71 :301-318
[10]
A general trimming approach to robust cluster analysis [J].
Garcia-Escudero, Luis A. ;
Gordaliza, Alfonso ;
Matran, Carlos ;
Mayo-Iscar, Agustin .
ANNALS OF STATISTICS, 2008, 36 (03) :1324-1345