Robustness properties of k means and trimmed k means

Cited by: 126
Authors
García-Escudero, LA [1]
Gordaliza, A [1]
Affiliations
[1] Univ Valladolid, Fac Ciencias, Dept Estadistica & Invest Operat, E-47002 Valladolid, Spain
Keywords
breakdown point; cluster analysis; influence function; qualitative robustness
DOI
10.2307/2670010
Chinese Library Classification
O21 [Probability theory and mathematical statistics]; C8 [Statistics]
Subject classification codes
020208; 070103; 0714
Abstract
The generalized k means method is based on the minimization of the discrepancy between a random variable (or a sample of this random variable) and a set with k points, measured through a penalty function Phi. As in the M estimators setting (k = 1), a penalty function, Phi, with unbounded derivative, Psi, naturally leads to nonrobust generalized k means. However, surprisingly, the lack of robustness extends also to the case of bounded Psi; that is, generalized k means do not inherit the robustness properties of the M estimator from which they came. Attempting to robustify the generalized k means method, the generalized trimmed k means method arises from combining the k means idea with a so-called impartial trimming procedure. In this article we study generalized k means and generalized trimmed k means performance from the viewpoint of Hampel's robustness criteria; that is, we investigate the influence function, breakdown point, and qualitative robustness, confirming the superiority provided by the trimming. We include the study of two real datasets to make clear the robustness of generalized trimmed k means.
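The abstract describes trimmed k means only in words, so the following is a minimal illustrative sketch rather than the authors' procedure. It assumes the squared-error penalty Phi(d) = d^2, a trimming proportion `alpha`, and a Lloyd-type alternating heuristic with random restarts; none of these choices is stated in this record, and the names `trimmed_kmeans` and `_assign` are hypothetical.

```python
import numpy as np


def _assign(X, centers, n_keep):
    """Nearest-center labels, squared distances, and indices of the kept points."""
    # Squared Euclidean distance from every point to every center.
    d2 = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
    labels = d2.argmin(axis=1)
    nearest = d2[np.arange(X.shape[0]), labels]
    # Impartial trimming: keep the n_keep points closest to their nearest center.
    keep = np.argsort(nearest)[:n_keep]
    return labels, nearest, keep


def trimmed_kmeans(X, k, alpha=0.1, n_starts=10, n_iter=100, seed=0):
    """Heuristic trimmed k means with Phi(d) = d**2 (an assumed penalty).

    Returns (centers, labels); trimmed points get the label -1.
    """
    X = np.asarray(X, dtype=float)
    n = X.shape[0]
    n_keep = n - int(np.floor(alpha * n))  # number of untrimmed points
    rng = np.random.default_rng(seed)
    best = None
    for _ in range(n_starts):
        # Start from k distinct observations chosen at random.
        centers = X[rng.choice(n, size=k, replace=False)].copy()
        for _ in range(n_iter):
            labels, nearest, keep = _assign(X, centers, n_keep)
            new_centers = centers.copy()
            for j in range(k):
                members = keep[labels[keep] == j]
                if members.size > 0:  # an empty cluster keeps its old center
                    new_centers[j] = X[members].mean(axis=0)
            if np.allclose(new_centers, centers):
                break
            centers = new_centers
        labels, nearest, keep = _assign(X, centers, n_keep)
        objective = nearest[keep].mean()  # trimmed within-cluster dispersion
        if best is None or objective < best[0]:
            out_labels = np.full(n, -1)
            out_labels[keep] = labels[keep]
            best = (objective, centers, out_labels)
    return best[1], best[2]
```

Calling `trimmed_kmeans(X, k=2, alpha=0.05)` on an `(n, d)` array returns the fitted centers and a label vector in which `-1` marks the trimmed observations. The restarts matter because a start that picks an outlier as a center can stall in a poor local solution; keeping the run with the smallest trimmed objective is one simple way to guard against that.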
Pages: 956-969
Number of pages: 14