Outlier detection for skewed data

Cited by: 193
Authors
Hubert, Mia [1]
Van der Veeken, Stephan [1]
Affiliations
[1] Katholieke Univ Leuven, Dept Math, LSTAT, B-3001 Louvain, Belgium
Keywords
outlier detection; boxplot; bagplot; skewness; outlyingness
DOI
10.1002/cem.1123
Chinese Library Classification number
TP [Automation Technology, Computer Technology]
Discipline classification code
0812
Abstract
Most outlier detection rules for multivariate data are based on the assumption of elliptical symmetry of the underlying distribution. We propose an outlier detection method that does not need the assumption of symmetry and does not rely on visual inspection. Our method is a generalization of the Stahel-Donoho outlyingness. The latter approach assigns to each observation a measure of outlyingness, which is obtained by projection pursuit techniques that only use univariate robust measures of location and scale. To allow skewness in the data, we adjust this measure of outlyingness by using a robust measure of skewness as well. The observations corresponding to an outlying value of the adjusted outlyingness (AO) are then considered as outliers. For bivariate data, our approach leads to two graphical representations. The first one is a contour plot of the AO values. We also construct an extension of the boxplot for bivariate data, in the spirit of the bagplot [1], which is based on the concept of halfspace depth. We illustrate our outlier detection method on several simulated and real data sets. Copyright (c) 2008 John Wiley & Sons, Ltd.
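As a rough illustration of the starting point named in the abstract, the sketch below approximates the classical Stahel-Donoho outlyingness by projecting the data on random directions and scoring each point with the median and the MAD of each projection. It is not the authors' adjusted outlyingness (AO): the skewness adjustment is not implemented here, and the function name, the number of directions, and the random-direction search are assumptions made only for this example.

```python
import numpy as np


def stahel_donoho_outlyingness(X, n_directions=500, seed=0):
    """Approximate Stahel-Donoho outlyingness with random projections.

    Hypothetical helper for illustration only. For each direction a, every
    point is scored as |a'x - median(a'X)| / MAD(a'X); the outlyingness of
    a point is its largest score over all sampled directions.
    """
    X = np.asarray(X, dtype=float)
    rng = np.random.default_rng(seed)

    # Sample random unit-length directions (an assumed search strategy).
    A = rng.standard_normal((n_directions, X.shape[1]))
    A /= np.linalg.norm(A, axis=1, keepdims=True)

    # Project the data: one column of univariate projections per direction.
    P = X @ A.T

    # Univariate robust location (median) and scale (unscaled MAD).
    med = np.median(P, axis=0)
    mad = np.median(np.abs(P - med), axis=0)

    # Skip degenerate directions in which the MAD is zero.
    keep = mad > 0
    if not np.any(keep):
        raise ValueError("MAD is zero in every sampled direction")

    scores = np.abs(P[:, keep] - med[keep]) / mad[keep]
    return scores.max(axis=1)


if __name__ == "__main__":
    rng = np.random.default_rng(1)
    # 50 roughly symmetric points plus one point placed far away.
    X = np.vstack([rng.standard_normal((50, 2)), [[10.0, 10.0]]])
    scores = stahel_donoho_outlyingness(X)
    print("most outlying index:", int(np.argmax(scores)))
```

Because this score uses a single symmetric scale per direction, it treats both tails of a projection alike. The adjustment described in the abstract additionally uses a robust measure of skewness, so that points in the long tail of a skewed projection are not flagged too readily.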
Pages: 235-246
Number of pages: 12