An adjusted boxplot for skewed distributions

Cited by: 500
Authors
Hubert, M. [1]
Vandervieren, E. [2]
Affiliations
[1] Katholieke Univ Leuven, Leuven Stat Res Ctr, Dept Math, B-3001 Heverlee, Belgium
[2] Univ Antwerp, Dept Math & Comp Sci, B-2020 Antwerp, Belgium
DOI
10.1016/j.csda.2007.11.008
Chinese Library Classification number
TP39 [Computer applications]
Discipline classification codes
081203; 0835
Abstract
The boxplot is a very popular graphical tool for visualizing the distribution of continuous unimodal data. It shows information about the location, spread, skewness as well as the tails of the data. However, when the data are skewed, usually many points exceed the whiskers and are often erroneously declared as outliers. An adjustment of the boxplot is presented that includes a robust measure of skewness in the determination of the whiskers. This results in a more accurate representation of the data and of possible outliers. Consequently, this adjusted boxplot can also be used as a fast and automatic outlier detection tool without making any parametric assumption about the distribution of the bulk of the data. Several examples and simulation results show the advantages of this new procedure. (c) 2007 Elsevier B.V. All rights reserved.
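The abstract does not spell out the whisker rule, so the following is only a minimal sketch of the idea, under stated assumptions. It takes the robust skewness measure to be the medcouple (MC) and uses the exponential fence factors usually quoted for this method: for MC >= 0 the fences are Q1 - 1.5 e^{-4 MC} IQR and Q3 + 1.5 e^{3 MC} IQR, with the exponents mirrored for MC < 0. The function names `naive_medcouple` and `adjusted_fences` are hypothetical, the medcouple is computed by a naive O(n^2) pass that simply skips pairs tied at the median, and the quartile convention (`method="inclusive"`) is an arbitrary choice.

```python
import math
import statistics


def naive_medcouple(data):
    """Naive O(n^2) medcouple (hypothetical helper).

    Simplification: pairs where both points equal the median are skipped
    instead of using the special tie-handling kernel.
    """
    xs = sorted(data)
    m = statistics.median(xs)
    lower = [x for x in xs if x <= m]
    upper = [x for x in xs if x >= m]
    # Kernel h(xi, xj) = ((xj - m) - (m - xi)) / (xj - xi) for xi <= m <= xj.
    kernel = [
        ((xj - m) - (m - xi)) / (xj - xi)
        for xi in lower
        for xj in upper
        if xj != xi
    ]
    return statistics.median(kernel) if kernel else 0.0


def adjusted_fences(data, k=1.5):
    """Return (lower_fence, upper_fence, mc) for a skewness-adjusted boxplot."""
    # Quartile convention is an assumption; other conventions shift the fences.
    q1, _, q3 = statistics.quantiles(data, n=4, method="inclusive")
    iqr = q3 - q1
    mc = naive_medcouple(data)
    if mc >= 0:
        lo = q1 - k * math.exp(-4 * mc) * iqr
        hi = q3 + k * math.exp(3 * mc) * iqr
    else:
        lo = q1 - k * math.exp(-3 * mc) * iqr
        hi = q3 + k * math.exp(4 * mc) * iqr
    return lo, hi, mc


if __name__ == "__main__":
    skewed = [1, 2, 3, 4, 5, 10, 20, 40, 80]
    lo, hi, mc = adjusted_fences(skewed)
    flagged = [x for x in skewed if x < lo or x > hi]
    print(f"MC={mc:.3f}, fences=({lo:.2f}, {hi:.2f}), flagged={flagged}")
```

For symmetric data the medcouple is zero and the fences reduce to the standard 1.5 IQR rule. For right-skewed data the upper fence moves outward and the lower fence moves inward, so fewer points in the long tail are flagged.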
Pages: 5186-5201
Number of pages: 16