Computing location depth and regression depth in higher dimensions

被引:126
作者
Rousseeuw, PJ [1 ]
Struyf, A [1 ]
机构
[1] Univ Instelling Antwerp, Dept Math & Comp Sci, B-2610 Wilrijk, Belgium
关键词
algorithms; coordinate-free; multivariate ranking; robustness;
D O I
10.1023/A:1008945009397
中图分类号
TP301 [理论、方法];
学科分类号
081202 [计算机软件与理论];
摘要
The location depth (Tukey 1975) of a point theta relative to a p-dimensional data set Z of size n is defined as the smallest number of data points in a closed halfspace with boundary through theta. For bivariate data? it can be computed in O(n log n) time (Rousseeuw and Ruts 1996). In this paper we construct an exact algorithm to compute the location depth in three dimensions in O(n(2)log n) time. We also give an approximate algorithm to compute the location depth inp dimensions in O(mp(3) + mpn) time, where m is the number of p-subsets used. Recently, Rousseeuw and Hubert (1996) defined the depth of a regression fit. The depth of a hyperplane with coefficients (theta(1),...,theta(p)) is the smallest number of residuals that need to change sign to make (theta(1),...,theta(p)) a nonfit. For bivariate data (p = 2) this depth can be computed in O(n log n) time as well. We construct an algorithm to compute the regression depth of a plane relative to a three-dimensional data set in O(n(2)log n) time, and another that deals with p = 4 in O(n(3) log n) time. For data sets with large n and/or p we propose an approximate algorithm that computes the depth of a regression fit in O(mp(3) + mpn + mn log n) time. For all of these algorithms, actual implementations are made available.
引用
收藏
页码:193 / 203
页数:11
相关论文
共 15 条
[1]
Chambers J. M., 1991, STAT MODELS S
[2]
Daudin J. J., 1988, STAT J THEOR APPL ST, V19, P241
[3]
BREAKDOWN PROPERTIES OF LOCATION ESTIMATES BASED ON HALF-SPACE DEPTH AND PROJECTED OUTLYINGNESS [J].
DONOHO, DL ;
GASKO, M .
ANNALS OF STATISTICS, 1992, 20 (04) :1803-1827
[4]
Eddy W. F., 1985, Computer Science and Statistics. Proceedings of the Sixteenth Symposium on the Interface, P25
[5]
Notions of limiting P values based on data depth and bootstrap [J].
Liu, RY ;
Singh, K .
JOURNAL OF THE AMERICAN STATISTICAL ASSOCIATION, 1997, 92 (437) :266-277
[6]
[7]
A QUALITY INDEX BASED ON DATA DEPTH AND MULTIVARIATE RANK-TESTS [J].
LIU, RY ;
SINGH, K .
JOURNAL OF THE AMERICAN STATISTICAL ASSOCIATION, 1993, 88 (421) :252-260
[8]
HALF-PLANE TRIMMING FOR BIVARIATE DISTRIBUTIONS [J].
MASSE, JC ;
THEODORESCU, R .
JOURNAL OF MULTIVARIATE ANALYSIS, 1994, 48 (02) :188-202
[9]
Rousseeuw P.J., 1996, Journal of the Royal Statistical Society, Series C (Applied Statistics), V45, P516
[10]
ROUSSEEUW PJ, 1996, UNPUB REGRESSION DEP