Biased bootstrap methods for reducing the effects of contamination

Cited by: 24
Authors
Hall, P [1]
Presnell, B
Affiliations
[1] Australian Natl Univ, Ctr Math & Its Applicat, Canberra, ACT 0200, Australia
[2] CSIRO, Sydney, NSW 2070, Australia
[3] Univ Florida, Gainesville, FL USA
Keywords
biased bootstrap; empirical likelihood; influence; inlier; local linear smoothing; multivariate analysis; nonparametric curve estimation; outlier; regression; robust statistical methods; trimming; weighted bootstrap
DOI
10.1111/1467-9868.00199
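Chinese Library Classification (CLC)
O21 [Probability theory and mathematical statistics]; C8 [Statistics]
Subject classification codes
020208; 070103; 0714
Abstract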
Contamination of a sampled distribution, for example by a heavy-tailed distribution, can degrade the performance of a statistical estimator. We suggest a general approach to alleviating this problem, using a version of the weighted bootstrap. The idea is to 'tilt' away from the contaminated distribution by a given (but arbitrary) amount, in a direction that minimizes a measure of the new distribution's dispersion. This theoretical proposal has a simple empirical version, which results in each data value being assigned a weight according to an assessment of its influence on dispersion. Importantly, distance can be measured directly in terms of the likely level of contamination, without reference to an empirical measure of scale. This makes the procedure particularly attractive for use in multivariate problems. It has several forms, depending on the definitions taken for dispersion and for distance between distributions. Examples of dispersion measures include variance and generalizations based on high order moments. Practicable measures of the distance between distributions may be based on power divergence, which includes Hellinger and Kullback-Leibler distances. The resulting location estimator has a smooth, redescending influence curve and appears to avoid computational difficulties that are typically associated with redescending estimators. Its breakdown point can be located at any desired value ε ∈ (0, 1/2) simply by 'trimming' to a known distance (depending only on ε and the choice of distance measure) from the empirical distribution. The estimator has an affine equivariant multivariate form. Further, the general method is applicable to a range of statistical problems, including regression.
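The empirical version described in the abstract can be illustrated with a minimal sketch for univariate data. It is not the authors' algorithm: it assumes that dispersion is the weighted variance and that distance from the empirical distribution is the Kullback-Leibler form sum_i p_i log(n p_i), one member of the power divergence family. Under those assumptions the stationary weights satisfy p_i ∝ exp(-t (x_i - mu)^2), where mu is the weighted mean. The function names, the fixed-point iteration started at the median, and the bisection on t are all choices made for this illustration.

```python
import math
from statistics import median


def tilted_weights(x, t, iters=200):
    """Weights p_i proportional to exp(-t * (x_i - mu)^2), with mu the
    weighted mean, found by fixed-point iteration (sketch assumption)."""
    mu = median(x)  # robust starting point, chosen for this sketch
    p = [1.0 / len(x)] * len(x)
    for _ in range(iters):
        d2 = [(xi - mu) ** 2 for xi in x]
        shift = min(d2)  # subtract the minimum to keep exp() well scaled
        w = [math.exp(-t * (d - shift)) for d in d2]
        total = sum(w)
        p = [wi / total for wi in w]
        new_mu = sum(pi * xi for pi, xi in zip(p, x))
        converged = abs(new_mu - mu) < 1e-12
        mu = new_mu
        if converged:
            break
    return p, mu


def kl_from_uniform(p):
    """Kullback-Leibler distance of the weights p from uniform weights 1/n."""
    n = len(p)
    return sum(pi * math.log(n * pi) for pi in p if pi > 0.0)


def biased_bootstrap_weights(x, rho):
    """Tilt away from the empirical distribution until the distance is rho.

    Returns (weights, weighted mean). rho must lie strictly between 0 and
    log(n), the largest distance attainable by any weight vector.
    """
    if not 0.0 < rho < math.log(len(x)):
        raise ValueError("rho must lie in (0, log n)")
    lo, hi = 0.0, 1.0
    # Grow the upper bracket until the tilted weights are at least rho away.
    while kl_from_uniform(tilted_weights(x, hi)[0]) < rho:
        hi *= 2.0
    # Bisect on the tilting parameter t to match the requested distance.
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if kl_from_uniform(tilted_weights(x, mid)[0]) < rho:
            lo = mid
        else:
            hi = mid
        if hi - lo < 1e-12 * hi:
            break
    return tilted_weights(x, hi)


if __name__ == "__main__":
    data = [9.8, 10.1, 10.0, 9.9, 10.2, 30.0]  # one gross outlier
    weights, location = biased_bootstrap_weights(data, rho=0.1)
    print("weights:", [round(w, 4) for w in weights])
    print("weighted mean:", round(location, 3))
```

In this sketch the outlying value receives the smallest weight, and the weight decays smoothly with squared distance from the weighted mean, which is consistent with the smooth, redescending behaviour the abstract describes. The distance rho is specified directly rather than through an empirical scale estimate.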
Pages: 661-680
Number of pages: 20