A bounded influence regression estimator based on the statistics of the hat matrix

Cited by: 63
Authors
Chave, AD [1]
Thomson, DJ [2]
Affiliations
[1] Woods Hole Oceanog Inst, Dept Appl Ocean Phys & Engn, Deep Submergence Lab, Woods Hole, MA 02543 USA
[2] Queens Univ, Kingston, ON, Canada
Keywords
bounded influence estimator; hat matrix; projection matrix; robust regression; time series analysis; transfer function estimation
DOI
10.1111/1467-9876.00406
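Chinese Library Classification (CLC) number
O21 [Probability Theory and Mathematical Statistics]; C8 [Statistics]
Discipline classification code
020208; 070103; 0714
Abstract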
Many geophysical regression problems require the analysis of large (more than $10^4$ values) data sets, and, because the data may represent mixtures of concurrent natural processes with widely varying statistical properties, contamination of both response and predictor variables is common. Existing bounded influence or high breakdown point estimators frequently lack the ability to eliminate extremely influential data and/or the computational efficiency to handle large data sets. A new bounded influence estimator is proposed that combines high asymptotic efficiency for normal data, high breakdown point behaviour with contaminated data and computational simplicity for large data sets. The algorithm combines a standard M-estimator to downweight data corresponding to extreme regression residuals and removal of overly influential predictor values (leverage points) on the basis of the statistics of the hat matrix diagonal elements. For this, the exact distribution of the hat matrix diagonal elements $p_{ii}$ for complex multivariate Gaussian predictor data is shown to be $\beta(p_{ii}, m, N - m)$, where $N$ is the number of data and $m$ is the number of parameters. Real geophysical data from an auroral zone magnetotelluric study which exhibit severe outlier and leverage point contamination are used to illustrate the estimator's performance. The examples also demonstrate the utility of looking at both the residual and the hat matrix distributions through quantile-quantile plots to diagnose robust regression problems.
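The leverage-screening idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it covers only the hat matrix step (the M-estimator stage is omitted), and the function names `hat_diagonal` and `leverage_mask`, the 0.999 cutoff probability, and the synthetic data are all assumptions made for illustration. It uses the beta distribution with parameters $m$ and $N - m$ stated in the abstract for complex Gaussian predictors.

```python
import numpy as np
from scipy.stats import beta


def hat_diagonal(X):
    """Diagonal of the hat matrix H = X (X^H X)^{-1} X^H, computed via thin QR."""
    Q, _ = np.linalg.qr(X)  # reduced QR: Q is N x m with orthonormal columns
    return np.sum(np.abs(Q) ** 2, axis=1)


def leverage_mask(X, prob=0.999):
    """Flag rows whose hat diagonal exceeds a beta(m, N - m) quantile.

    `prob` is an illustrative choice, not a value taken from the paper.
    Returns (keep, p, cutoff) where keep[i] is False for flagged rows.
    """
    N, m = X.shape
    p = hat_diagonal(X)
    cutoff = beta.ppf(prob, m, N - m)
    return p <= cutoff, p, cutoff


# Synthetic complex Gaussian predictors (hypothetical data for the sketch).
rng = np.random.default_rng(0)
N, m = 2000, 2
X = (rng.standard_normal((N, m)) + 1j * rng.standard_normal((N, m))) / np.sqrt(2)

# Plant five gross leverage points with a fixed, very large amplitude.
X[:5] = 50.0 * (1.0 + 1.0j)

keep, p, cutoff = leverage_mask(X)
print("cutoff:", cutoff)
print("planted rows kept:", keep[:5])
print("fraction of clean rows kept:", keep[5:].mean())
```

Because the hat matrix is a projection of rank $m$, its diagonal elements sum to $m$ and average $m/N$, which matches the mean of the beta distribution above. A quantile-quantile plot of the sorted `p` values against `beta.ppf` at matching probabilities would be one way to carry out the diagnostic comparison the abstract mentions.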
Pages: 307-322
Page count: 16