Identification of outliers in multivariate data

Cited by: 235
Authors
Rocke, DM
Woodruff, DL
Affiliations
Keywords
heuristic search; M estimation; minimum covariance determinant; S estimation
DOI
10.2307/2291724
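Chinese Library Classification (CLC) number
O21 [Probability theory and mathematical statistics]; C8 [Statistics]
Subject classification codes
020208; 070103; 0714
Abstract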
New insights are given into why the problem of detecting multivariate outliers can be difficult and why the difficulty increases with the dimension of the data. Significant improvements in methods for detecting outliers are described, and extensive simulation experiments demonstrate that a hybrid method extends the practical boundaries of outlier detection capabilities. Based on simulation results and examples from the literature, the question of what levels of contamination can be detected by this algorithm as a function of dimension, computation time, sample size, contamination fraction, and distance of the contamination from the main body of data is investigated. Software to implement the methods is available from the authors and STATLIB.
Pages: 1047-1061
Page count: 15