Outlier detection in multivariate analytical chemical data

Times cited: 129
Authors
Egan, WJ [1]
Morgan, SL [1]
Affiliations
[1] Univ S Carolina, Dept Chem & Biochem, Columbia, SC 29208 USA
DOI
10.1021/ac970763d
Chinese Library Classification (CLC) number
O65 [Analytical Chemistry]
Discipline classification codes
070302; 081704
Abstract
The unreliability of multivariate outlier detection techniques such as Mahalanobis distance and hat matrix leverage has been known in the statistical community for well over a decade. However, only within the past few years has a serious effort been made to introduce robust methods for the detection of multivariate outliers into the chemical literature. Techniques such as the minimum volume ellipsoid (MVE), multivariate trimming (MVT), and M-estimators (e.g., PROP), and others similar to them, such as the minimum covariance determinant (MCD), rely upon algorithms that are difficult to program and may require significant processing times. While MCD and MVE have been shown to be statistically sound, we found MVT unreliable due to the method's use of the Mahalanobis distance measure in its initial step. We examined the performance of MCD and MVT on selected data sets and in simulations and compared the results with two methods of our own devising. Both the proposed resampling by half-means method and the smallest half-volume method are simple to use, are conceptually clear, and provide results superior to MVT and the current best-performing technique, MCD. Either proposed method is recommended for the detection of multiple outliers in multivariate data.
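For readers unfamiliar with the two classical measures named in the abstract, the following is a minimal sketch of how they are commonly computed. It is not taken from the paper: the function names, the synthetic data, and the shifted cluster of five points are all hypothetical and chosen only for illustration. Both measures are built from the ordinary sample mean and covariance, which the outliers themselves can distort; this is the usual explanation for the unreliability the abstract refers to.

```python
import numpy as np

def mahalanobis_sq(X):
    """Squared classical Mahalanobis distance of each row from the sample mean."""
    X = np.asarray(X, dtype=float)
    centered = X - X.mean(axis=0)
    cov = np.cov(X, rowvar=False)  # sample covariance, divisor n - 1
    # Solve cov @ z = centered.T rather than forming an explicit inverse.
    solved = np.linalg.solve(cov, centered.T).T
    return np.einsum("ij,ij->i", centered, solved)

def leverage(X):
    """Diagonal of the hat matrix for a linear model with an intercept column."""
    X = np.asarray(X, dtype=float)
    A = np.column_stack([np.ones(len(X)), X])
    Q, _ = np.linalg.qr(A)  # reduced QR: hat matrix is Q @ Q.T
    return np.einsum("ij,ij->i", Q, Q)

# Hypothetical data: 50 observations, 3 variables, first 5 rows shifted.
rng = np.random.default_rng(0)
X = rng.normal(size=(50, 3))
X[:5] += 8.0

d2 = mahalanobis_sq(X)
h = leverage(X)
```

With these definitions the two quantities carry the same information: for n observations, each leverage equals 1/n plus the squared distance divided by n - 1, so a weakness in one is shared by the other.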
Pages: 2372-2379
Page count: 8