Fast and robust discriminant analysis

Cited by: 127
Authors
Hubert, M
Van Driessen, K
Affiliations
[1] Katholieke Univ Leuven, Dept Math, B-3001 Heverlee, Belgium
[2] Univ Antwerp, RUCA, Fac Appl Econ, UFSIA, B-2020 Antwerp, Belgium
Keywords
classification; discriminant analysis; MCD estimator; robust statistics
DOI
10.1016/S0167-9473(02)00299-2
Chinese Library Classification number
TP39 [Computer applications]
Discipline classification codes
081203; 0835
Abstract
The goal of discriminant analysis is to obtain rules that describe the separation between groups of observations. These rules also make it possible to classify new observations into one of the known groups. In the classical approach, discriminant rules are often based on the empirical mean and covariance matrix of the data, or of parts of the data. Because these estimates are highly influenced by outlying observations, they become inappropriate for contaminated data sets. Robust discriminant rules are obtained by inserting robust estimates of location and scatter into generalized maximum likelihood rules at normal distributions. This approach makes it possible to discriminate between several populations, with equal or unequal covariance structure, and with equal or unequal membership probabilities. In particular, the highly robust minimum covariance determinant (MCD) estimator is used, as it can be computed very quickly for large data sets. The probability of misclassification is also estimated in a robust way. The performance of the new method is investigated through several simulations and by applying it to some real data sets. © 2003 Elsevier B.V. All rights reserved.
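The plug-in idea in the abstract can be illustrated with a small sketch. This is not the authors' implementation: it is restricted to one variable, where the raw MCD subset is contiguous in the sorted sample and can be found by an exact scan, and it omits the consistency factor and reweighting step that practical MCD implementations usually apply. The function names, the subset size h = floor((n + 2) / 2), and the toy data are assumptions made for illustration only.

```python
import math


def univariate_mcd(values, h=None):
    """Raw univariate MCD: mean and variance of the h-point subset
    with the smallest variance.

    In one dimension the optimal subset is contiguous in the sorted
    sample, so scanning every window of length h is exact.
    """
    xs = sorted(values)
    n = len(xs)
    if h is None:
        h = (n + 2) // 2  # floor((n + p + 1) / 2) with p = 1 (assumed default)
    best = None
    for start in range(n - h + 1):
        window = xs[start:start + h]
        center = sum(window) / h
        var = sum((v - center) ** 2 for v in window) / h
        if best is None or var < best[1]:
            best = (center, var)
    return best


def discriminant_score(x, center, var, prior):
    """Normal-theory quadratic discriminant score with plug-in estimates.

    Assumes var > 0, i.e. the selected subset is not constant.
    """
    return -0.5 * math.log(var) - 0.5 * (x - center) ** 2 / var + math.log(prior)


def classify(x, groups, priors=None):
    """Assign x to the group with the largest plug-in discriminant score.

    If no priors are given, membership probabilities are estimated
    by the relative group sizes.
    """
    total = sum(len(v) for v in groups.values())
    scores = {}
    for label, values in groups.items():
        center, var = univariate_mcd(values)
        prior = priors[label] if priors else len(values) / total
        scores[label] = discriminant_score(x, center, var, prior)
    return max(scores, key=scores.get)


# Toy data, made up for illustration: group "a" contains one gross outlier.
groups = {
    "a": [0.1, -0.2, 0.3, 0.0, -0.1, 0.2, 50.0],
    "b": [4.9, 5.2, 5.0, 4.8, 5.1, 5.3, 4.7],
}
```

In this toy sample the ordinary mean of group "a" is pulled to about 7.19 by the single outlier, whereas the MCD location stays close to the bulk of the group near 0, so a call such as classify(0.5, groups) is driven by the clean observations rather than by the outlier.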
Pages: 301-320
Number of pages: 20