Outliers in multilevel data

Cited by: 87
Authors
Langford, IH [1]
Lewis, T
Affiliations
[1] Univ E Anglia, Ctr Social & Econ Res Global Environm, Sch Environm Sci, Norwich NR4 7TJ, Norfolk, England
[2] Univ London, Inst Educ, London WC1N 1AZ, England
Keywords
cluster analysis; hierarchical data; influential data points; leverage; multilevel modelling; outlier detection; reduction in deviance; studentized residuals
DOI
10.1111/1467-985X.00094
Chinese Library Classification (CLC) number
O1 [Mathematics]; C [General Social Sciences];
Discipline classification code
03 ; 0303 ; 0701 ; 070101 ;
Abstract
This paper offers the data analyst a range of practical procedures for dealing with outliers in multilevel data. It first develops several techniques for data exploration for outliers and outlier analysis and then applies these to the detailed analysis of outliers in two large scale multilevel data sets from educational contexts. The techniques include the use of deviance reduction, measures based on residuals, leverage values, hierarchical cluster analysis and a measure called DFITS. Outlier analysis is more complex in a multilevel data set than in, say, a univariate sample or a set of regression data, where the concept of an outlying value is straightforward. In the multilevel situation one has to consider, for example, at what level or levels a particular response is outlying, and in respect of which explanatory variables; furthermore, the treatment of a particular response at one level may affect its status or the status of other units at other levels in the model.
Pages: 121-153
Number of pages: 33
Related papers
28 items in total
[1]   STATISTICAL MODELING ISSUES IN SCHOOL EFFECTIVENESS STUDIES [J].
AITKIN, M ;
LONGFORD, N ;
PLEWIS, IF ;
WAKEFIELD, WB ;
CHATFIELD, C ;
GOLDSTEIN, H ;
REYNOLDS, D ;
COX, D ;
ECOB, R ;
GRAY, J ;
BELL, JF ;
BURSTEIN, L ;
DAWID, AP ;
HEALY, MJR ;
HUTCHISON, DA ;
KILGORE, S ;
PENDLETON, WW ;
LAIRD, NM ;
LOUIS, TA ;
PRAIS, SJ ;
RUTTER, M ;
MAUGHAN, B ;
OUSTON, J ;
SHARE, DL ;
SMITH, TMF .
JOURNAL OF THE ROYAL STATISTICAL SOCIETY SERIES A-STATISTICS IN SOCIETY, 1986, 149 :1-43
[2]  
ATKINSON AC, 1986, BIOMETRIKA, V73, P533
[3]   ORDERING OF MULTIVARIATE DATA [J].
BARNETT, V .
JOURNAL OF THE ROYAL STATISTICAL SOCIETY SERIES A-STATISTICS IN SOCIETY, 1976, 139 :318-354
[4]  
Barnett V., 1984, Outliers in Statistical Data
[5]  
BECKMAN RJ, 1983, TECHNOMETRICS, V25, P119, DOI 10.2307/1268541
[6]  
Belsley D.A., 1980, Regression Diagnostics: Identifying Influential Data and Sources of Collinearity
[7]   APPROXIMATE INFERENCE IN GENERALIZED LINEAR MIXED MODELS [J].
BRESLOW, NE ;
CLAYTON, DG .
JOURNAL OF THE AMERICAN STATISTICAL ASSOCIATION, 1993, 88 (421) :9-25
[8]  
Bryk A.S., 1992, Hierarchical Models: Applications and Data Analysis Methods
[9]  
Chatterjee S., 1988, Sensitivity Analysis in Linear Regression, DOI 10.1002/9780470316764
[10]  
Chatterjee S., 1986, STAT SCI, V1, P379, DOI 10.1214/SS/1177013622