IMPUTATION TECHNIQUES IN REGRESSION-ANALYSIS - LOOKING CLOSELY AT THEIR IMPLEMENTATION

Cited by: 55
Authors
BELLO, AL [1]
Affiliations
[1] UNIV OXFORD, DEPT STAT, OXFORD OX1 3TG, ENGLAND
Keywords
MISSING VALUES; REGRESSION ANALYSIS; EM ALGORITHM; PRINCIPAL COMPONENT; SINGULAR VALUE DECOMPOSITION
DOI
10.1016/0167-9473(94)00024-D
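Chinese Library Classification number
TP39 [Computer Applications];
Subject classification code
081203 [Computer Application Technology]; 0835 [Software Engineering];
Abstract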
A problem that frequently arises in regression analysis is the presence of missing values in the explanatory variables. Imputation is a time-honoured approach to tackling it, since graphical exploration of the properties of a statistical model requires a complete data matrix. This article examines the performance of five imputation techniques under two frequently used implementation procedures. Specifically, imputed values based on both the response and the explanatory variables (type I) are contrasted with those based on the explanatory variables only (type II). Monte Carlo results indicate that imputed values from the type I procedure may give a spurious impression of high precision, especially as the proportion of missing data increases. With type II, however, the residual mean square error may be overestimated. Several matrices of correlation coefficients are used, and an illustrative real-data example is given.
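The contrast between the two implementation procedures can be illustrated with a minimal sketch. It is not the article's method: the abstract does not spell out the five techniques, so plain regression imputation is swapped in as a stand-in, and the simulated data, the 30% missingness rate, and every function and variable name below are assumptions made for illustration only.

```python
import numpy as np

def ols_fit(X, y):
    """Least-squares coefficients for y ~ [1, X]."""
    A = np.column_stack([np.ones(len(X)), X])
    beta, *_ = np.linalg.lstsq(A, y, rcond=None)
    return beta

def ols_predict(beta, X):
    """Fitted values for the design [1, X]."""
    A = np.column_stack([np.ones(len(X)), X])
    return A @ beta

def regression_impute(x1, x2, y, use_response):
    """Fill NaNs in x2 by regression on x1 (type II) or on x1 and y (type I)."""
    miss = np.isnan(x2)
    predictors = np.column_stack([x1, y]) if use_response else x1.reshape(-1, 1)
    beta = ols_fit(predictors[~miss], x2[~miss])  # fit on complete cases only
    filled = x2.copy()
    filled[miss] = ols_predict(beta, predictors[miss])
    return filled

def residual_mean_square(x1, x2, y):
    """Residual mean square of y ~ 1 + x1 + x2 on the completed data."""
    X = np.column_stack([x1, x2])
    resid = y - ols_predict(ols_fit(X, y), X)
    return resid @ resid / (len(y) - 3)

# Simulated data (assumed setup): two correlated predictors, error variance 1.
rng = np.random.default_rng(0)
n = 500
x1 = rng.normal(size=n)
x2_true = 0.5 * x1 + np.sqrt(0.75) * rng.normal(size=n)
y = 1.0 + x1 + x2_true + rng.normal(size=n)

# Delete roughly 30% of x2 completely at random.
x2_obs = x2_true.copy()
x2_obs[rng.random(n) < 0.3] = np.nan

x2_type1 = regression_impute(x1, x2_obs, y, use_response=True)   # type I
x2_type2 = regression_impute(x1, x2_obs, y, use_response=False)  # type II

rms_type1 = residual_mean_square(x1, x2_type1, y)
rms_type2 = residual_mean_square(x1, x2_type2, y)
print(f"type I  residual mean square: {rms_type1:.3f}")
print(f"type II residual mean square: {rms_type2:.3f}")
```

Under this setup the type I fit tends to report a smaller residual mean square than the type II fit, because the type I imputed values are themselves linear functions of the response. This is in line with the direction of the effects the abstract describes, though a single simulated data set is no substitute for the article's Monte Carlo study.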
Pages: 45-57
Number of pages: 13