CHOOSING AMONG IMPUTATION TECHNIQUES FOR INCOMPLETE MULTIVARIATE DATA - A SIMULATION STUDY

Cited by: 12
Author
BELLO, AL [1]
Affiliation
[1] UNIV OXFORD,DEPT STAT,OXFORD OX1 3TG,ENGLAND
Keywords
IMPUTATION TECHNIQUES; IMPUTED DATA MATRIX; EM ALGORITHM; PRINCIPAL COMPONENT ANALYSIS; SINGULAR VALUE DECOMPOSITION
DOI
10.1080/03610929308831061
Chinese Library Classification
O21 [Probability Theory and Mathematical Statistics]; C8 [Statistics]
Subject classification codes
020208; 070103; 0714
Abstract
A wide variety of strategies for coping with the problem of missing values, which frequently arises in multivariate data, have been proposed and tried over the years. One popular and important strategy is to estimate the missing values themselves in some way, usually by imputation techniques. By means of Monte Carlo simulations, this paper investigates the relative performance of five deterministic imputation techniques on normal and non-normal data, under several factors that may affect their efficiency. The imputation techniques are: the mean substitution method (MSM), the EM algorithm (EM), Dear's principal component method (DPC), the general iterative principal component method (GIP), and the singular value decomposition method (SVD). GIP is a refined, iterative version of DPC, developed to overcome certain problems with the latter. The results indicate that no single imputation technique is best across all combinations of the factors studied. However, MSM and DPC behave erratically: when the intercorrelation among the variables is moderate or high, they perform worse than the iterative imputation techniques (EM, SVD, and GIP), which are equally efficient under this condition. An illustrative real data example is given.
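The abstract names the techniques without giving algorithmic detail, so the following is only a minimal sketch of two of the ideas it contrasts: mean substitution and a generic iterative low-rank SVD imputation. It should not be read as the paper's exact MSM or SVD procedures. The function names, the `rank`, `max_iter`, and `tol` parameters, the column centering, and the use of `NaN` to mark missing entries are all assumptions made for illustration.

```python
import numpy as np


def mean_substitution(X):
    """Fill each missing entry (NaN) with the mean of the observed values in its column."""
    X = np.asarray(X, dtype=float)
    filled = X.copy()
    missing = np.isnan(X)
    col_means = np.nanmean(X, axis=0)  # per-column mean, ignoring NaN
    # Column index of every missing cell, in the same row-major order as filled[missing].
    filled[missing] = np.take(col_means, np.nonzero(missing)[1])
    return filled


def iterative_svd_impute(X, rank=1, max_iter=200, tol=1e-8):
    """Generic iterative imputation (illustrative, not the paper's algorithm):
    start from mean substitution, then repeatedly replace the missing cells
    with their values in a rank-`rank` SVD reconstruction of the centered matrix."""
    X = np.asarray(X, dtype=float)
    missing = np.isnan(X)
    filled = mean_substitution(X)
    for _ in range(max_iter):
        mu = filled.mean(axis=0)
        U, s, Vt = np.linalg.svd(filled - mu, full_matrices=False)
        approx = (U[:, :rank] * s[:rank]) @ Vt[:rank] + mu
        # Largest change among imputed cells; zero if nothing is missing.
        change = np.max(np.abs(approx[missing] - filled[missing])) if missing.any() else 0.0
        filled[missing] = approx[missing]  # observed cells are never overwritten
        if change < tol:
            break
    return filled


# Small example: the second column is roughly twice the first (high intercorrelation).
X = np.array([[1.0, 2.0],
              [2.0, 4.0],
              [3.0, np.nan]])
print(mean_substitution(X))     # missing cell becomes the column mean, 3.0
print(iterative_svd_impute(X))  # missing cell is estimated from the correlation structure
```

Mean substitution ignores the relationship between variables, whereas the iterative scheme uses it, which is one plausible reading of why the abstract reports the iterative techniques doing better when intercorrelation is moderate or high.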
Pages: 853-877
Page count: 25