Multiple imputation: Review of theory, implementation and software

Cited by: 303
Authors
Harel, Ofer
Zhou, Xiao-Hua
Affiliations
[1] Univ Connecticut, Dept Stat, Storrs, CT 06269 USA
[2] VA Puget Sound Hlth Care Syst, HSR&D Ctr Excellence, Seattle, WA 98108 USA
[3] Univ Washington, Sch Publ Hlth, Dept Biostat, Seattle, WA 98195 USA
Keywords
multiple imputation; sensitivity and specificity; diagnostic tests; pattern-mixture models; data augmentation; imputed data; missing data; posterior distributions; incomplete data; drop-out; tests; verification; estimators
DOI
10.1002/sim.2787
Chinese Library Classification (CLC) number
Q [Biological Sciences]
Discipline classification codes
07; 0710; 09
Abstract
Missing data are a common complication in data analysis. In many medical settings, missing data can cause difficulties in estimation, precision and inference. Multiple imputation (MI) (Multiple Imputation for Nonresponse in Surveys. Wiley: New York, 1987) is a simulation-based approach to dealing with incomplete data. Although there are many different methods for dealing with incomplete data, MI has become one of the leading methods. Since the late 1980s we have observed a constant increase in the use and publication of MI-related research. This tutorial does not attempt to cover all the material concerning MI; rather, it provides an overview that brings together the theory behind MI and the implementation of MI, and discusses the increasing possibilities for using MI with commercial and free software. We illustrate some of the major points using an example from an Alzheimer disease (AD) study. In this AD study, while clinical data are available for all subjects, postmortem data are available only for the subset of those who died and underwent an autopsy. Analysis of incomplete data requires making unverifiable assumptions. These assumptions are discussed in detail in the text. Relevant S-Plus code is provided. Copyright (C) 2007 John Wiley & Sons, Ltd.
Pages: 3057-3077
Page count: 21