How many imputations are really needed? - Some practical clarifications of multiple imputation theory

Cited by: 1928
Authors
Graham, John W. [1]
Olchowski, Allison E. [1]
Gilreath, Tamika D. [1]
Affiliation
[1] Penn State Univ, Dept Biobehav Hlth, University Pk, PA 16802 USA
Keywords
multiple imputation; number of imputations; full information maximum likelihood; missing data; statistical power;
DOI
10.1007/s11121-007-0070-9
Chinese Library Classification (CLC) number
R1 [Preventive medicine, hygiene];
Subject classification code
1004 ; 120402 ;
Abstract
Multiple imputation (MI) and full information maximum likelihood (FIML) are the two most common approaches to missing data analysis. In theory, MI and FIML are equivalent when identical models are tested using the same variables, and when m, the number of imputations performed with MI, approaches infinity. However, it is important to know how many imputations are necessary before MI and FIML are sufficiently equivalent in ways that are important to prevention scientists. MI theory suggests that small values of m, even on the order of three to five imputations, yield excellent results. Previous guidelines for sufficient m are based on relative efficiency, which involves the fraction of missing information (gamma) for the parameter being estimated, and m. In the present study, we used a Monte Carlo simulation to test MI models across several scenarios in which gamma and m were varied. Standard errors and p-values for the regression coefficient of interest varied as a function of m, but not at the same rate as relative efficiency. Most importantly, statistical power for small effect sizes diminished as m became smaller, and the rate of this power falloff was much greater than predicted by changes in relative efficiency. Based on our findings, we recommend that researchers using MI should perform many more imputations than previously considered sufficient. These recommendations are based on gamma, and take into consideration one's tolerance for a preventable power falloff (compared to FIML) due to using too few imputations.
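The relative efficiency that earlier guidelines relied on can be made concrete with a small sketch. The abstract does not state the formula; the code below assumes the standard expression from Rubin's multiple imputation theory, RE = 1 / (1 + gamma / m), on the variance scale. The function name `relative_efficiency` and the grid of gamma and m values are illustrative choices, not taken from the paper.

```python
def relative_efficiency(gamma: float, m: int) -> float:
    """Relative efficiency of m imputations versus infinitely many.

    Assumes Rubin's expression RE = 1 / (1 + gamma / m), where gamma is
    the fraction of missing information for the parameter of interest.
    """
    if not 0.0 <= gamma <= 1.0:
        raise ValueError("gamma must lie in [0, 1]")
    if m < 1:
        raise ValueError("m must be a positive integer")
    return 1.0 / (1.0 + gamma / m)


if __name__ == "__main__":
    # Illustrative grid only; these values are not the paper's scenarios.
    for gamma in (0.1, 0.5, 0.9):
        row = ", ".join(
            f"m={m}: {relative_efficiency(gamma, m):.4f}"
            for m in (3, 5, 20, 100)
        )
        print(f"gamma={gamma}: {row}")
```

Under this formula, gamma = 0.5 with m = 5 already gives an efficiency of about 0.91, which is why small m looked adequate under the older guidelines. The abstract's point is that statistical power falls off faster than this quantity suggests.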
Pages: 206-213
Page count: 8