Multiple imputation of missing income data in the National Health Interview Survey

Cited by: 146
Authors
Schenker, Nathaniel [1]
Raghunathan, Trivellore E.
Chiu, Pei-Lu
Makuc, Diane M.
Zhang, Guangyu
Cohen, Alan J.
Affiliations
[1] Ctr Dis Control & Prevent, Natl Ctr Hlth Stat, Div Hlth Interview Stat, Hyattsville, MD 20782 USA
[2] Univ Michigan, Inst Social Res, Sch Publ Hlth & Res Prof, Ann Arbor, MI 48106 USA
[3] Ctr Dis Control & Prevent, Natl Ctr Hlth Stat, Off Anal & Epidemiol, Hyattsville, MD 20782 USA
[4] Univ Michigan, Dept Biostat, Ann Arbor, MI 48106 USA
Funding
National Science Foundation (USA)
Keywords
health insurance; health status; missing data; poverty; public-use data; sequential regression multivariate imputation
DOI
10.1198/016214505000001375
Chinese Library Classification
O21 [Probability theory and mathematical statistics]; C8 [Statistics]
Discipline classification codes
020208; 070103; 0714
Abstract
The National Health Interview Survey (NHIS) provides a rich source of data for studying relationships between income and health and for monitoring health and health care for persons at different income levels. However, the nonresponse rates are high for two key items, total family income in the previous calendar year and personal earnings from employment in the previous calendar year. To handle the missing data on family income and personal earnings in the NHIS, multiple imputation of these items, along with employment status and ratio of family income to the federal poverty threshold (derived from the imputed values of family income), has been performed for the survey years 1997-2004. (There are plans to continue this work for years beyond 2004 as well.) Files of the imputed values, as well as documentation, are available at the NHIS website (http://www.cdc.gov/nchs/nhis.htm). This article describes the approach used in the multiple-imputation project and evaluates the methods through analyses of the multiply imputed data. The analyses suggest that imputation corrects for biases that occur in estimates based on the data without imputation, and that multiple imputation results in gains in efficiency as well.
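The efficiency claim in the abstract rests on how estimates from the separate completed data sets are pooled. As a minimal sketch, assuming the standard multiple-imputation combining rules (Rubin's rules) rather than the authors' own software, the pooling step might look as follows. The function name `combine_estimates` and the numbers in the usage note are hypothetical and are not taken from the NHIS files.

```python
import math


def combine_estimates(estimates, variances):
    """Pool point estimates and their variances from m completed data sets.

    Assumes the standard combining rules for multiple imputation:
    total variance = within-imputation variance
                     + (1 + 1/m) * between-imputation variance.
    """
    m = len(estimates)
    if m < 2 or len(variances) != m:
        raise ValueError("need at least two imputations with matching variances")

    q_bar = sum(estimates) / m                                # pooled point estimate
    u_bar = sum(variances) / m                                # within-imputation variance
    b = sum((q - q_bar) ** 2 for q in estimates) / (m - 1)    # between-imputation variance
    t = u_bar + (1.0 + 1.0 / m) * b                           # total variance

    # Large-sample degrees of freedom; infinite when the imputations agree exactly.
    if b == 0:
        df = math.inf
    else:
        df = (m - 1) * (1.0 + u_bar / ((1.0 + 1.0 / m) * b)) ** 2

    return {"estimate": q_bar, "variance": t, "std_error": math.sqrt(t), "df": df}
```

For example, `combine_estimates([10.0, 12.0, 11.0], [1.0, 1.0, 1.0])` pools three hypothetical estimates. The between-imputation term is what carries the extra uncertainty due to the missing income values, which a single imputation would ignore.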
Pages: 924-933
Page count: 10