Bias due to missing exposure data using complete-case analysis in the proportional hazards regression model

Cited by: 63
Authors
Demissie, S
LaValley, MP
Horton, NJ
Glynn, RJ
Cupples, LA
Affiliations
[1] Boston Univ, Sch Publ Hlth, Dept Biostat, Boston, MA 02118 USA
[2] Boston Univ, Sch Med, Dept Med, Boston, MA 02118 USA
[3] Harvard Med Sch, Dept Med, Boston, MA USA
Keywords
complete case analysis; missing covariate; bias; hazard ratio
DOI
10.1002/sim.1340
Chinese Library Classification
Q [Biological Sciences]
Discipline classification codes
07; 0710; 09
Abstract
We studied bias due to missing exposure data in the proportional hazards regression model when using complete-case analysis (CCA). Eleven missing data scenarios were considered: one with missing completely at random (MCAR), four missing at random (MAR), and six non-ignorable missingness scenarios, with a variety of hazard ratios, censoring fractions, missingness fractions and sample sizes. When missingness was MCAR or dependent only on the exposure, there was negligible bias (2-3 per cent) that was similar to the difference between the estimate in the full data set with no missing data and the true parameter. In contrast, substantial bias occurred when missingness was dependent on outcome or both outcome and exposure. For models with hazard ratio of 3.5, a sample size of 400, 20 per cent censoring and 40 per cent missing data, the relative bias for the hazard ratio ranged between 7 per cent and 64 per cent. We observed important differences in the direction and magnitude of biases under the various missing data mechanisms. For example, in scenarios where missingness was associated with longer or shorter follow-up, the biases were notably different, although both mechanisms are MAR. The hazard ratio was underestimated (with larger bias) when missingness was associated with longer follow-up and overestimated (with smaller bias) when associated with shorter follow-up. If it is known that missingness is associated with a less frequently observed outcome or with both the outcome and exposure, CCA may result in an invalid inference and other methods for handling missing data should be considered. Copyright (C) 2003 John Wiley & Sons, Ltd.
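The comparison described in the abstract can be illustrated with a small simulation. The following is a minimal sketch under stated assumptions, not the authors' simulation design. It assumes a single binary exposure with prevalence 0.5, exponential event times, exponential censoring at a rate chosen to give roughly 20 per cent censoring, and two illustrative missingness mechanisms that each remove about 40 per cent of exposure values: MCAR, and missingness that is more likely for subjects with longer follow-up. The function names (`simulate`, `cox_log_hr`, `complete_cases`), the seed, the censoring rate and the sample size of 4000 (larger than the 400 in the abstract, to reduce simulation noise) are all hypothetical choices made for this example.

```python
import math
import random


def simulate(n, hr, censor_rate, rng):
    """Draw exponential event times for a binary exposure, with independent censoring."""
    times, events, x = [], [], []
    for _ in range(n):
        xi = 1 if rng.random() < 0.5 else 0
        t = rng.expovariate(hr if xi else 1.0)   # hazard is hr for exposed, 1 for unexposed
        c = rng.expovariate(censor_rate)         # independent censoring time
        times.append(min(t, c))
        events.append(t <= c)
        x.append(xi)
    return times, events, x


def cox_log_hr(times, events, x, iters=25):
    """Newton-Raphson on the Cox partial likelihood for one binary covariate."""
    # Scan from longest to shortest follow-up so the risk set grows as we go.
    order = sorted(range(len(times)), key=lambda i: -times[i])
    beta = 0.0
    for _ in range(iters):
        n_exp = n_unexp = 0          # exposed / unexposed subjects currently at risk
        score = info = 0.0
        w = math.exp(beta)
        for i in order:
            if x[i] == 1:
                n_exp += 1
            else:
                n_unexp += 1
            if events[i]:
                # Weighted share of the risk set that is exposed.
                p = w * n_exp / (w * n_exp + n_unexp)
                score += x[i] - p
                info += p * (1.0 - p)
        step = score / info
        beta += step
        if abs(step) < 1e-10:
            break
    return beta


def complete_cases(times, events, x, p_missing, rng):
    """Keep only subjects whose exposure is observed; p_missing(i) is the missingness probability."""
    keep = [i for i in range(len(x)) if rng.random() >= p_missing(i)]
    return ([times[i] for i in keep],
            [events[i] for i in keep],
            [x[i] for i in keep])


rng = random.Random(2003)                         # arbitrary seed for reproducibility
TRUE_HR = 3.5
times, events, x = simulate(4000, TRUE_HR, 0.4, rng)
median_t = sorted(times)[len(times) // 2]

hr_full = math.exp(cox_log_hr(times, events, x))

# MCAR: every subject has the same 40% chance of a missing exposure.
hr_mcar = math.exp(cox_log_hr(*complete_cases(times, events, x, lambda i: 0.4, rng)))

# Outcome-dependent: exposure is missing for 80% of subjects with above-median follow-up.
hr_long = math.exp(cox_log_hr(*complete_cases(
    times, events, x, lambda i: 0.8 if times[i] > median_t else 0.0, rng)))

for label, hr in [("full data", hr_full), ("CCA, MCAR", hr_mcar),
                  ("CCA, missing with longer follow-up", hr_long)]:
    print(f"{label}: HR = {hr:.2f}, relative bias = {100 * (hr - TRUE_HR) / TRUE_HR:+.1f}%")
```

Under these assumptions, the MCAR complete-case estimate should stay close to the full-data estimate, while the estimate under missingness tied to longer follow-up should fall below it, in line with the direction of bias the abstract reports. The magnitudes depend on the arbitrary settings above and are not meant to reproduce the paper's figures.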
Pages: 545-557
Number of pages: 13