Regression analysis of longitudinal binary data with time-dependent environmental covariates: bias and efficiency

Cited by: 46
Authors
Schildcrout, JS
Heagerty, PJ
Affiliations
[1] Vanderbilt Univ, Dept Biostat, Nashville, TN 37232 USA
[2] Univ Washington, Dept Biostat, Seattle, WA 98195 USA
Keywords
ALR; bias-variance trade-off; GEE; longitudinal data; marginal model
DOI
10.1093/biostatistics/kxi033
Chinese Library Classification (CLC) number
Q [Biological Sciences]
Discipline classification codes
07; 0710; 09
Abstract
Generalized estimating equations (GEE; Liang and Zeger, 1986) are a widely used, moment-based procedure for estimating marginal regression parameters. However, a subtle and often overlooked point is that valid inference requires the mean for the response at time t to be expressed properly as a function of the complete past, present, and future values of any time-varying covariate. For example, with environmental exposures it may be necessary to express the response as a function of multiple lagged values of the covariate series. Although multiple lagged covariates may be predictive of outcomes, researchers often focus on parameters in a 'cross-sectional' model, where the response is expressed as a function of a single lag in the covariate series. Cross-sectional models yield parameters with simple interpretations and avoid the collinearity associated with multiple lagged values of a covariate. Pepe and Anderson (1994) showed that parameter estimates for time-varying covariates may be biased unless the mean, given all past, present, and future covariate values, is equal to the cross-sectional mean or unless independence estimating equations are used. Although working independence avoids potential bias, many authors have shown that a poor choice for the response correlation model can lead to highly inefficient parameter estimates. The purpose of this paper is to study the bias-efficiency trade-off associated with working correlation choices for binary response data. We investigate data characteristics or design features (e.g. cluster size, overall response association, functional form of the response association, covariate distribution, and others) that influence the small and large sample characteristics of parameter estimates obtained from several different weighting schemes or, equivalently, 'working' covariance models. We find that the impact of covariance model choice depends strongly on the specific structure of the data features, and that key aspects should be examined before choosing a weighting scheme.
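The condition described in the abstract can be written out as a short sketch. The notation below (Y_{it}, X_{it}, \mu_{it}, D_i, V_i, \beta, and the logit link) is assumed here for illustration and does not appear in this record.

```latex
% Cross-sectional marginal mean for subject i at time t
% (binary response; the logit link is an assumption of this sketch).
\[
  \mu_{it} = E[Y_{it} \mid X_{it}], \qquad
  \operatorname{logit}(\mu_{it}) = \beta_0 + \beta_1 X_{it}.
\]

% Condition under which any working correlation gives unbiased estimating
% equations: the cross-sectional mean equals the mean given the full
% covariate history (past, present, and future).
\[
  E[Y_{it} \mid X_{it}] = E[Y_{it} \mid X_{i1}, \ldots, X_{iT}]
  \quad \text{for all } t.
\]

% Generic GEE with working covariance V_i for the response vector Y_i.
\[
  \sum_{i=1}^{N} D_i^{\top} V_i^{-1} (Y_i - \mu_i) = 0,
  \qquad D_i = \frac{\partial \mu_i}{\partial \beta^{\top}}.
\]

% Working independence: V_i is diagonal with entries v_{it}, so the
% residual at time t is weighted only by functions of X_{it}.
\[
  \sum_{i=1}^{N} \sum_{t=1}^{T}
  \frac{\partial \mu_{it}}{\partial \beta}\, v_{it}^{-1}\,(Y_{it} - \mu_{it}) = 0.
\]
```

Under working independence each summand has mean zero given X_{it} alone, so only the cross-sectional model needs to hold. With a non-diagonal V_i, the residual at time t is multiplied by functions of covariates at other times, which is why the full-history condition is then required.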
Pages: 633-652
Page count: 20