On the use of zero-inflated and Hurdle models for modeling vaccine adverse event count data

Cited by: 190
Authors
Rose, C. E.
Martin, S. W.
Wannemuehler, K. A.
Plikaytis, B. D.
Affiliations
[1] CDC, Bacterial Vaccine Preventable Dis Branch, Div Epidemiol & Surveillance, Atlanta, GA 30333 USA
[2] Ctr Dis Control & Prevent, Biostat Off, Div Bacterial & Mycot Dis, Natl Ctr Infect Dis, Atlanta, GA USA
Keywords
excess zeroes; Hurdle model; Negative Binomial; Poisson; vaccine adverse events; zero-inflated model
DOI
10.1080/10543400600719384
Chinese Library Classification number
R9 [Pharmacy]
Discipline classification code
1007 [Pharmacy]
Abstract
We compared several modeling strategies for vaccine adverse event count data in which the data are characterized by excess zeroes and heteroskedasticity. Count data are routinely modeled using Poisson and Negative Binomial (NB) regression, but zero-inflated and hurdle models may be advantageous in this setting. Here we compared the fit of the Poisson, Negative Binomial (NB), zero-inflated Poisson (ZIP), zero-inflated Negative Binomial (ZINB), Poisson Hurdle (PH), and Negative Binomial Hurdle (NBH) models. In general, for public health studies, we may conceptualize zero-inflated models as allowing zeroes to arise from at-risk and not-at-risk populations. In contrast, hurdle models may be conceptualized as having zeroes only from an at-risk population. Our results illustrate, for our data, that the ZINB and NBH models are preferred but that these models are indistinguishable with respect to fit. Choosing between the zero-inflated and hurdle modeling frameworks, assuming Poisson and NB models are inadequate because of excess zeroes, should generally be based on the study design and purpose. If the study's purpose is inference, then the modeling framework should be considered. For example, if the study design leads to count endpoints with both structural and sample zeroes, then the zero-inflated modeling framework is generally more appropriate, while in contrast, if the endpoint of interest, by design, only exhibits sample zeroes (e.g., at-risk participants), then the hurdle model framework is generally preferred. Conversely, if the study's primary purpose is to develop a prediction model, then both the zero-inflated and hurdle modeling frameworks should be adequate.
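The distinction the abstract draws between the two frameworks can be sketched with the Poisson versions of each model. The following is a minimal illustration under stated assumptions, not the authors' implementation: the function names and the parameters `lam` (Poisson mean), `pi` (probability of a structural zero), and `p0` (probability of not crossing the hurdle) are hypothetical and do not appear in the record. In the zero-inflated form, zeroes come from two sources (a structural-zero component plus ordinary Poisson zeroes); in the hurdle form, all zeroes come from a single binary component and positive counts follow a zero-truncated Poisson.

```python
import math


def poisson_pmf(k, lam):
    """Standard Poisson probability P(Y = k) with mean lam."""
    return math.exp(-lam) * lam ** k / math.factorial(k)


def zip_pmf(k, lam, pi):
    """Zero-inflated Poisson (assumed parameterization).

    With probability pi the observation is a structural zero;
    otherwise it is drawn from Poisson(lam), which can also yield
    a (sample) zero.
    """
    base = (1.0 - pi) * poisson_pmf(k, lam)
    return pi + base if k == 0 else base


def hurdle_poisson_pmf(k, lam, p0):
    """Poisson hurdle (assumed parameterization).

    A zero occurs with probability p0; positive counts follow a
    Poisson(lam) truncated at zero, so zeroes have a single source.
    """
    if k == 0:
        return p0
    truncated = poisson_pmf(k, lam) / (1.0 - math.exp(-lam))
    return (1.0 - p0) * truncated


if __name__ == "__main__":
    # Illustrative parameter values only; they are not taken from the paper.
    print("ZIP    P(Y=0):", zip_pmf(0, lam=2.0, pi=0.3))
    print("Hurdle P(Y=0):", hurdle_poisson_pmf(0, lam=2.0, p0=0.4))
```

Both functions define proper probability distributions, which is consistent with the abstract's observation that the two frameworks can fit the same data comparably well and differ mainly in how the zeroes are interpreted.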
Pages: 463-481
Page count: 19