On the use of zero-inflated and Hurdle models for modeling vaccine adverse event count data

Cited by: 190

Authors:

Rose, C. E.

Martin, S. W.

Wannemuehler, K. A.

Plikaytis, B. D.

Affiliations:

[1] CDC, Bacterial Vaccine Preventable Dis Branch, Div Epidemiol & Surveillance, Atlanta, GA 30333 USA

[2] Ctr Dis Control & Prevent, Biostat Off, Div Bacterial & Mycot Dis, Natl Ctr Infect Dis, Atlanta, GA USA

Source:

JOURNAL OF BIOPHARMACEUTICAL STATISTICS | 2006 / Vol. 16 / No. 4

Keywords:

excess zeroes; Hurdle model; Negative Binomial; Poisson; vaccine adverse events; zero-inflated model

DOI:

10.1080/10543400600719384

Chinese Library Classification (CLC) number:

R9 [Pharmacy];

Discipline classification code:

1007 [Pharmacy];

Abstract:

We compared several modeling strategies for vaccine adverse event count data in which the data are characterized by excess zeroes and heteroskedasticity. Count data are routinely modeled using Poisson and Negative Binomial (NB) regression, but zero-inflated and hurdle models may be advantageous in this setting. Here we compared the fit of the Poisson, Negative Binomial (NB), zero-inflated Poisson (ZIP), zero-inflated Negative Binomial (ZINB), Poisson Hurdle (PH), and Negative Binomial Hurdle (NBH) models. In general, for public health studies, we may conceptualize zero-inflated models as allowing zeroes to arise from at-risk and not-at-risk populations. In contrast, hurdle models may be conceptualized as having zeroes only from an at-risk population. Our results illustrate, for our data, that the ZINB and NBH models are preferred, but these models are indistinguishable with respect to fit. Choosing between the zero-inflated and hurdle modeling frameworks, assuming Poisson and NB models are inadequate because of excess zeroes, should generally be based on the study design and purpose. If the study's purpose is inference, then the modeling framework should be considered. For example, if the study design leads to count endpoints with both structural and sample zeroes, then the zero-inflated modeling framework is generally more appropriate; in contrast, if the endpoint of interest, by design, only exhibits sample zeroes (e.g., at-risk participants), then the hurdle model framework is generally preferred. Conversely, if the study's primary purpose is to develop a prediction model, then both the zero-inflated and hurdle modeling frameworks should be adequate.
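The distinction the abstract draws between the two frameworks can be illustrated with a minimal sketch of the intercept-only Poisson versions (ZIP and PH). This is not the authors' code; the function names and the parameters `lam`, `pi`, and `p0` are assumptions chosen for illustration, and the paper's models additionally include covariates and Negative Binomial variants that are not shown here.

```python
import math


def poisson_pmf(k, lam):
    """Ordinary Poisson probability P(Y = k) with mean lam."""
    return math.exp(-lam) * lam ** k / math.factorial(k)


def zip_pmf(k, lam, pi):
    """Zero-inflated Poisson (illustrative parameterization).

    With probability pi the observation is a structural zero
    (not-at-risk); otherwise it is drawn from Poisson(lam), which can
    itself produce a sample zero.
    """
    base = (1.0 - pi) * poisson_pmf(k, lam)
    return pi + base if k == 0 else base


def hurdle_pmf(k, lam, p0):
    """Poisson hurdle (illustrative parameterization).

    All zeroes come from a single process with probability p0; positive
    counts follow a zero-truncated Poisson(lam).
    """
    if k == 0:
        return p0
    truncated = poisson_pmf(k, lam) / (1.0 - math.exp(-lam))
    return (1.0 - p0) * truncated


# Hypothetical parameter values, chosen only for illustration.
lam, pi = 2.0, 0.3
p0 = zip_pmf(0, lam, pi)  # give the hurdle model the same zero probability

for k in range(5):
    print(k, round(zip_pmf(k, lam, pi), 6), round(hurdle_pmf(k, lam, p0), 6))
```

In this covariate-free case, setting `p0` to the ZIP zero probability makes the two probability mass functions coincide, which is one way to see how a zero-inflated and a hurdle model can be indistinguishable with respect to fit while carrying different interpretations of where the zeroes come from.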


Pages: 463-481

Page count: 19
