ZERO-INFLATED POISSON REGRESSION, WITH AN APPLICATION TO DEFECTS IN MANUFACTURING

Cited by: 2538
Author
LAMBERT, D
Keywords
EM ALGORITHM; NEGATIVE BINOMIAL; OVERDISPERSION; POSITIVE POISSON; QUALITY CONTROL
DOI
10.2307/1269547
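Chinese Library Classification (CLC) number
O21 [Probability theory and mathematical statistics]; C8 [Statistics]
Subject classification code
020208; 070103; 0714
Abstract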
Zero-inflated Poisson (ZIP) regression is a model for count data with excess zeros. It assumes that with probability p the only possible observation is 0, and with probability 1 - p, a Poisson(lambda) random variable is observed. For example, when manufacturing equipment is properly aligned, defects may be nearly impossible. But when it is misaligned, defects may occur according to a Poisson(lambda) distribution. Both the probability p of the perfect, zero-defect state and the mean number of defects, lambda, in the imperfect state may depend on covariates. Sometimes p and lambda are unrelated; other times p is a simple function of lambda such as p = 1/(1 + lambda^tau) for an unknown constant tau. In either case, ZIP regression models are easy to fit. The maximum likelihood estimates (MLEs) are approximately normal in large samples, and confidence intervals can be constructed by inverting likelihood ratio tests or using the approximate normality of the MLEs. Simulations suggest that the confidence intervals based on likelihood ratio tests are better, however. Finally, ZIP regression models are not only easy to interpret, but they can also lead to more refined data analyses. For example, in an experiment concerning soldering defects on printed wiring boards, two sets of conditions gave about the same mean number of defects, but the perfect state was more likely under one set of conditions and the mean number of defects in the imperfect state was smaller under the other set of conditions; that is, ZIP regression can show not only which conditions give a lower mean number of defects but also why the means are lower.
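
The mixture described in the abstract can be made concrete with a small sketch. The Python code below is an illustration only, not the paper's implementation: it covers the simplest case with no covariates (a single p and a single lambda), and it fits them with a basic EM iteration that treats the unobserved perfect/imperfect state as missing data. The function names `zip_pmf` and `fit_zip_em`, the starting values, and the stopping rule are assumptions made for this example.

```python
import math
import numpy as np


def zip_pmf(y, p, lam):
    """P(Y = y) when Y is 0 with probability p, else Poisson(lam)."""
    y = np.asarray(y, dtype=int)
    # log(y!) computed elementwise via the log-gamma function.
    log_fact = np.array([math.lgamma(k + 1) for k in y.ravel()]).reshape(y.shape)
    poisson = np.exp(-lam + y * np.log(lam) - log_fact)
    # A zero can come from either state; a positive count only from the Poisson state.
    return np.where(y == 0, p + (1 - p) * poisson, (1 - p) * poisson)


def fit_zip_em(y, n_iter=500, tol=1e-10):
    """Intercept-only ZIP fit by EM (hypothetical helper; no covariates)."""
    y = np.asarray(y, dtype=float)
    is_zero = y == 0
    # Assumed starting values: half the observed zeros are "perfect state" zeros.
    p, lam = 0.5 * is_zero.mean(), max(y.mean(), 1e-8)
    for _ in range(n_iter):
        # E-step: posterior probability that each observation is in the perfect state.
        z = np.where(is_zero, p / (p + (1 - p) * np.exp(-lam)), 0.0)
        # M-step: update p and the weighted Poisson mean.
        p_new = z.mean()
        lam_new = ((1 - z) * y).sum() / (1 - z).sum()
        converged = abs(p_new - p) + abs(lam_new - lam) < tol
        p, lam = p_new, lam_new
        if converged:
            break
    return p, lam


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    n = 20000
    perfect = rng.random(n) < 0.3            # simulated perfect (zero-defect) state
    counts = np.where(perfect, 0, rng.poisson(2.0, n))
    print(fit_zip_em(counts))                # estimates of (p, lambda)
```

Under this model the overall mean is (1 - p) * lambda, so different (p, lambda) pairs can share a mean: p = 0.5 with lambda = 4 and p = 0 with lambda = 2 both give a mean of 2 defects. This is the sense in which the abstract says ZIP regression can show why means are lower, not only which are lower.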
Pages: 1-14
Page count: 14