MODELS FOR CATEGORICAL-DATA WITH NONIGNORABLE NONRESPONSE

Cited by: 57
Authors
PARK, T
BROWN, MB
Affiliations
[1] UNIV MICHIGAN,DEPT BIOSTAT,ANN ARBOR,MI 48109
[2] NICHHD,BIOMETRY & MATH STAT BRANCH,BETHESDA,MD 20892
Keywords
BAYESIAN ESTIMATOR; EM ALGORITHM; LOG-LINEAR MODEL; MAXIMUM LIKELIHOOD ESTIMATOR; NONIGNORABLE MISSING DATA; PRIOR DISTRIBUTION; SMOOTHING
DOI
10.2307/2291199
Chinese Library Classification
O21 [Probability theory and mathematical statistics]; C8 [Statistics]
Subject classification codes
020208; 070103; 0714
Abstract
When categorical outcomes are subject to nonignorable nonresponse, log-linear models may be used to adjust for the nonresponse. The models are fitted to the data in an augmented frequency table in which one index corresponds to whether or not the subject is a respondent. The likelihood function is maximized over pseudo-observed cell frequencies with respect to this log-linear model using an EM algorithm. Each E step of the EM algorithm determines the pseudo-observed cell frequencies, and the M step yields the maximum likelihood estimators (MLE's) of these pseudo-observed cell frequencies. This approach may produce boundary estimates for the expected cell frequencies of the nonrespondents. In these cases the estimators of the log-linear model parameters are not uniquely determined and may be unstable. Following the approach of Clogg et al., we propose a Bayesian method that uses smoothing constants to adjust the pseudo-observed cell frequencies so that the solution is not on the boundary. The role of smoothing constants is similar to that of the flattening constant k in ridge regression: the use of k is intended to overcome ill-conditioned situations where correlations between the various predictors in the regression model produce unstable parameter estimates. The Bayesian estimation procedure is illustrated using data from a cross-sectional study of obesity in school-age children. Through a simulation study, we show that when fitting nonignorable nonresponse models, the mean squared errors of the expected cell frequencies obtained by the Bayesian procedure can be much smaller than those of the MLE's.
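The E and M steps described in the abstract can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration and is not taken from the paper: the function name `em_nonignorable`, a two-way table of a covariate X by an outcome Y, nonrespondents classified by X only, and the log-linear model [XY][YR] (nonresponse depends on the outcome alone), which happens to have closed-form fitted values. The `smooth` argument adds a constant to every pseudo-observed nonrespondent cell; it only mimics the idea of a smoothing constant and is not the authors' prior specification.

```python
import numpy as np

def em_nonignorable(n_resp, m_nonresp, smooth=0.0, n_iter=500, tol=1e-10):
    """EM sketch for the assumed model [XY][YR].

    n_resp    : (I, J) counts of respondents by X (rows) and Y (columns)
    m_nonresp : (I,) counts of nonrespondents, classified by X only
    smooth    : hypothetical constant added to each pseudo-observed cell
    Returns fitted counts (mu_resp, mu_non), each of shape (I, J).
    """
    n_resp = np.asarray(n_resp, dtype=float)
    m_nonresp = np.asarray(m_nonresp, dtype=float)
    n_cols = n_resp.shape[1]

    # Start by spreading each nonrespondent row evenly over the Y levels.
    mu_non = np.tile(m_nonresp[:, None] / n_cols, (1, n_cols))
    mu_resp = n_resp.copy()

    for _ in range(n_iter):
        # E step: pseudo-observed nonrespondent cells, allocated in
        # proportion to the current fitted nonrespondent counts.
        z = m_nonresp[:, None] * mu_non / mu_non.sum(axis=1, keepdims=True)
        z = z + smooth  # illustrative smoothing; keeps cells off the boundary

        # M step: closed-form fit of [XY][YR] to the completed table,
        # mu_ijr = N_ij+ * N_+jr / N_+j+.
        n_xy = n_resp + z
        n_y = n_xy.sum(axis=0)
        mu_resp = n_xy * (n_resp.sum(axis=0) / n_y)
        new_mu_non = n_xy * (z.sum(axis=0) / n_y)

        converged = np.max(np.abs(new_mu_non - mu_non)) < tol
        mu_non = new_mu_non
        if converged:
            break

    return mu_resp, mu_non
```

With `smooth=0.0` the fitted counts sum to the observed total; with a positive `smooth` the total grows by `smooth` times the number of cells, and every fitted nonrespondent count stays strictly positive.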
Pages: 44-52
Number of pages: 9
Related papers
28 in total
[1]  
Agresti A., 1990, CATEGORICAL DATA ANA
[2]  
ALBERT JH, 1985, BAYESIAN STATISTICS, V2, P589
[3]   CLOSED-FORM ESTIMATES FOR MISSING COUNTS IN 2-WAY CONTINGENCY-TABLES [J].
BAKER, SG ;
ROSENBERGER, WF ;
DERSIMONIAN, R .
STATISTICS IN MEDICINE, 1992, 11 (05) :643-657
[4]   REGRESSION-ANALYSIS FOR CATEGORICAL VARIABLES WITH OUTCOME SUBJECT TO NONIGNORABLE NONRESPONSE [J].
BAKER, SG ;
LAIRD, NM .
JOURNAL OF THE AMERICAN STATISTICAL ASSOCIATION, 1988, 83 (401) :62-69
[5]  
BASU D, 1983, J STAT PLANNING INFE, V6, P345
[6]  
Bishop Y., 1975, DISCRETE MULTIVARIAT
[7]   MULTINOMIAL SAMPLING WITH PARTIALLY CATEGORIZED DATA [J].
BLUMENTHAL, S .
JOURNAL OF THE AMERICAN STATISTICAL ASSOCIATION, 1968, 63 (322) :542-551
[8]   PROTECTING AGAINST NONRANDOMLY MISSING DATA IN LONGITUDINAL-STUDIES [J].
BROWN, CH .
BIOMETRICS, 1990, 46 (01) :143-155
[9]   AN EFFICIENT 2-STAGE PROCEDURE FOR GENERATING RANDOM VARIATES FROM THE MULTINOMIAL DISTRIBUTION [J].
BROWN, MB ;
BROMBERG, J .
AMERICAN STATISTICIAN, 1984, 38 (03) :216-219
[10]   2-DIMENSIONAL CONTINGENCY-TABLES WITH BOTH COMPLETELY AND PARTIALLY CROSS-CLASSIFIED DATA [J].
CHEN, T ;
FIENBERG, SE .
BIOMETRICS, 1974, 30 (04) :629-642