A preliminary investigation of maximum likelihood logistic regression versus exact logistic regression

Cited by: 55
Authors
King, EN [1]
Ryan, TP [1]
Affiliation
[1] Progress Corp, Mayfield Hts, OH 44143 USA
Keywords
complete separation; near separation; rare events; sparse data
DOI
10.1198/00031300283
Chinese Library Classification
O21 [Probability theory and mathematical statistics]; C8 [Statistics]
Discipline classification codes
020208; 070103; 0714
Abstract
Logistic regression is used by practitioners and researchers in many fields, but is undoubtedly used most frequently in medical and biostatistical applications. Maximum likelihood is generally the estimation method of choice, but we show that maximum likelihood can produce very poor results under certain conditions. Specifically, the poor performance of maximum likelihood in the case of rare events is known and we review research on this topic. We primarily examine the performance of maximum likelihood in the presence of near separation, which has apparently not been studied. Exact logistic regression is the logical alternative to maximum likelihood. We offer a comparison of the two methods of estimation.
Pages: 163-170
Page count: 8