Monte Carlo validation of a multireader method for receiver operating characteristic discrete rating data: Factorial experimental design

Cited by: 91
Authors
Dorfman, DD
Berbaum, KS
Lenth, RV
Chen, YF
Donaghy, BA
Affiliations
[1] Univ Iowa, Dept Psychol, Iowa City, IA 52242 USA
[2] Univ Iowa, Dept Radiol, Iowa City, IA 52242 USA
[3] Univ Iowa, Dept Stat & Actuarial Sci, Iowa City, IA 52242 USA
Keywords
decision theory; diagnostic radiology; receiver operating characteristic (ROC) curve
DOI
10.1016/S1076-6332(98)80294-8
Chinese Library Classification
R8 [Special medicine]; R445 [Diagnostic imaging]
Discipline classification codes
1002; 100207; 1009
Abstract
Rationale and Objectives. The authors conducted a series of null-case Monte Carlo simulations to evaluate the Dorfman-Berbaum-Metz (DBM) method for comparing modalities with multireader receiver operating characteristic (ROC) discrete rating data. Materials and Methods. Monte Carlo simulations were performed by using discrete ratings on fully crossed factorial designs with two modalities and three, five, and 10 hypothetical readers. The null hypothesis was true for all simulations. The population ROC areas, latent variable structures, case sample sizes, and normal/abnormal case sample ratios used in another study were used in these simulations. Results. For equal allocation ratios and small (A_z = 0.702) and moderate (A_z = 0.855) ROC areas, the empirical type I error rate closely matched the nominal alpha level. For very large ROC areas (A_z = 0.961), however, the empirical type I error rate was somewhat smaller than the nominal alpha level. This conservatism increased with decreasing case sample size and asymmetric normal/abnormal case allocation ratio. The empirical type I error rate was sometimes slightly larger than the nominal alpha level with many cases and few readers, where there was large residual, relatively small treatment-by-case interaction, and relatively large treatment-by-reader interaction. Conclusion. The results suggest that the DBM method provides trustworthy alpha levels with discrete ratings when the ROC area is not too large and case and reader sample sizes are not too small. In other situations, the test tends to be somewhat conservative or slightly liberal.
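The idea of an empirical type I error rate in a null-case simulation can be illustrated with a minimal Python sketch. This is not the DBM method and not the authors' simulation design. As stated assumptions: an equal-variance binormal latent model, a hypothetical five-category rating scale with made-up cutpoints, independent cases drawn afresh for every reader and modality (the study itself used fully crossed designs), and a plain paired t-test across readers as a stand-in test. All function and parameter names are hypothetical.

```python
import math
import random
from statistics import NormalDist, mean, stdev

def binormal_separation(az):
    # Equal-variance binormal model (assumption): A_z = Phi(d / sqrt(2)).
    return math.sqrt(2.0) * NormalDist().inv_cdf(az)

def simulate_ratings(n_cases, mu, cutpoints, rng):
    # Draw latent N(mu, 1) values and bin them into ratings 1..len(cutpoints)+1.
    ratings = []
    for _ in range(n_cases):
        x = rng.gauss(mu, 1.0)
        ratings.append(1 + sum(x > c for c in cutpoints))
    return ratings

def empirical_auc(normal, abnormal):
    # Trapezoidal (Wilcoxon) area: ties between ratings count as one half.
    score = sum(1.0 if a > n else 0.5 if a == n else 0.0
                for a in abnormal for n in normal)
    return score / (len(normal) * len(abnormal))

def null_type1_rate(az=0.855, n_readers=5, n_normal=25, n_abnormal=25,
                    n_sims=200, t_crit=2.776, seed=0):
    # t_crit = 2.776 is the two-sided 5% critical value for 4 degrees of
    # freedom, so it matches the default of five readers only.
    rng = random.Random(seed)
    d = binormal_separation(az)
    cutpoints = [d * f for f in (0.0, 0.33, 0.67, 1.0)]  # hypothetical scale
    rejections = 0
    for _sim in range(n_sims):
        diffs = []
        for _reader in range(n_readers):
            aucs = []
            for _modality in range(2):  # same population A_z: null is true
                nor = simulate_ratings(n_normal, 0.0, cutpoints, rng)
                abn = simulate_ratings(n_abnormal, d, cutpoints, rng)
                aucs.append(empirical_auc(nor, abn))
            diffs.append(aucs[0] - aucs[1])
        se = stdev(diffs) / math.sqrt(n_readers)
        if se > 0 and abs(mean(diffs)) / se > t_crit:
            rejections += 1
    rate = rejections / n_sims
    mc_se = math.sqrt(rate * (1.0 - rate) / n_sims)  # binomial Monte Carlo SE
    return rate, mc_se

if __name__ == "__main__":
    rate, mc_se = null_type1_rate()
    print(f"empirical type I error rate: {rate:.3f} (Monte Carlo SE {mc_se:.3f})")
```

Comparing the returned rate with the nominal alpha level, within a few Monte Carlo standard errors, is the kind of check the abstract describes. A rate clearly below alpha would indicate a conservative test, and a rate clearly above it a liberal one.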
Pages: 591-602
Page count: 12