Observer variation and the performance accuracy gained by averaging ratings of abnormality

Cited by: 14
Authors
Swensson, RG [1]
King, JL [1]
Good, WF [1]
Gur, D [1]
Affiliations
[1] Univ Pittsburgh, Dept Radiol, Pittsburgh, PA 15261 USA
Keywords
observer variability; accuracy; ROC analysis; median ratings; reader correlations
DOI
10.1118/1.1286589
Chinese Library Classification number
R8 [Special medicine]; R445 [Diagnostic imaging]
Discipline classification codes
1002; 100207; 1009
Abstract
Six radiologists used continuous scales to rate 529 chest-film cases for likelihood of five different types of abnormalities (interstitial disease, nodule, pneumothorax, alveolar infiltrate, and rib fracture) in each of six replicated readings, yielding 36 separate ratings of each case for the five abnormalities. Separate data analyses of all cases and subsets of the difficult/subtle cases for each abnormality estimated the relative gains in accuracy (linear-scaled area below the ROC curve) obtained by averaging the case-ratings across (a) six independent replications by each reader (25% gain), (b) six different readers within each replication (34% gain), or (c) all 36 readings (48% gain). Although accuracy differed among both readers and abnormalities, ROC curves for the median ratings showed similar relative gains in accuracy, somewhat greater than those predicted from the measured rating correlations. A model for variance components in the observer's latent decision variable could predict these gains from measured correlations in the single ratings of cases. Depending on whether the model's estimates were based on realized accuracy gains or on rating correlations, about 48% or 39% of each reader's total decision variance (summed variance for positive and negative cases) consisted of random (within-reader) error that was uncorrelated between replications, another 10% or 14% came from idiosyncratic responses to individual cases, and about 43% or 47% was systematic variation that all readers found in the sampled cases. © 2000 American Association of Physicists in Medicine. [S0094-2405(00)00608-8].
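The variance-component reasoning in the abstract can be illustrated with a minimal sketch. It is not the authors' model: it assumes an equal-variance latent decision variable, a fixed mean separation between positive and negative cases, and a detectability index that scales as 1/sqrt(variance). The function name `relative_gain` and its parameters are hypothetical. The fractions used (47% systematic case variation, 14% idiosyncratic reader-by-case variation, 39% within-reader random error) are the correlation-based estimates quoted above; the resulting gains need not match the reported ones, because the paper's linear-scaled accuracy measure is not defined in this record.

```python
from math import sqrt


def relative_gain(case_frac, idio_frac, random_frac, n_readers=1, n_reps=1):
    """Relative gain in a detectability index from averaging ratings.

    Assumption (not from the source): the mean separation between positive
    and negative cases is fixed, so the index scales as 1 / sqrt(variance).
    """
    # Variance of a single rating: all three components at full size.
    single = case_frac + idio_frac + random_frac
    # Averaging leaves the shared case component untouched, shrinks the
    # reader-by-case component by the number of readers, and shrinks the
    # random error by the total number of readings averaged.
    averaged = (case_frac
                + idio_frac / n_readers
                + random_frac / (n_readers * n_reps))
    return sqrt(single / averaged) - 1.0


# Correlation-based variance fractions quoted in the abstract.
CASE, IDIO, RANDOM = 0.47, 0.14, 0.39

gain_replications = relative_gain(CASE, IDIO, RANDOM, n_readers=1, n_reps=6)
gain_readers = relative_gain(CASE, IDIO, RANDOM, n_readers=6, n_reps=1)
gain_all = relative_gain(CASE, IDIO, RANDOM, n_readers=6, n_reps=6)

for label, gain in [("six replications, one reader", gain_replications),
                    ("six readers, one replication", gain_readers),
                    ("all 36 readings", gain_all)]:
    print(f"{label}: {gain:.1%}")
```

Under these assumptions the ordering of the three gains follows the abstract: averaging replications removes only random error, averaging readers also removes idiosyncratic variation, and averaging all 36 readings leaves mostly the systematic case variation that no amount of averaging can reduce.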
Pages: 1920-1933
Page count: 14
Related papers
21 in total
[1]  
BAUMSTARK A, 1984, AM J ROENTGENOL, V141, P877
[2]   VISUAL SIGNAL-DETECTION .4. OBSERVER INCONSISTENCY [J].
BURGESS, AE ;
COLBORNE, B .
JOURNAL OF THE OPTICAL SOCIETY OF AMERICA A-OPTICS IMAGE SCIENCE AND VISION, 1988, 5 (04) :617-627
[3]   MAXIMUM-LIKELIHOOD ESTIMATION OF PARAMETERS OF SIGNAL-DETECTION THEORY AND DETERMINATION OF CONFIDENCE INTERVALS - RATING-METHOD DATA [J].
DORFMAN, DD ;
ALF, E .
JOURNAL OF MATHEMATICAL PSYCHOLOGY, 1969, 6 (03) :487-&
[4]   Observer performance assessment of JPEG-compressed high-resolution chest images [J].
Good, WF ;
Maitz, G ;
King, J ;
Gennari, R ;
Gur, D .
MEDICAL IMAGING 1999: IMAGE PERCEPTION AND PERFORMANCE, 1999, 3663 :8-13
[5]   CONSISTENCY OF AUDITORY DETECTION JUDGMENTS [J].
GREEN, DM .
PSYCHOLOGICAL REVIEW, 1964, 71 (05) :392-407
[6]   DISPLAY THRESHOLDING OF IMAGES AND OBSERVER DETECTION PERFORMANCE [J].
JUDY, PF ;
SWENSSON, RG .
JOURNAL OF THE OPTICAL SOCIETY OF AMERICA A-OPTICS IMAGE SCIENCE AND VISION, 1987, 4 (05) :954-965
[7]  
KENDALL MG, 1973, ADV THEORY STAT, V2, P7
[8]  
Metz C.Z., 1984, INFORMATION PROCESSI, P432
[9]   GAINS IN ACCURACY FROM REPLICATED READINGS OF DIAGNOSTIC IMAGES - PREDICTION AND ASSESSMENT IN TERMS OF ROC ANALYSIS [J].
METZ, CE ;
SHEN, JH .
MEDICAL DECISION MAKING, 1992, 12 (01) :60-75
[10]  
Metz CE, 1998, STAT MED, V17, P1033, DOI 10.1002/(SICI)1097-0258(19980515)17:9<1033::AID-SIM784>3.0.CO