The "Laboratory" effect: Comparing radiologists' performance and variability during prospective clinical and laboratory mammography interpretations

Cited by: 155
Authors
Gur, David [1]
Bandos, Andriy I. [2]
Cohen, Cathy S. [1]
Hakim, Christiane M. [1]
Hardesty, Lara A. [3]
Ganott, Marie A. [1]
Perrin, Ronald L. [1]
Poller, William R. [4]
Shah, Ratan [1]
Sumkin, Jules H. [1]
Wallace, Luisa P. [1]
Rockette, Howard E. [2]
Affiliations
[1] Univ Pittsburgh, Sch Med, Dept Radiol, Pittsburgh, PA 15213 USA
[2] Univ Pittsburgh, Sch Med, Dept Biostat, Pittsburgh, PA 15213 USA
[3] Univ Colorado Hosp, Breast Ctr, Aurora, CO USA
[4] W Penn Allegheny Hlth Syst, Pittsburgh, PA USA
Funding
National Institutes of Health (USA)
DOI
10.1148/radiol.2491072025
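Chinese Library Classification (CLC) number
R8 [Special medicine]; R445 [Diagnostic imaging]
Discipline classification code
1002 [Clinical medicine]; 100207 [Medical imaging and nuclear medicine]; 1009 [Special medicine]
Abstract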
Purpose: To compare radiologists' performance during interpretation of screening mammograms in the clinic with their performance when reading the same mammograms in a retrospective laboratory study.
Materials and Methods: This study was conducted under an institutional review board-approved, HIPAA-compliant protocol; the need for informed consent was waived. Nine experienced radiologists rated an enriched set of mammograms that they had personally read in the clinic (the "reader-specific" set) mixed with an enriched "common" set of mammograms that none of the participants had previously read in the clinic by using a screening Breast Imaging Reporting and Data System (BI-RADS) rating scale. The original clinical recommendations to recall the women for a diagnostic work-up, for both reader-specific and common sets, were compared with their recommendations during the retrospective experiment. The results are presented in terms of reader-specific and group-averaged sensitivity and specificity levels and the dispersion (spread) of reader-specific performance estimates.
Results: On average, the radiologists' performance was significantly better in the clinic than in the laboratory (P = .035). Interreader dispersion of the computed performance levels was significantly lower during the clinical interpretations (P < .01).
Conclusion: Retrospective laboratory experiments may not represent either expected performance levels or interreader variability during clinical interpretations of the same set of mammograms in the clinical environment well.
© RSNA, 2008.
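Illustrative note: the abstract reports reader-specific and group-averaged sensitivity and specificity, plus the dispersion of the reader-specific estimates. The sketch below shows one way such summaries could be computed from binary recall decisions. It is not the authors' analysis: the function names, the toy data, and the use of the sample standard deviation as the dispersion measure are all assumptions made for illustration, since the abstract does not specify them.

```python
from statistics import mean, stdev


def sensitivity_specificity(recalled, has_cancer):
    """Return (sensitivity, specificity) for one reader.

    recalled[i]   -- True if the reader recommended recall for case i
    has_cancer[i] -- True if case i was verified as cancer
    """
    pairs = list(zip(recalled, has_cancer))
    tp = sum(r and c for r, c in pairs)            # cancers recalled
    fn = sum((not r) and c for r, c in pairs)      # cancers missed
    tn = sum((not r) and (not c) for r, c in pairs)  # negatives not recalled
    fp = sum(r and (not c) for r, c in pairs)      # negatives recalled
    return tp / (tp + fn), tn / (tn + fp)


def summarize(readers, has_cancer):
    """Group-averaged levels and spread of reader-specific estimates.

    Dispersion here is the sample standard deviation across readers
    (an assumption; the abstract does not name the measure used).
    """
    per_reader = [sensitivity_specificity(r, has_cancer) for r in readers]
    sens = [s for s, _ in per_reader]
    spec = [p for _, p in per_reader]
    return {
        "mean_sensitivity": mean(sens),
        "mean_specificity": mean(spec),
        "sd_sensitivity": stdev(sens),
        "sd_specificity": stdev(spec),
    }


# Hypothetical toy data: 6 cases (2 cancers), 3 readers.
truth = [True, True, False, False, False, False]
readers = [
    [True, True, False, False, False, True],
    [True, False, False, False, False, False],
    [True, True, True, False, False, False],
]

if __name__ == "__main__":
    print(summarize(readers, truth))
```

Running the same summary once on clinical recall decisions and once on laboratory decisions for the same cases would give the two sets of levels and spreads that a comparison like the one in the abstract contrasts.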
Pages: 47-53
Page count: 7
Related papers
30 entries in total
[1]
[Anonymous], 2003, BREAST IM REP DAT SY
[2]
Association of volume and volume-independent factors with accuracy in screening mammogram interpretation [J].
Beam, CA ;
Conant, EF ;
Sickles, EA .
JOURNAL OF THE NATIONAL CANCER INSTITUTE, 2003, 95 (04) :282-290
[3]
Variability in the interpretation of screening mammograms by US radiologists - Findings from a national sample [J].
Beam, CA ;
Layde, PM ;
Sullivan, DC .
ARCHIVES OF INTERNAL MEDICINE, 1996, 156 (02) :209-213
[4]
Components-of-variance models and multiple-bootstrap experiments: An alternative method for random-effects, receiver operating characteristic analysis [J].
Beiden, SV ;
Wagner, RF ;
Campbell, G .
ACADEMIC RADIOLOGY, 2000, 7 (05) :341-349
[5]
Components-of-variance models for random-effects ROC analysis: The case of unequal variance structures across modalities [J].
Beiden, SV ;
Wagner, RF ;
Campbell, G ;
Metz, CE ;
Jiang, YL .
ACADEMIC RADIOLOGY, 2001, 8 (07) :605-615
[6]
Analysis of uncertainties in estimates of components of variance in multivariate ROC analysis [J].
Beiden, SV ;
Wagner, RF ;
Campbell, G ;
Chan, HP .
ACADEMIC RADIOLOGY, 2001, 8 (07) :616-622
[7]
Inter- and intraobserver variability in the evaluation of dynamic breast cancer MRI [J].
Beresford, Mark J. ;
Padhani, Anwar R. ;
Taylor, N. Jane ;
Ah-See, Mei-Lin ;
Stirling, J. James ;
Makris, Andreas ;
d'Arcy, James A. ;
Collins, David J. .
JOURNAL OF MAGNETIC RESONANCE IMAGING, 2006, 24 (06) :1316-1325
[8]
COMPARING THE AREAS UNDER 2 OR MORE CORRELATED RECEIVER OPERATING CHARACTERISTIC CURVES - A NONPARAMETRIC APPROACH [J].
DELONG, ER ;
DELONG, DM ;
CLARKEPEARSON, DI .
BIOMETRICS, 1988, 44 (03) :837-845
[9]
RECEIVER OPERATING CHARACTERISTIC RATING ANALYSIS - GENERALIZATION TO THE POPULATION OF READERS AND PATIENTS WITH THE JACKKNIFE METHOD [J].
DORFMAN, DD ;
BERBAUM, KS ;
METZ, CE .
INVESTIGATIVE RADIOLOGY, 1992, 27 (09) :723-731
[10]
Context bias - A problem in diagnostic radiology [J].
Egglin, TKP ;
Feinstein, AR .
JAMA-JOURNAL OF THE AMERICAN MEDICAL ASSOCIATION, 1996, 276 (21) :1752-1755