Disadvantages of using the area under the receiver operating characteristic curve to assess imaging tests: A discussion and proposal for an alternative approach

Cited by: 169
Authors
Halligan, Steve [1]
Altman, Douglas G. [2]
Mallett, Susan [3]
Affiliations
[1] UCL, Univ Coll Hosp, Ctr for Med Imaging, London NW1 2BU, England
[2] Univ Oxford, Ctr Stat Med, Oxford, England
[3] Univ Oxford, Dept Primary Care Hlth Sci, Oxford, England
Keywords
ROC curve; Sensitivity and specificity; Area under curve; Data interpretation, statistical; CT colonography; ROC CURVE; MAMMOGRAPHY; PERFORMANCE; SYSTEMS
DOI
10.1007/s00330-014-3487-0
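Chinese Library Classification (CLC) number
R8 [Special medicine]; R445 [Diagnostic imaging]
Discipline classification code
100231 [Clinical pathology]; 100902 [Aerospace medicine]
Abstract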
The objectives are to describe the disadvantages of the area under the receiver operating characteristic curve (ROC AUC) to measure diagnostic test performance and to propose an alternative based on net benefit. We use a narrative review supplemented by data from a study of computer-assisted detection for CT colonography.

We identified problems with ROC AUC. Confidence scoring by readers was highly non-normal, and score distribution was bimodal. Consequently, ROC curves were highly extrapolated with AUC mostly dependent on areas without patient data. AUC depended on the method used for curve fitting. ROC AUC does not account for prevalence or different misclassification costs arising from false-negative and false-positive diagnoses. Change in ROC AUC has little direct clinical meaning for clinicians. An alternative analysis based on net benefit is proposed, based on the change in sensitivity and specificity at clinically relevant thresholds. Net benefit incorporates estimates of prevalence and misclassification costs, and it is clinically interpretable since it reflects changes in correct and incorrect diagnoses when a new diagnostic test is introduced.

ROC AUC is most useful in the early stages of test assessment whereas methods based on net benefit are more useful to assess radiological tests where the clinical context is known. Net benefit is more useful for assessing clinical impact.

The area under the receiver operating characteristic curve (ROC AUC) measures diagnostic accuracy. Confidence scores used to build ROC curves may be difficult to assign. False-positive and false-negative diagnoses have different misclassification costs. Excessive ROC curve extrapolation is undesirable. Net benefit methods may provide more meaningful and clinically interpretable results than ROC AUC.
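The abstract describes net benefit only in words: a change in sensitivity and specificity at a chosen threshold, combined with prevalence and misclassification costs. The sketch below illustrates one generic decision-analytic way such a quantity could be computed. It is an assumption made for illustration, not necessarily the formula used in the paper; the function name `net_benefit` and the parameter `fp_per_fn` (how many extra false positives are judged as harmful as one missed diagnosis) are hypothetical.

```python
def net_benefit(delta_sensitivity, delta_specificity, prevalence, fp_per_fn):
    """Illustrative weighted net benefit per patient tested.

    Assumed formulation (not taken from the source abstract):
        prevalence * delta_sensitivity
        + (1 - prevalence) * delta_specificity / fp_per_fn

    delta_sensitivity, delta_specificity: change (new test minus old test)
        at one clinically relevant threshold, as proportions.
    prevalence: proportion of tested patients who have the disease.
    fp_per_fn: hypothetical weighting; number of additional false positives
        judged equivalent in harm to one false negative.
    """
    if not 0.0 <= prevalence <= 1.0:
        raise ValueError("prevalence must lie in [0, 1]")
    if fp_per_fn <= 0:
        raise ValueError("fp_per_fn must be positive")
    # Gain (or loss) in correctly detected disease, scaled by how common disease is.
    gain_in_diseased = prevalence * delta_sensitivity
    # Gain (or loss) in correctly cleared patients, down-weighted by relative cost.
    gain_in_healthy = (1.0 - prevalence) * delta_specificity / fp_per_fn
    return gain_in_diseased + gain_in_healthy


# Made-up inputs: sensitivity rises by 0.10, specificity falls by 0.05,
# prevalence is 20%, and three false positives are weighted like one false negative.
example = net_benefit(0.10, -0.05, prevalence=0.20, fp_per_fn=3)
print(f"Net benefit per 1000 patients: {1000 * example:.1f}")
```

Under these made-up inputs the result is positive, meaning the sensitivity gain outweighs the specificity loss once prevalence and the assumed cost weighting are applied. Unlike a change in ROC AUC, the same pair of sensitivity and specificity changes can yield a negative value at a lower prevalence or a harsher weighting of false positives.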
Pages: 932-939
Page count: 8