Analysis of correlated ROC areas in diagnostic testing

Cited by: 62
Author
Song, HH
Affiliation
[1] Department of Biostatistics, Catholic University Medical College, 505 Banpo-Dong, Socho-Ku, Seoul 137-701
Keywords
ANOVA of pseudovalues; correlated ROC areas; diagnostic testing; repeated measurements; Wilcoxon-Mann-Whitney statistics
DOI
10.2307/2533123
Chinese Library Classification
Q [Biological Sciences];
Subject classification codes
07 ; 0710 ; 09 ;
Abstract
This paper focuses on methods of analysis of areas under receiver operating characteristic (ROC) curves. Analysis of ROC areas should incorporate the correlation structure of repeated measurements taken on the same set of cases and the paucity of measurements per treatment resulting from an effective summarization of cases into a few area measures of diagnostic accuracy. The repeated nature of ROC data has been taken into consideration in the analysis methods previously suggested by Swets and Pickett (1982, Evaluation of Diagnostic Systems: Methods from Signal Detection Theory), Hanley and McNeil (1983, Radiology 148, 839-843), and DeLong, DeLong, and Clarke-Pearson (1988, Biometrics 44, 837-845). DeLong et al.'s procedure is extended to a Wald test for general situations of diagnostic testing. The method of analyzing jackknife pseudovalues by treating them as data is extremely useful when the number of area measures to be tested is quite small. The Wald test based on covariances of multivariate multisample U-statistics is compared with two approaches to analyzing pseudovalues, the univariate mixed-model analysis of variance (ANOVA) for repeated measurements and the three-way factorial ANOVA. Monte Carlo simulations demonstrate that the three tests give a good approximation to the nominal size at the 5% level for large sample sizes, but the paired t-test using ROC areas as data lacks the power of the other three tests and Hanley and McNeil's method is inappropriate for testing diagnostic accuracies. The Wald statistic performs better than the ANOVAs of pseudovalues. Jackknifing schemes of multiple deletion where different structures of normal and diseased distributions are accounted for appear to perform slightly better than simple multiple-deletion schemes but no appreciable power difference is apparent, and deletion of too many cases at a time may sacrifice power.
These methods have important applications in diagnostic testing in ROC studies of radiology and of medicine in general.
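As an illustration of the kind of test the abstract describes, the following is a minimal sketch of a Wald test on correlated ROC areas, using Wilcoxon-Mann-Whitney area estimates and the nonparametric covariance estimator of DeLong et al. (1988). It is not taken from the paper: the function names (`auc_components`, `delong_covariance`, `wald_statistic`), the array layout, and the toy scores are assumptions made for this example, and the paper's own extension may differ in detail.

```python
import numpy as np

def auc_components(diseased, normal):
    """Wilcoxon-Mann-Whitney AUCs and their structural components.

    diseased: array of shape (K, m), scores of m diseased cases on K tests.
    normal:   array of shape (K, n), scores of n normal cases on K tests.
    (This layout is an assumption of the sketch.)
    """
    d = np.asarray(diseased, dtype=float)[:, :, None]  # shape (K, m, 1)
    h = np.asarray(normal, dtype=float)[:, None, :]    # shape (K, 1, n)
    # Kernel: 1 if diseased score is higher, 0.5 for ties, 0 otherwise.
    psi = (d > h) + 0.5 * (d == h)                     # shape (K, m, n)
    v10 = psi.mean(axis=2)  # per diseased case, averaged over normals
    v01 = psi.mean(axis=1)  # per normal case, averaged over diseased
    return v10.mean(axis=1), v10, v01

def delong_covariance(v10, v01):
    """Estimated covariance matrix of the K correlated AUC estimates."""
    m, n = v10.shape[1], v01.shape[1]
    s10 = np.atleast_2d(np.cov(v10))  # rows are tests, columns are cases
    s01 = np.atleast_2d(np.cov(v01))
    return s10 / m + s01 / n

def wald_statistic(auc, cov, contrast):
    """Wald statistic for H0: contrast @ auc = 0, with its degrees of freedom."""
    L = np.atleast_2d(np.asarray(contrast, dtype=float))
    diff = L @ auc
    stat = float(diff @ np.linalg.solve(L @ cov @ L.T, diff))
    return stat, L.shape[0]

# Toy data (made up for illustration): 2 tests read on the same 4 + 4 cases.
diseased = [[3, 4, 5, 6], [2, 4, 5, 7]]
normal = [[1, 2, 3, 4], [1, 3, 2, 5]]
auc, v10, v01 = auc_components(diseased, normal)
cov = delong_covariance(v10, v01)
stat, df = wald_statistic(auc, cov, [1, -1])  # compare test 1 with test 2
print(auc, stat, df)
```

Under the null hypothesis the statistic is referred to a chi-square distribution with `df` degrees of freedom; with samples as small as this toy set, that approximation should not be relied on.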
Pages: 370-382
Number of pages: 13
Related papers
36 in total
[1]  
[Anonymous], 1991, SAS SYSTEM LINEAR MO
[2]   JACKKNIFING U-STATISTICS [J].
ARVESEN, JN .
ANNALS OF MATHEMATICAL STATISTICS, 1969, 40 (06) :2076-&
[3]   AREA ABOVE ORDINAL DOMINANCE GRAPH AND AREA BELOW RECEIVER OPERATING CHARACTERISTIC GRAPH [J].
BAMBER, D .
JOURNAL OF MATHEMATICAL PSYCHOLOGY, 1975, 12 (04) :387-415
[4]   ESTIMATING PR(X-LESS-THAN-Y) IN CATEGORIZED DATA USING ROC ANALYSIS [J].
BROWNIE, C .
BIOMETRICS, 1988, 44 (02) :615-621
[5]  
Cox D.R., 1974, THEORETICAL STAT
[6]  
Crowder MJ, 1990, ANAL REPEATED MEASUR
[7]   COMPARING THE AREAS UNDER 2 OR MORE CORRELATED RECEIVER OPERATING CHARACTERISTIC CURVES - A NONPARAMETRIC APPROACH [J].
DELONG, ER ;
DELONG, DM ;
CLARKEPEARSON, DI .
BIOMETRICS, 1988, 44 (03) :837-845
[8]   MAXIMUM-LIKELIHOOD ESTIMATION OF PARAMETERS OF SIGNAL-DETECTION THEORY AND DETERMINATION OF CONFIDENCE INTERVALS - RATING-METHOD DATA [J].
DORFMAN, DD ;
ALF, E .
JOURNAL OF MATHEMATICAL PSYCHOLOGY, 1969, 6 (03) :487-&
[9]   RECEIVER OPERATING CHARACTERISTIC RATING ANALYSIS - GENERALIZATION TO THE POPULATION OF READERS AND PATIENTS WITH THE JACKKNIFE METHOD [J].
DORFMAN, DD ;
BERBAUM, KS ;
METZ, CE .
INVESTIGATIVE RADIOLOGY, 1992, 27 (09) :723-731
[10]   RSCORE-J - POOLED RATING-METHOD DATA - A COMPUTER-PROGRAM FOR ANALYZING POOLED ROC CURVES [J].
DORFMAN, DD ;
BERBAUM, KS .
BEHAVIOR RESEARCH METHODS INSTRUMENTS & COMPUTERS, 1986, 18 (05) :452-462