Indexes for Three-Class Classification Performance Assessment-An Empirical Comparison

Cited by: 19

Authors:

Sampat, Mehul P. [1]

Patel, Amit C. [2]

Wang, Yuhling [3]

Gupta, Shalini [4]

Kan, Chih-Wen [5]

Bovik, Alan C. [4]

Markey, Mia K. [5]

Affiliations:

[1] Brigham & Womens Hosp, Ctr Neurol Imaging, Dept Radiol, Boston, MA 02115 USA

[2] Univ Texas SW Med Ctr Dallas, Dallas, TX 75390 USA

[3] Dept Biomed Engn, Charlottesville, VA 22908 USA

[4] Univ Texas Austin, Dept Elect & Comp Engn, Austin, TX 78712 USA

[5] Univ Texas Austin, Dept Biomed Engn, Austin, TX 78712 USA

Source:

IEEE TRANSACTIONS ON INFORMATION TECHNOLOGY IN BIOMEDICINE | 2009 / Vol. 13 / Issue 03

Keywords:

Classification evaluation; ideal observer analysis; three-class receiver operating characteristic (ROC); volume under surface (VUS); POLARIZED REFLECTANCE SPECTROSCOPY; ROC SURFACE; DECISION; VOLUME;

DOI:

10.1109/TITB.2008.2009440

Chinese Library Classification (CLC) number:

TP [Automation technology, computer technology];

Discipline classification code:

0812 ;

Abstract:

Assessment of classifier performance is critical for fair comparison of methods, including considering alternative models or parameters during system design. The assessment must not only provide meaningful data on the classifier efficacy, but it must do so in a concise and clear manner. For two-class classification problems, receiver operating characteristic (ROC) analysis provides a clear and concise assessment methodology for reporting performance and comparing competing systems. However, many important biomedical questions cannot be posed as "two-class" classification tasks, and more than two classes are often necessary. While several methods have been proposed for assessing the performance of classifiers for such multiclass problems, none has been widely accepted. The purpose of this paper is to critically review methods that have been proposed for assessing multiclass classifiers. A number of these methods provide a classifier performance index called the volume under surface (VUS). Empirical comparisons are carried out using four three-class case studies, in which three popular classification techniques are evaluated with these methods. Since the same classifier was assessed using multiple performance indexes, it is possible to gain insight into the relative strengths and weaknesses of the measures. We conclude that: 1) the method proposed by Scurfield provides the most detailed description of classifier performance and insight about the sources of error in a given classification task and 2) the methods proposed by He and Nakas also have great practical utility as they provide both the VUS and an estimate of the variance of the VUS. These estimates can be used to statistically compare two classification algorithms.
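To make the VUS index mentioned in the abstract concrete, the following is a minimal sketch of one commonly used nonparametric estimate, under assumptions that are not stated in this record: the classifier emits a single scalar score, the three classes have a natural order (low, mid, high), and ties are counted as incorrect. The function name `empirical_vus` and its arguments are hypothetical, and the sketch is not claimed to reproduce any of the specific methods compared in the paper.

```python
from itertools import product


def empirical_vus(scores_low, scores_mid, scores_high):
    """Fraction of (low, mid, high) score triples that are strictly increasing.

    Assumes one scalar score per case and a natural class ordering;
    ties are treated as incorrectly ordered (a simplifying assumption).
    """
    total = len(scores_low) * len(scores_mid) * len(scores_high)
    if total == 0:
        raise ValueError("each class needs at least one score")
    # Count every triple, one case drawn from each class, that is ordered correctly.
    ordered = sum(
        1
        for a, b, c in product(scores_low, scores_mid, scores_high)
        if a < b < c
    )
    return ordered / total


if __name__ == "__main__":
    # Perfectly separated classes give a VUS of 1.0.
    print(empirical_vus([1, 2], [3, 4], [5, 6]))
    # Overlapping classes give a lower value.
    print(empirical_vus([1, 5], [3], [4, 6]))
```

Under these assumptions, continuous scores that carry no class information order a triple correctly with probability 1/6, so 1/6 rather than 1/2 plays the role of the chance level. The triple loop costs the product of the three class sizes, which is acceptable for small case studies but would need a sorted or vectorized implementation for large samples.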


Pages: 300-312

Page count: 13
