Comparison of classification accuracy using Cohen's Weighted Kappa

Cited by: 230
Authors
Ben-David, Arie [1]
Affiliations
[1] Holon Inst Technol, Dept Technol Management, Holon, Israel
Keywords
Weighted Cohen's Kappa; sensitivity analysis; cost-sensitive classification; ordinal data sets; expert systems; machine learning
DOI
10.1016/j.eswa.2006.10.022
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Many expert systems solve classification problems. When comparing the accuracy of such classifiers, the cost of error must frequently be taken into account. In such cost-sensitive applications, using the percentage of misses as the sole measure of accuracy can be misleading. Typical examples of such problems are medical and military applications, as well as data sets with an ordinal (i.e., ordered) class. A new methodology is proposed here for assessing classifier accuracy. The approach is based on Cohen's Kappa statistic, which compensates for classifications that may be due to chance. The use of Kappa is proposed as a standard measure of accuracy for all multi-valued classification problems. Weighted Kappa makes it possible to deal effectively with cost-sensitive classification. When the cost of error is unknown and can only be roughly estimated, the use of sensitivity analysis with Weighted Kappa is highly recommended. © 2006 Elsevier Ltd. All rights reserved.
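The idea in the abstract can be illustrated with a minimal sketch, which is not taken from the paper itself. It assumes the standard disagreement-weighted form of Cohen's Weighted Kappa, kappa_w = 1 - (sum of w_ij * observed_ij) / (sum of w_ij * expected_ij), where the expected proportions come from the row and column marginals of the confusion matrix. The function names, the two confusion matrices, and the cost exponents are hypothetical and chosen only for illustration.

```python
def zero_one_weights(k):
    """Every error costs 1; this reduces Weighted Kappa to plain Kappa."""
    return [[0.0 if i == j else 1.0 for j in range(k)] for i in range(k)]


def power_weights(k, p):
    """Hypothetical ordinal cost: |i - j| ** p for p > 0 (p=1 linear, p=2 quadratic)."""
    return [[float(abs(i - j)) ** p for j in range(k)] for i in range(k)]


def weighted_kappa(confusion, weights):
    """Weighted Kappa from a square confusion matrix (rows: actual, columns: predicted).

    Assumes at least two classes occur, so the expected disagreement is non-zero.
    """
    k = len(confusion)
    n = sum(sum(row) for row in confusion)
    row_tot = [sum(row) for row in confusion]
    col_tot = [sum(confusion[i][j] for i in range(k)) for j in range(k)]
    observed = sum(weights[i][j] * confusion[i][j]
                   for i in range(k) for j in range(k)) / n
    expected = sum(weights[i][j] * row_tot[i] * col_tot[j]
                   for i in range(k) for j in range(k)) / (n * n)
    return 1.0 - observed / expected


# Two hypothetical 3-class ordinal classifiers with the same hit rate (30 of 50).
# Classifier A only confuses adjacent classes; classifier B also makes distant errors.
clf_a = [[10, 5, 0], [5, 10, 5], [0, 5, 10]]
clf_b = [[10, 0, 5], [5, 10, 5], [5, 0, 10]]

# Sensitivity analysis: vary the assumed cost exponent and watch the ranking.
for p in (0.5, 1.0, 2.0):
    w = power_weights(3, p)
    print(f"p={p}: A={weighted_kappa(clf_a, w):.3f}  B={weighted_kappa(clf_b, w):.3f}")
```

If the ranking of the two classifiers stays the same across the plausible range of cost assumptions, the comparison can be considered robust to the uncertainty in the cost of error; if it flips, the cost estimate needs more care.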
Pages: 825-832
Page count: 8