The meaning of kappa: Probabilistic concepts of reliability and validity revisited

Cited by: 90
Author
Guggenmoos-Holzmann, I
Affiliations
[1] Inst. of Med. Stat. and Info. Sci., Freie Universität Berlin
[2] Inst. f. Med. Stat. und I., Universitätsklinikum Benjamin F., Freie Universität Berlin, D-12200 Berlin
Keywords
diagnostic test; reliability; validity; kappa; chance-corrected agreement; chance-corrected validity;
DOI
10.1016/0895-4356(96)00011-X
Chinese Library Classification
R19 [Health care organization and services (health services management)];
Abstract
A framework, the "agreement concept," is developed to study the use of Cohen's kappa as well as alternative measures of chance-corrected agreement in a unified manner. Focusing on intrarater consistency, it is demonstrated that for 2 × 2 tables an adequate choice among different measures of chance-corrected agreement can be made only if the characteristics of the observational setting are taken into account. In particular, naive use of Cohen's kappa may lead to strikingly overoptimistic estimates of chance-corrected agreement. Such bias can be overcome by more elaborate study designs that allow for unrestricted estimation of the probabilities at issue. When Cohen's kappa is appropriately applied as a measure of chance-corrected agreement, its values prove to be a linear, not a parabolic, function of true prevalence. It is further shown how the validity of ratings is influenced by lack of consistency. Depending on the design of a validity study, this may lead, on purely formal grounds, to prevalence-dependent estimates of sensitivity and specificity. Proposed formulas for "chance-corrected" validity indexes fail to adjust for this phenomenon.
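For orientation, the coefficient at issue is Cohen's kappa; a minimal sketch of its standard textbook definition for a 2 × 2 agreement table (Cohen, 1960) is given below. The notation p_o, p_e, and p_{ij} is generic and is not taken from the paper's "agreement concept" framework or its specific estimators.

\[
\kappa \;=\; \frac{p_o - p_e}{1 - p_e},
\qquad
p_o = p_{11} + p_{00},
\qquad
p_e = p_{1\cdot}\,p_{\cdot 1} + p_{0\cdot}\,p_{\cdot 0},
\]

where p_{ij} is the probability of cell (i, j) in the 2 × 2 table of two ratings of the same subject, p_{i\cdot} and p_{\cdot j} are its row and column marginals, p_o is the observed agreement, and p_e is the agreement expected by chance under independent marginals.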
Pages: 775-782
Number of pages: 8