Inference procedures for assessing interobserver agreement among multiple raters

Cited: 26
Authors
Altaye, M [1]
Donner, A
Klar, N
Affiliations
[1] Childrens Hosp Kings Daughters, Ctr Pediat Res, Eastern Virginia Med Sch, Norfolk, VA 23510 USA
[2] Univ Western Ontario, Dept Epidemiol & Biostat, London, ON N6A 5C1, Canada
[3] Dana Farber Canc Inst, Dept Biostat Sci, Boston, MA 02115 USA
Keywords
confidence interval; interobserver agreement; multiple rater; sample size; type I error
DOI
10.1111/j.0006-341X.2001.00584.x
Chinese Library Classification (CLC)
Q [Biological Sciences]
Discipline codes
07; 0710; 09
Abstract
We propose a new procedure for constructing inferences about a measure of interobserver agreement in studies involving a binary outcome and multiple raters. The proposed procedure, based on a chi-square goodness-of-fit test as applied to the correlated binomial model (Bahadur, 1961, in Studies in Item Analysis and Prediction, 158-176), is an extension of the goodness-of-fit procedure developed by Donner and Eliasziw (1992, Statistics in Medicine 11, 1511-1519) for the case of two raters. The new procedure is shown to provide confidence-interval coverage levels that are close to nominal over a wide range of parameter combinations. The procedure also provides a sample-size formula that may be used to determine the required number of subjects and raters for such studies.
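For orientation, the agreement measure the abstract refers to can be illustrated with the standard large-sample kappa-type point estimator for n subjects each classified by m raters on a binary outcome. This is a minimal sketch under the common-correlation model; the function name and example data are hypothetical, and it is the point estimate only, not the paper's goodness-of-fit confidence-interval or sample-size procedure.

```python
def kappa_binary(positive_counts, m):
    """Kappa-type agreement estimate for binary ratings by multiple raters.

    positive_counts[i] = number of the m raters classifying subject i positive.
    (Hedged sketch: standard large-sample estimator, not the paper's CI method.)
    """
    n = len(positive_counts)
    p = sum(positive_counts) / (n * m)  # overall prevalence of a positive rating
    if p == 0.0 or p == 1.0:
        raise ValueError("kappa is undefined when every rating is identical")
    # Total number of disagreeing rater pairs across subjects:
    disagree = sum(x * (m - x) for x in positive_counts)
    # Kappa = 1 - observed pairwise disagreement / disagreement expected by chance
    return 1.0 - disagree / (n * m * (m - 1) * p * (1.0 - p))

# Example: 5 subjects, 3 raters each; subjects 1, 3, 5 rated positive unanimously
counts = [3, 0, 3, 1, 3]
print(round(kappa_binary(counts, m=3), 3))  # → 0.7
```

With all raters agreeing on four of the five subjects, the estimate is well above zero, matching the intuition that agreement greatly exceeds chance.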
Pages: 584-588
Page count: 5
References
14 in total
[1] Bahadur RR. In: Studies in Item Analysis and Prediction (Stanford Mathematical Studies in the Social Sciences, Vol. 6), 1961: 158-176.
[2] Bloch DA, Kraemer HC. 2×2 kappa coefficients: measures of agreement or association. Biometrics, 1989, 45(1): 269-287.
[3] Donner A, Eliasziw M. A goodness-of-fit approach to inference procedures for the kappa statistic: confidence-interval construction, significance-testing and sample-size estimation. Statistics in Medicine, 1992, 11(11): 1511-1519.
[4] Fisher RA. Statistical Methods for Research Workers. 1958.
[5] Fleiss JL. Statistical Methods for Rates and Proportions. 1981.
[6] George EO, Bowman D. Full likelihood procedure for analyzing exchangeable binary data. Biometrics, 1995, 51(2): 512-523.
[7] Hale CA, Fleiss JL. Interval estimation under two study designs for kappa with binary classifications. Biometrics, 1993, 49(2): 523-534.
[8] Kraemer HC. How many raters? Toward the most reliable diagnostic consensus. Statistics in Medicine, 1992, 11(3): 317-331.
[9] Landis JR, Koch GG. The measurement of observer agreement for categorical data. Biometrics, 1977, 33(1): 159-174.