Researchers assessing interrater agreement for ratings of a single target have increasingly used the r_WG(J) index, but have found that it can display irregular behavior. Mathematical analyses show that this problem arises from the use of random response, operationalized as the variance of a uniform distribution (s²_EU), as the baseline of comparison. These analyses suggest that researchers should continue to use r_WG(J) as a summary measure of interrater agreement, but should use maximum dissensus as the reference distribution when computing r_WG(J). Although values of s²_EU can be descriptively misleading, they provide an important inferential baseline. Thus, s²_EU should be used in computing χ² tests of the departure of the observed response variance from random responding. Researchers should also examine interrater agreement as a theoretical variable in its own right, investigating the causes and consequences of rater dissensus.