Inter-Coder Agreement for Computational Linguistics

Cited by: 792
Authors
Artstein, Ron
Poesio, Massimo [1, 2]
Affiliations
[1] Univ Essex, Dept Comp & Elect Syst, Colchester CO4 3SQ, Essex, England
[2] Univ Trento, CIMeC, I-38068 Rovereto, TN, Italy
DOI
10.1162/coli.07-034-R2
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
This article is a survey of methods for measuring agreement among corpus annotators. It exposes the mathematics and underlying assumptions of agreement coefficients, covering Krippendorff's alpha as well as Scott's pi and Cohen's kappa; discusses the use of coefficients in several annotation tasks; and argues that weighted, alpha-like coefficients, traditionally less used than kappa-like measures in computational linguistics, may be more appropriate for many corpus annotation tasks, but that their use makes the interpretation of the value of the coefficient even harder.
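As an illustration of the kind of coefficient the abstract refers to, the following is a minimal sketch of Cohen's kappa and Scott's pi for two coders labeling the same items. It is not taken from the article: the function names and the toy labels are assumptions made for this example, and it relies only on the standard definitions, in which both coefficients are (A_o - A_e) / (1 - A_e) and differ only in how the expected agreement A_e is estimated.

```python
from collections import Counter


def observed_agreement(a, b):
    """Proportion of items on which the two coders chose the same label."""
    return sum(x == y for x, y in zip(a, b)) / len(a)


def cohen_kappa(a, b):
    """Cohen's kappa: expected agreement uses each coder's own label distribution."""
    n = len(a)
    a_o = observed_agreement(a, b)
    count_a, count_b = Counter(a), Counter(b)
    labels = count_a.keys() | count_b.keys()
    a_e = sum(count_a[k] * count_b[k] for k in labels) / (n * n)
    # Undefined (division by zero) when a_e == 1, i.e. both coders use one label only.
    return (a_o - a_e) / (1 - a_e)


def scott_pi(a, b):
    """Scott's pi: expected agreement uses a single distribution pooled over both coders."""
    n = len(a)
    a_o = observed_agreement(a, b)
    pooled = Counter(a) + Counter(b)
    a_e = sum((c / (2 * n)) ** 2 for c in pooled.values())
    # Undefined (division by zero) when a_e == 1.
    return (a_o - a_e) / (1 - a_e)


# Hypothetical toy annotations, for illustration only.
coder_a = ["yes", "yes", "no", "no"]
coder_b = ["yes", "no", "no", "no"]
print(cohen_kappa(coder_a, coder_b))
print(scott_pi(coder_a, coder_b))
```

On this toy data the two coders agree on three of four items, yet the two coefficients differ because the coders' individual label distributions differ; kappa and pi coincide only when both coders use each label equally often.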
Pages: 555-596
Page count: 42