The Proposition Bank: An annotated corpus of semantic roles

Cited by: 824
Authors
Palmer, M
Kingsbury, P
Gildea, D
Affiliations
[1] Univ Penn, Dept Comp & Informat Sci, Philadelphia, PA 19104 USA
[2] Univ Rochester, Dept Comp Sci, Rochester, NY 14627 USA
DOI
10.1162/0891201053630264
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
The Proposition Bank project takes a practical approach to semantic representation, adding a layer of predicate-argument information, or semantic role labels, to the syntactic structures of the Penn Treebank. The resulting resource can be thought of as shallow, in that it does not represent coreference, quantification, and many other higher-order phenomena, but also broad, in that it covers every instance of every verb in the corpus and allows representative statistics to be calculated. We discuss the criteria used to define the sets of semantic roles used in the annotation process and to analyze the frequency of syntactic/semantic alternations in the corpus. We describe an automatic system for semantic role tagging trained on the corpus and discuss the effect on its performance of various types of information, including a comparison of full syntactic parsing with a flat representation and the contribution of the empty "trace" categories of the treebank.
Pages: 71-105
Number of pages: 35