Support Vector Learning for Semantic Argument Classification

Cited by: 2
Authors
Sameer Pradhan
Kadri Hacioglu
Valerie Krugler
Wayne Ward
James H. Martin
Daniel Jurafsky
Affiliations
[1] University of Colorado, The Center for Spoken Language Research
[2] Stanford University
[3] Stanford University
Source
Machine Learning | 2005 / Vol. 60
Keywords
shallow semantic parsing; support vector machines
DOI
Not available
Abstract
The natural language processing community has recently experienced a growth of interest in domain independent shallow semantic parsing, the process of assigning a Who did What to Whom, When, Where, Why, How etc. structure to plain text. This process entails identifying groups of words in a sentence that represent these semantic arguments and assigning specific labels to them. It could play a key role in NLP tasks like Information Extraction, Question Answering and Summarization. We propose a machine learning algorithm for semantic role parsing, extending the work of Gildea and Jurafsky (2002), Surdeanu et al. (2003) and others. Our algorithm is based on Support Vector Machines, which we show give a large improvement in performance over earlier classifiers. We show performance improvements through a number of new features designed to improve generalization to unseen data, such as automatic clustering of verbs. We also report on various analytic studies examining which features are most important, comparing our classifier to other machine learning algorithms in the literature, and testing its generalization to a new test set from a different genre. On the task of assigning semantic labels to the PropBank (Kingsbury, Palmer, & Marcus, 2002) corpus, our final system has a precision of 84% and a recall of 75%, which are the best results currently reported for this task. Finally, we explore a completely different architecture which does not require a deep syntactic parse. We reformulate the task as a combined chunking and classification problem, thus allowing our algorithm to be applied to new languages or genres of text for which statistical syntactic parsers may not be available.
Pages: 11-39
Page count: 28
Related papers
[1] Bikel D. M., Schwartz R., Weischedel R. M. (1999) An algorithm that learns what's in a name. Machine Learning 34, 211-231
[2] Burges C. J. C. (1998) A tutorial on support vector machines for pattern recognition. Data Mining and Knowledge Discovery 2, 121-167
[3] Gildea D., Jurafsky D. (2002) Automatic labeling of semantic roles. Computational Linguistics 28, 245-288
[4] Lodhi H., Saunders C., Shawe-Taylor J., Cristianini N., Watkins C. (2002) Text classification using string kernels. Journal of Machine Learning Research 2, 419-444
[5] Quinlan J. R. (1986) Induction of decision trees. Machine Learning 1, 81-106
[6] Wallis S., Nelson G. (2001) Knowledge discovery in grammatically analysed corpora. Data Mining and Knowledge Discovery 5, 305-335