Accurate unlexicalized parsing

Cited by: 1007

Authors:

Klein, D [1]

Manning, CD [1]

Affiliations:

[1] Stanford Univ, Dept Comp Sci, Stanford, CA 94305 USA

Source:

41ST ANNUAL MEETING OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS, PROCEEDINGS OF THE CONFERENCE | 2003

DOI:

10.3115/1075096.1075150

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory]

Subject classification codes:

081104; 0812; 0835; 1405

Abstract:

We demonstrate that an unlexicalized PCFG can parse much more accurately than previously shown, by making use of simple, linguistically motivated state splits, which break down false independence assumptions latent in a vanilla treebank grammar. Indeed, its performance of 86.36% (LP/LR F-1) is better than that of early lexicalized PCFG models, and surprisingly close to the current state-of-the-art. This result has potential uses beyond establishing a strong lower bound on the maximum possible accuracy of unlexicalized models: an unlexicalized PCFG is much more compact, easier to replicate, and easier to interpret than more complex lexical models, and the parsing algorithms are simpler, more widely understood, of lower asymptotic complexity, and easier to optimize.

Pages: 423-430

Page count: 8
