Experiments on the automatic induction of German semantic verb classes

Cited by: 63
Authors
Schulte im Walde, Sabine [1]
Affiliations
[1] Univ Saarland, D-6600 Saarbrucken, Germany
DOI
10.1162/coli.2006.32.2.159
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
This article presents clustering experiments on German verbs: A statistical grammar model for German serves as the source for a distributional verb description at the lexical syntax-semantics interface, and the unsupervised clustering algorithm k-means uses the empirical verb properties to perform an automatic induction of verb classes. Various evaluation measures are applied to compare the clustering results to gold standard German semantic verb classes under different criteria. The primary goals of the experiments are (1) to empirically utilize and investigate the well-established relationship between verb meaning and verb behavior within a cluster analysis and (2) to investigate the required technical parameters of a cluster analysis with respect to this specific linguistic task. The clustering methodology is developed on a small-scale verb set and then applied to a larger-scale verb set including 883 German verbs.
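As a rough illustration of the kind of clustering the abstract describes, the following minimal sketch runs plain k-means over toy verb descriptions. Everything specific here is an assumption made for the example and not taken from the article: the four verbs, the three frame types, the probability values, the choice of Euclidean distance, and the initial centroids.

```python
import math

def kmeans(points, initial_centroids, max_iter=100):
    """Plain k-means (Lloyd's algorithm) using Euclidean distance."""
    centroids = [list(c) for c in initial_centroids]  # copy, do not mutate input
    assignment = None
    for _ in range(max_iter):
        # Assignment step: each point goes to its nearest centroid.
        new_assignment = [
            min(range(len(centroids)), key=lambda c: math.dist(p, centroids[c]))
            for p in points
        ]
        if new_assignment == assignment:  # converged: no point changed cluster
            break
        assignment = new_assignment
        # Update step: each centroid becomes the mean of its members.
        for c in range(len(centroids)):
            members = [p for p, a in zip(points, assignment) if a == c]
            if members:
                centroids[c] = [sum(dim) / len(members) for dim in zip(*members)]
    return assignment

# Hypothetical distributions over three frame types
# (intransitive, transitive, sentential complement); values are invented.
verbs = {
    "laufen":  [0.8, 0.1, 0.1],
    "gehen":   [0.7, 0.2, 0.1],
    "sagen":   [0.1, 0.2, 0.7],
    "glauben": [0.1, 0.3, 0.6],
}

names = list(verbs)
points = [verbs[n] for n in names]
# Assumed initialisation: seed the two clusters with the first and third verb.
labels = kmeans(points, [points[0], points[2]])

for name, label in zip(names, labels):
    print(name, label)
```

With this toy input the two verbs that favour the intransitive frame end up in one cluster and the two that favour sentential complements in the other. The outcome of k-means depends on the distance measure and the initial centroids, so other settings can produce different groupings.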
Pages: 159-194
Page count: 36
Related papers
54 in total
[11]   Similarity-based models of word cooccurrence probabilities [J].
Dagan, I ;
Lee, L ;
Pereira, FCN .
MACHINE LEARNING, 1999, 34 (1-3) :43-69
[12]   Dimensionality reduction of unsupervised data [J].
Dash, M ;
Liu, H ;
Yao, L .
NINTH IEEE INTERNATIONAL CONFERENCE ON TOOLS WITH ARTIFICIAL INTELLIGENCE, PROCEEDINGS, 1997, :532-539
[13]  
DORR BJ, 1996, P 16 INT C COMP LING, P322
[14]  
Dorr B. J., 1997, Machine Translation, V12, P271, DOI 10.1023/A:1007965530302
[15]  
ERK K, 2003, P 5 INT WORKSH COMP
[16]  
Fellbaum C, 1998, WORDNET ELECT LEXICA
[17]  
FILLMORE CJ, 1977, LINGUISTIC STRUCTURE, V59
[18]  
Fillmore C. J., 1982, Linguistics in the Morning Calm, P111
[19]  
FONTENELLE T, 2003, FRAMENET FRAME SEMAN, V16
[20]  
FORGY EW, 1965, BIOMETRICS, V21, P768