Automatic building of an ontology on the basis of text corpora in Thai

Times cited: 8
Authors
Imsombut, Aurawan [1]
Kawtrakul, Asanee [1]
Affiliations
[1] Kasetsart Univ, NAiST Lab, Bangkok, Thailand
Keywords
Thai ontology learning; lexico-syntactic patterns; taxonomic list
DOI
10.1007/s10579-007-9045-5
Chinese Library Classification number
TP39 [Computer applications]
Subject classification codes
081203; 0835
Abstract
This paper presents a methodology for automatically learning ontologies from Thai text corpora by extracting terms and relations. A shallow parser chunks the texts, and taxonomic relations are then identified with the help of two kinds of cues: lexico-syntactic patterns and item lists. The main advantage of the approach is that it simplifies concept and relation labeling, since the cues help identify the ontological concepts and hint at their relations. However, these techniques pose certain problems, namely cue word ambiguity, item list identification, and numerous candidate terms. We also propose a method to solve these problems using lexicon and co-occurrence features, weighted by information gain. The precision, recall, and F-measure of the system are 0.74, 0.78, and 0.76, respectively.
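The reported scores can be cross-checked with a short calculation. This is a minimal sketch that assumes the F-measure is the balanced F1 score (the harmonic mean of precision and recall); the abstract does not state which variant was used, and the function name `f_measure` is only illustrative.

```python
def f_measure(precision: float, recall: float) -> float:
    """Balanced F1: harmonic mean of precision and recall (assumed variant)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Values reported in the abstract
precision, recall = 0.74, 0.78
print(round(f_measure(precision, recall), 2))
```

Under this assumption the result rounds to the 0.76 given in the abstract, so the three reported figures are mutually consistent.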
Pages: 137-149
Page count: 13