Linked hypernyms: Enriching DBpedia with Targeted Hypernym Discovery

Cited by: 20
Author
Kliegr, Tomas [1, 2]
Affiliations
[1] Univ Econ, Dept Informat & Knowledge Engn, Fac Informat & Stat, Prague 13067, Czech Republic
[2] Univ London, Multimedia & Vis Res Grp, London E1 4NS, England
Source
JOURNAL OF WEB SEMANTICS | 2015 / Vol. 31
Keywords
DBpedia; Hearst patterns; Hypernym; Linked data; YAGO; Wikipedia; Type inference
DOI
10.1016/j.websem.2014.11.001
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
The Linked Hypernyms Dataset (LHD) provides entities described by Dutch, English and German Wikipedia articles with types in the DBpedia namespace. The types are extracted from the first sentences of Wikipedia articles using Hearst pattern matching over part-of-speech annotated text and disambiguated to DBpedia concepts. The dataset contains 1.3 million RDF type triples from English Wikipedia, of which 1 million were found not to overlap with DBpedia, and 0.4 million with YAGO2s. About 770 thousand German and 650 thousand Dutch Wikipedia entities are assigned a novel type, which exceeds the number of entities in the localized DBpedia for the respective language. RDF type triples from the German dataset have been incorporated into the German DBpedia. Quality assessment was based on a total of 16,500 human ratings and annotations. The average accuracy is 0.86 for the English dataset, 0.77 for German and 0.88 for Dutch. The accuracy of raw plain text hypernyms exceeds 0.90 for all languages. The LHD release described and evaluated in this article targets DBpedia 3.8; an LHD version for DBpedia 3.9, containing approximately 4.5 million RDF type triples, is also available. © 2014 The Author. Published by Elsevier B.V.
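The extraction step named in the abstract (a Hearst-style "X is a Y" pattern applied to the part-of-speech tagged first sentence, followed by mapping the hypernym to a DBpedia concept) can be illustrated with a minimal Python sketch. This is not the LHD implementation: the function names `extract_hypernym` and `type_triple`, the Penn Treebank tag set, the single copula pattern, and the naive capitalize-the-word mapping to a DBpedia resource URI are all assumptions made for illustration. The input is assumed to be already tokenized and tagged.

```python
# Illustrative sketch only; not the method or code used to build LHD.
# Assumptions: Penn Treebank POS tags, one "X is a Y" pattern, and a
# naive hypernym-to-URI mapping in place of real disambiguation.

RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
DBPEDIA_RESOURCE = "http://dbpedia.org/resource/"

COPULAS = {"is", "was", "are", "were"}
MODIFIER_TAGS = {"DT", "JJ", "JJR", "JJS", "RB", "CD"}


def extract_hypernym(tagged_sentence):
    """Return the head noun of the first noun phrase after a copula.

    tagged_sentence is a list of (token, POS tag) pairs.
    Returns None when no copula or no following noun is found.
    """
    # Locate the first copular verb, the pivot of the "X is a Y" pattern.
    start = None
    for i, (token, tag) in enumerate(tagged_sentence):
        if token.lower() in COPULAS and tag.startswith("VB"):
            start = i + 1
            break
    if start is None:
        return None

    head = None
    for token, tag in tagged_sentence[start:]:
        if head is None and tag in MODIFIER_TAGS:
            continue  # skip determiners and modifiers before the noun
        if tag.startswith("NN"):
            head = token  # the last noun of a consecutive run is the head
            continue
        break  # the noun phrase has ended (or never started)
    return head


def type_triple(entity_uri, hypernym):
    """Format an rdf:type statement as one N-Triples line.

    The hypernym is mapped to a DBpedia resource by capitalizing it,
    which is a placeholder for a real disambiguation step.
    """
    type_uri = DBPEDIA_RESOURCE + hypernym.capitalize()
    return f"<{entity_uri}> <{RDF_TYPE}> <{type_uri}> ."
```

For a tagged sentence such as `[("Prague", "NNP"), ("is", "VBZ"), ("the", "DT"), ("capital", "NN"), ("of", "IN"), ...]`, `extract_hypernym` returns `"capital"`, which `type_triple` then turns into a single RDF type statement. A single-pattern matcher like this misses many sentence forms; it only shows the shape of the pipeline the abstract describes.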
Pages: 59-69
Page count: 11