Geographic knowledge extraction and semantic similarity in OpenStreetMap

Cited by: 84
Authors
Ballatore, Andrea [1]
Bertolotto, Michela [1]
Wilson, David C. [2]
Affiliations
[1] Univ Coll Dublin, Sch Comp Sci & Informat, Dublin 4, Ireland
[2] Univ N Carolina, Dept Software & Informat Syst, Charlotte, NC 28223 USA
Funding
Science Foundation Ireland;
Keywords
Semantic similarity; OpenStreetMap; Volunteered Geographic Information; OSM Semantic Network; SimRank; P-Rank; Co-citation; Crowdsourcing; CORRELATION-COEFFICIENTS; METAANALYSIS; INFORMATION; WEB;
DOI
10.1007/s10115-012-0571-0
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
In recent years, a web phenomenon known as Volunteered Geographic Information (VGI) has produced large crowdsourced geographic data sets. OpenStreetMap (OSM), the leading VGI project, aims at building an open-content world map through user contributions. OSM semantics consists of a set of properties (called 'tags') describing geographic classes, whose usage is defined by project contributors on a dedicated Wiki website. Because of its simple and open semantic structure, the OSM approach often results in noisy and ambiguous data, limiting its usability for analysis in information retrieval, recommender systems and data mining. Devising a mechanism for computing the semantic similarity of the OSM geographic classes can help alleviate this semantic gap. The contribution of this paper is twofold. It consists of (1) the development of the OSM Semantic Network by means of a web crawler tailored to the OSM Wiki website; this semantic network can be used to compute semantic similarity through co-citation measures, providing a novel semantic tool for OSM and GIS communities; (2) a study of the cognitive plausibility (i.e. the ability to replicate human judgement) of co-citation algorithms when applied to the computation of semantic similarity of geographic concepts. Empirical evidence supports the usage of co-citation algorithms (SimRank showing the highest plausibility) to compute concept similarity in a crowdsourced semantic network.
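To illustrate the kind of co-citation measure the abstract refers to, the following is a minimal sketch of the standard SimRank recursion on a toy link graph. It is not the authors' implementation: the page names, the `in_links` structure, the decay value 0.8, and the iteration count are all assumptions chosen for illustration, and the graph does not come from the OSM Semantic Network.

```python
from itertools import product


def simrank(in_links, decay=0.8, iterations=10):
    """Naive SimRank over a directed graph.

    in_links maps every node to the list of nodes that link to it.
    Every node that appears as a linker must also be a key of in_links.
    decay and iterations are illustrative defaults, not values from the paper.
    """
    nodes = list(in_links)
    # Start from the identity: a node is fully similar only to itself.
    sim = {(a, b): 1.0 if a == b else 0.0 for a, b in product(nodes, nodes)}
    for _ in range(iterations):
        new_sim = {}
        for a, b in product(nodes, nodes):
            if a == b:
                new_sim[(a, b)] = 1.0
                continue
            in_a, in_b = in_links[a], in_links[b]
            if not in_a or not in_b:
                # A node with no incoming links shares no evidence with others.
                new_sim[(a, b)] = 0.0
                continue
            # Average similarity over all pairs of in-neighbours, scaled by decay.
            total = sum(sim[(i, j)] for i in in_a for j in in_b)
            new_sim[(a, b)] = decay * total / (len(in_a) * len(in_b))
        sim = new_sim
    return sim


# Hypothetical wiki pages: two key pages linking to three tag pages.
in_links = {
    "key:amenity": [],
    "key:waterway": [],
    "tag:amenity=pub": ["key:amenity"],
    "tag:amenity=bar": ["key:amenity"],
    "tag:waterway=river": ["key:waterway"],
}

scores = simrank(in_links)
print(scores[("tag:amenity=pub", "tag:amenity=bar")])
print(scores[("tag:amenity=pub", "tag:waterway=river")])
```

In this toy graph the two amenity pages are cited by the same page and so receive a high score, while the pub and river pages share no linking page and score zero. The pairwise loop is quadratic in the number of nodes, which is fine for a sketch but would need a sparser formulation for a crawl of real size.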
Pages: 61-81
Page count: 21