How much information is geospatially referenced? Networks and cognition

Cited by: 55
Authors
Hahmann, Stefan [1]
Burghardt, Dirk [1]
Affiliations
[1] Tech Univ Dresden, Inst Cartog, Dresden, Germany
Keywords
geospatial reference; geographic information retrieval; scale-free networks; cognition of geographic information; Wikipedia;
DOI
10.1080/13658816.2012.743664
Chinese Library Classification (CLC) number
TP [Automation Technology, Computer Technology];
Discipline classification code
0812;
Abstract
The aim of this article is to provide a basis in evidence for (or against) the much-quoted assertion that 80% of all information is geospatially referenced. For this purpose, two approaches are presented that are intended to capture the portion of geospatially referenced information in user-generated content: a network approach and a cognitive approach. In the network approach, the German Wikipedia is used as a research corpus. It is considered a network with the articles being nodes and the links being edges. The Network Degree of Geospatial Reference (NDGR) is introduced as an indicator to measure the network approach. We define NDGR as the shortest path between any Wikipedia article and the closest article within the network that is labeled with coordinates in its headline. An analysis of the German Wikipedia employing this approach shows that 78% of all articles have a coordinate themselves or are directly linked to at least one article that has geospatial coordinates. The cognitive approach is manifested by the categories of geospatial reference (CGR): direct, indirect, and non-geospatial reference. These are categories that may be distinguished and applied by humans. An empirical study including 380 participants was conducted. The results of both approaches are synthesized with the aim to (1) examine correlations between NDGR and the human conceptualization of geospatial reference and (2) separate geospatial from non-geospatial information. From the results of this synthesis, it can be concluded that 56–59% of the articles within Wikipedia can be considered to be directly or indirectly geospatially referenced. The article thus describes a method to check the validity of the '80%-assertion' for information corpora that can be modeled using graphs (e.g., the World Wide Web, the Semantic Web, and Wikipedia). For the corpus investigated here (Wikipedia), the '80%-assertion' cannot be confirmed, but would need to be reformulated as a '60%-assertion'.
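The NDGR definition in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names `ndgr` and `share_within`, the toy link graph, and the choice to follow outgoing links only are all assumptions made for illustration. The sketch runs one breadth-first search from all coordinate-labeled articles over reversed links, so each article receives the hop count to its nearest coordinate-labeled article.

```python
from collections import deque


def ndgr(links, geotagged):
    """Hop count from each article to the nearest coordinate-labeled article.

    links: dict mapping an article to the articles it links to (assumed
           to be outgoing links; the abstract does not fix a direction).
    geotagged: set of articles labeled with coordinates.
    Articles that cannot reach any geotagged article are left out.
    """
    # Reverse the edges so a single search started at every geotagged
    # article measures path length along the original outgoing links.
    reverse = {}
    for source, targets in links.items():
        for target in targets:
            reverse.setdefault(target, set()).add(source)

    distance = {article: 0 for article in geotagged}
    queue = deque(distance)
    while queue:
        current = queue.popleft()
        for predecessor in reverse.get(current, ()):
            if predecessor not in distance:
                distance[predecessor] = distance[current] + 1
                queue.append(predecessor)
    return distance


def share_within(distance, total_articles, max_ndgr):
    """Fraction of all articles whose NDGR is at most max_ndgr."""
    reached = sum(1 for hops in distance.values() if hops <= max_ndgr)
    return reached / total_articles


# Hypothetical toy corpus, not data from the study.
links = {
    "Dresden": ["Elbe"],
    "Elbe": [],
    "Cartography": ["Dresden"],
    "Map": ["Cartography"],
    "Prime number": [],
}
geotagged = {"Dresden", "Elbe"}

distance = ndgr(links, geotagged)
share = share_within(distance, total_articles=len(links), max_ndgr=1)
```

Under these assumptions, the 78% figure reported in the abstract corresponds to the share of articles with an NDGR of 0 or 1, which is what `share_within(..., max_ndgr=1)` computes for the toy corpus.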
Pages: 1171 / 1189
Page count: 19