Opening practice: supporting reproducibility and critical spatial data science

Cited by: 63
Authors
Brunsdon, Chris [1]
Comber, Alexis [2]
Affiliations
[1] Maynooth Univ, Natl Ctr Geocomputat, Maynooth, Kildare, Ireland
[2] Univ Leeds, Sch Geog, Leeds, W Yorkshire, England
Funding
Natural Environment Research Council (UK)
Keywords
Critical data science; Open source; GIScience; Geocomputation; LAND-COVER; SOFTWARE;
DOI
10.1007/s10109-020-00334-2
Chinese Library Classification number
P9 [Physical geography]; K9 [Geography]
Discipline classification code
0705; 070501
Abstract
This paper reflects on a number of trends towards a more open and reproducible approach to geographic and spatial data science over recent years. In particular, it considers trends towards Big Data, and the impacts this is having on spatial data analysis and modelling. It identifies a turn in academia towards coding as a core analytic tool, and away from proprietary software tools offering 'black boxes' where the internal workings of the analysis are not revealed. The paper argues that this closed-form software is problematic and considers a number of ways in which issues identified in spatial data analysis (such as the modifiable areal unit problem, MAUP) could be overlooked when working with closed tools, leading to problems of interpretation and possibly inappropriate actions and policies based on these. In addition, this paper considers the role that reproducible and open spatial science may play in such an approach, taking into account the issues raised. It highlights the dangers of failing to account for the geographical properties of data, now that all data are spatial (they are collected somewhere), and the problems of a desire for n = all observations in data science, and it identifies the need for a critical approach. This is one in which openness, transparency, sharing and reproducibility provide a mantra for defensible and robust spatial data science.
Pages: 477-496
Page count: 20