ADAPTING THE EDINBURGH GEOPARSER FOR HISTORICAL GEOREFERENCING

Cited by: 36
Authors
Alex, Beatrice [1]
Byrne, Kate [1]
Grover, Claire [1]
Tobin, Richard [1]
Affiliations
[1] Univ Edinburgh, Sch Informat, Edinburgh EH8 9YL, Midlothian, Scotland
Source
INTERNATIONAL JOURNAL OF HUMANITIES AND ARTS COMPUTING-A JOURNAL OF DIGITAL HUMANITIES | 2015 / Vol. 9 / No. 1
Keywords
Georeferencing; georesolution; text mining; domain adaptation
DOI
10.3366/ijhac.2015.0136
Chinese Library Classification (CLC) number
C [Social Sciences: General]
Discipline classification code
03; 0303
Abstract
Place name mentions in text may have more than one potential referent (e.g. Peru, the country vs. Peru, the city in Indiana). The Edinburgh Language Technology Group (LTG) has developed the Edinburgh Geoparser, a system that can automatically recognise place name mentions in text and disambiguate them with respect to a gazetteer. The recognition step is required to identify location mentions in a given piece of text. The subsequent disambiguation step, generally referred to as georesolution, grounds location mentions to their corresponding gazetteer entries with latitude and longitude values, for example, to visualise them on a map. Geoparsing is not only useful for mapping purposes but also for making document collections more accessible as it can provide additional metadata about the geographical content of documents. Combined with other information mined from text such as person names and date expressions, complex relations between such pieces of information can be identified. The Edinburgh Geoparser can be used with several gazetteers including Unlock and GeoNames to process a variety of input texts. The original version of the Geoparser was a demonstrator configured for modern text. Since then, it has been adapted to georeference historic and ancient text collections as well as modern-day newspaper text. 1,2,3,4 Currently, the LTG is involved in three research projects applying the Geoparser to historical text collections of very different types and for a variety of end-user applications. This paper discusses the ways in which we have customised the Geoparser for specific datasets and applications relevant to each project.
Pages: 15-35
Page count: 21
Related papers
20 in total
[1] Alex B., 2014, P 8 LING ANN WORKSH
[2] Alex B., 2010, P WORKSH COMP LING W
[3] Alex B., 2014, P 1 INT C DIG ACC TE, P97, DOI 10.1145/2595188.2595214
[4] Barker E., 2012, NEDIMAH WORKSH DIG H
[5] Barker Elton, 2010, LEEDS INT CLASSICAL, V9, P1
[6] Byrne K., 2011, GAP PROJECT BLO 0418
[7] Curran JR, 2003, EACL 2003: 10TH CONFERENCE OF THE EUROPEAN CHAPTER OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS, PROCEEDINGS OF THE CONFERENCE, P91
[8] D'Ignazio C., 2014, P NEWSKDD 2014 NEW Y
[9] Farrer W., 1923, RECORDS RELATING BAR, V1
[10] Grover C., 2014, P EACL LATECH WORKSH, P119