An evaluation dataset for the toponym resolution task

Cited by: 16
Authors
Leidner, Jochen L. [1]
Affiliations
[1] Univ Edinburgh, Sch Informat, Edinburgh EH8 9LW, Midlothian, Scotland
Keywords
natural language processing; place names; toponym resolution; spatial grounding; geocoding; extensional referent disambiguation; evaluation
DOI
10.1016/j.compenvurbsys.2005.07.003
Chinese Library Classification (CLC) number
TP39 [Computer applications]
Discipline classification codes
081203; 0835
Abstract
Toponym resolution is the task of linking place name instances in a text with spatial footprints, given the context in which they occur. Whereas a lot of work on the evaluation of temporal resolution is ongoing (e.g. [Setzer, A., & Gaizauskas, R. (2000). On the importance of annotating temporal event-event relations in text. In LREC 2000 Workshop on annotation standards for temporal information in natural language, Vol. 3 (pp. 1281-1286). Athens, Greece]), to date no reference resource is available to evaluate competing algorithms for toponym resolution. It is thus argued that a shareable, reusable evaluation resource is necessary. To this end, a new proposal for the markup of toponyms in text corpora with their referents and an associated tool data methodology are presented: the Toponym Resolution Markup Language (TRML) is an XML-based markup language, and TAME, the toponym annotation markup editor, is a tool that implements it. A novel evaluation resource is described which comprises a large-scale reference gazetteer server and a human-annotated news corpus in which toponyms are associated with latitude/longitude coordinates of the location they refer to. The reliability of the annotation task is established by determining inter-annotator agreement of the human annotators. © 2005 Elsevier Ltd. All rights reserved.
Pages: 400-417
Page count: 18