DBpedia-Based Entity Linking via Greedy Search and Adjusted Monte Carlo RandomWalk

被引:17
作者
Liu, Ming [1 ]
Chen, Lei [2 ]
Liu, Bingquan [1 ]
Zheng, Guidong [1 ]
Zhang, Xiaoming [1 ]
机构
[1] Harbin Inst Technol, Sch Comp Sci & Technol, Harbin, Heilongjiang, Peoples R China
[2] Beijing Normal Univ, Int Business Fac, Zhuhai, Peoples R China
基金
中国国家自然科学基金; 中国博士后科学基金;
关键词
Collective entity linking; pagerank; adjusted monte carlo random walk; greedy search; multi-core; PAGERANK;
D O I
10.1145/3086703
中图分类号
TP [自动化技术、计算机技术];
学科分类号
080201 [机械制造及其自动化];
摘要
Facing a large amount of entities appearing on the web, entity linking has recently become useful. It assigns an entity from a resource to one name mention to help users grasp the meaning of this name mention. Unfortunately, many possible entities can be assigned to one name mention. Apparently, the usually co-occurring name mentions are related and can be considered together to determine their best assignments. This approach is called collective entity linking and is often conducted based on entity graph. However, traditional collective entity linking methods either consume much time due to the large scale of entity graph or obtain low accuracy due to simplifying graph. To improve both accuracy and efficiency, this article proposes a novel collective entity linking algorithm. It first constructs an entity graph by connecting any two related entities, and then a probability-based objective function is proposed on this graph to ensure the high accuracy of the linking result. Via this function, we convert entity linking to the process of finding the nodes with the highest PageRank Values. Greedy search and an adjusted Monte Carlo random walk are proposed to fulfill this work. Experimental results demonstrate that our algorithm performs much better than traditional linking methods.
引用
收藏
页数:34
相关论文
共 39 条
[1]
Lower Bound on Expected Complexity of Depth-First Tree Search with Multiple Radii [J].
Ahn, Junil ;
Kim, Kiseon .
IEEE COMMUNICATIONS LETTERS, 2012, 16 (06) :805-808
[2]
Alhelbawy A, 2014, PROCEEDINGS OF THE 52ND ANNUAL MEETING OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS, VOL 2, P75
[3]
[Anonymous], 2010, P 2010 C N AM CHAPT
[4]
[Anonymous], 2011, P 2011 C EMPIRICAL M, DOI DOI 10.3115/V1/D11-1072
[5]
[Anonymous], 2007, EMNLP CONLL, DOI DOI 10.1145/2187836.2187900
[6]
[Anonymous], 2011, 22 INT JOINT C ART I, DOI DOI 10.5591/978-1-57735-516-8/IJCAI11-385
[7]
[Anonymous], 2012, P 21 ACM INT C INFOR
[8]
Monte Carlo methods in pagerank computation: When one iteration is sufficient [J].
Avrachenkov, K. ;
Litvak, N. ;
Nemirovsky, D. ;
Osipova, N. .
SIAM JOURNAL ON NUMERICAL ANALYSIS, 2007, 45 (02) :890-904
[9]
Swoosh: a generic approach to entity resolution [J].
Benjelloun, Omar ;
Garcia-Molina, Hector ;
Menestrina, David ;
Su, Qi ;
Whang, Steven Euijong ;
Widom, Jennifer .
VLDB JOURNAL, 2009, 18 (01) :255-276
[10]
Fast and Space-Efficient Entity Linking in Queries [J].
Blanco, Roi ;
Ottaviano, Giuseppe ;
Meij, Edgar .
WSDM'15: PROCEEDINGS OF THE EIGHTH ACM INTERNATIONAL CONFERENCE ON WEB SEARCH AND DATA MINING, 2015, :179-188