Latent entity space: a novel retrieval approach for entity-bearing queries

Cited by: 42
Authors
Liu, Xitong [1 ]
Fang, Hui [1 ]
Affiliation
[1] Univ Delaware, Dept Elect & Comp Engn, Newark, DE 19716 USA
Source
INFORMATION RETRIEVAL JOURNAL | 2015, Vol. 18, No. 6
Funding
National Science Foundation (USA);
Keywords
Latent entity space; Entity profile; Document retrieval;
DOI
10.1007/s10791-015-9267-x
Chinese Library Classification number
TP [Automation technology, computer technology];
Discipline classification code
0812;
Abstract
Analysis of Web search query logs has revealed that a large portion of queries are entity-bearing, reflecting users' increasing demand for relevant information about entities such as persons, organizations, and products. Meanwhile, significant progress has been made in Web-scale information extraction, which enables efficient entity extraction from free text. Since an entity is expected to capture the semantic content of documents and queries more accurately than a term, it is worth studying whether leveraging information about entities can improve retrieval accuracy for entity-bearing queries. In this paper, we propose a novel retrieval approach, latent entity space (LES), which models relevance by leveraging entity profiles to represent the semantic content of documents and queries. In the LES, each entity corresponds to one dimension, representing one semantic relevance aspect. We propose a formal probabilistic framework to model relevance in the high-dimensional entity space. Experimental results over TREC collections show that the proposed LES approach is effective in capturing latent semantic content and can significantly improve the search accuracy of several state-of-the-art retrieval models for entity-bearing queries.
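The idea of treating each entity as one dimension can be illustrated with a small sketch. The abstract does not state the scoring formula, so the code below assumes one plausible form: the query and each document are projected onto entity dimensions by term overlap with the entity's profile text, and the relevance score is the sum over entities of the query weight times the normalized document weight. The function names (`term_probs`, `overlap`, `les_scores`) and the toy data are hypothetical and are not taken from the paper.

```python
from collections import Counter


def term_probs(text):
    """Maximum-likelihood unigram distribution over whitespace tokens."""
    counts = Counter(text.lower().split())
    total = sum(counts.values())
    return {term: count / total for term, count in counts.items()}


def overlap(p, q):
    """Similarity of two term distributions: sum of products over shared terms."""
    return sum(prob * q[term] for term, prob in p.items() if term in q)


def les_scores(query, docs, entity_profiles):
    """Score documents in an entity space (assumed form, not the paper's exact model)."""
    q = term_probs(query)
    profiles = {e: term_probs(text) for e, text in entity_profiles.items()}
    # Project the query onto each entity dimension.
    q_vec = {e: overlap(q, p) for e, p in profiles.items()}
    scores = {}
    for doc_id, text in docs.items():
        d = term_probs(text)
        # Project the document onto each entity dimension.
        d_vec = {e: overlap(d, p) for e, p in profiles.items()}
        norm = sum(d_vec.values())
        if norm == 0:
            scores[doc_id] = 0.0
            continue
        # Sum over entities: query weight times normalized document weight.
        scores[doc_id] = sum(q_vec[e] * d_vec[e] / norm for e in profiles)
    return scores


# Toy data, invented for illustration only.
entity_profiles = {
    "apple_inc": "apple iphone company technology",
    "apple_fruit": "apple fruit tree orchard",
}
docs = {
    "d1": "new iphone released by technology company",
    "d2": "orchard harvest fruit tree season",
}
scores = les_scores("iphone company", docs, entity_profiles)
ranking = sorted(scores, key=scores.get, reverse=True)
```

In this toy setup the query and `d1` both load on the `apple_inc` dimension, while `d2` loads only on `apple_fruit`, so `d1` ranks first. A real system would use smoothed language models and profiles built from a knowledge base rather than raw term overlap.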
Pages: 473-503
Page count: 31