Lexical and Syntactic knowledge for Information Retrieval

Cited by: 12
Authors
Ferrandez, Antonio [1]
Affiliations
[1] Univ Alicante, Dept Languages & Informat Syst, E-03080 Alicante, Spain
Keywords
Information Retrieval; Natural Language Processing; Term Proximity; Question Answering; Lexical and syntactic relationships; Query Expansion; Model
DOI
10.1016/j.ipm.2011.01.003
Chinese Library Classification
TP [Automation Technology, Computer Technology]
Discipline classification code
0812
Abstract
Traditional Information Retrieval (IR) models assume that the index terms of queries and documents are statistically independent of each other, which is intuitively wrong. This paper proposes incorporating the lexical and syntactic knowledge generated by a POS-tagger and a syntactic chunker into traditional IR similarity measures in order to include this dependency information between terms. Our proposal is based on theories of discourse structure and segments documents and queries into sentences and entities. We therefore measure dependencies between entities instead of between terms. Moreover, we handle discourse references for each entity. The proposal has been evaluated on Spanish and English corpora as well as on Question Answering tasks, obtaining significant increases. (C) 2011 Elsevier Ltd. All rights reserved.
Pages: 692-705
Number of pages: 14