Improving the effectiveness of information retrieval with local context analysis

Cited by: 301
Authors
Xu, JX
Croft, WB
Affiliations
[1] BBN Technol, Cambridge, MA 02138 USA
[2] Univ Massachusetts, Dept Comp Sci, Amherst, MA 01003 USA
Funding
UK Medical Research Council;
Keywords
experimentation; performance; cooccurrence; document analysis; feedback; global techniques; information retrieval; local context analysis; local techniques;
DOI
10.1145/333135.333138
Chinese Library Classification number
TP [Automation technology, computer technology];
Discipline classification code
0812;
Abstract
Techniques for automatic query expansion have been extensively studied in information retrieval research as a means of addressing the word mismatch between queries and documents. These techniques can be categorized as either global or local. Global techniques rely on analysis of a whole collection to discover word relationships, whereas local techniques emphasize analysis of the top-ranked documents retrieved for a query. Although local techniques have been shown to be more effective than global techniques in general, existing local techniques are not robust and can seriously hurt retrieval when few of the retrieved documents are relevant. We propose a new technique, called local context analysis, which selects expansion terms based on cooccurrence with the query terms within the top-ranked documents. Experiments on a number of collections, both English and non-English, show that local context analysis offers more effective and consistent retrieval results.
Pages: 79-112
Page count: 34