Supporting web query expansion efficiently using multi-granularity indexing and query processing

Cited by: 12
Authors
Li, WS [1]
Agrawal, D [1]
Affiliation
[1] NEC USA, C&C Res Labs, San Jose, CA 95134 USA
Keywords
query expansion; information retrieval; world-wide web; indexing; query processing; lexical-semantics; co-occurrence; progressive processing
DOI
10.1016/S0169-023X(00)00024-0
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Subject classification codes
081104; 0812; 0835; 1405
Abstract
The problem of word mismatch in information retrieval (IR) occurs because users often use different words to describe concepts in their queries than authors use to describe the same concepts in their documents. Query expansion is used to deal with the mismatch between author and user vocabularies. To support query expansion, indices on words related by lexical semantics and syntactical co-occurrence need to be maintained. Two issues become paramount in supporting query expansion: the size of index tables and the query processing overhead. In this paper, we propose to use the notion of multi-granularity for more efficient indexing and query processing while the same degrees of precision and recall are maintained. We also describe extensions of this technique to handle: (1) query relaxation for words with multiple senses and with other semantic relationships; (2) progressive processing of queries with top N results; and (3) progressive processing of queries with specification of the importance of each keyword. © 2000 Elsevier Science B.V. All rights reserved.
Pages: 239-257
Page count: 19