A methodology for collection selection in heterogeneous contexts

Cited by: 3
Authors
Abbaci, F [1]
Savoy, J [1]
Beigbeder, M [1]
Affiliations
[1] Ecole Natl Super Mines, F-42023 St Etienne Du Rouvray, France
Source
INTERNATIONAL CONFERENCE ON INFORMATION TECHNOLOGY: CODING AND COMPUTING, PROCEEDINGS | 2002
Keywords
information retrieval; distributed information retrieval; collection selection; results merging strategy; evaluation
DOI
10.1109/ITCC.2002.1000443
Chinese Library Classification number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
In this paper we demonstrate that, in an ideal Distributed Information Retrieval environment, taking into account the ability of each collection server to return relevant documents can be effective when selecting collections. Based on this assumption, we suggest a new approach to the collection selection problem. To predict a collection's ability to return relevant documents, we inspect a limited number n of documents retrieved from each collection and analyze the proximity of search keywords within them. In our experiments, we vary the underlying parameter n of our suggested model to determine the most appropriate number of top documents to inspect. Moreover, we evaluate the retrieval effectiveness of our approach and compare it with both the centralized indexing and the CORI approaches [1], [16]. Preliminary results from these experiments, conducted on the WT10g test collection, suggest that our method can achieve appreciable retrieval effectiveness.
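The idea in the abstract (inspect the top n documents returned by each collection and look at how close the query keywords sit to one another) can be illustrated with a minimal sketch. The abstract does not give the scoring formula, so everything here is an assumption made for illustration: the names `proximity_score` and `rank_collections`, the `window` parameter, and the choice to count pairs of distinct query terms within a fixed token window are hypothetical and not the authors' method.

```python
from itertools import combinations


def proximity_score(tokens, query_terms, window=5):
    """Count pairs of distinct query terms that occur within `window` tokens.

    Assumption: this pair-counting rule is a stand-in; the paper's actual
    proximity measure is not stated in the abstract.
    """
    positions = [(i, t) for i, t in enumerate(tokens) if t in query_terms]
    score = 0
    for (i, a), (j, b) in combinations(positions, 2):
        if a != b and abs(i - j) <= window:
            score += 1
    return score


def rank_collections(top_docs_by_collection, query, n=3, window=5):
    """Rank collections by the summed proximity score of their top n documents.

    `top_docs_by_collection` maps a collection name to the list of document
    texts it returned for the query, best first (hypothetical input format).
    """
    query_terms = set(query.lower().split())
    scores = []
    for name, docs in top_docs_by_collection.items():
        total = sum(
            proximity_score(doc.lower().split(), query_terms, window)
            for doc in docs[:n]  # only the first n documents are inspected
        )
        scores.append((name, total))
    # Highest score first; ties keep their input order.
    return sorted(scores, key=lambda item: item[1], reverse=True)


if __name__ == "__main__":
    results = {
        "A": ["distributed information retrieval uses collection selection",
              "unrelated text here"],
        "B": ["information about cats", "retrieval of lost dogs"],
    }
    print(rank_collections(results, "information retrieval", n=2))
```

Under these assumptions, a collection whose top documents contain the query terms close together is ranked ahead of one where the terms appear only in isolation; varying `n` changes how many documents per collection contribute to the score.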
Pages: 529-535
Page count: 7
Related papers
19 in total
[1] CALLAN JP, P ACM SIGIR 1995, P21
[2] CHAKRAVARTHY AS, P ACM SIGIR 1995, P4
[3] Clarke C. L. A., 1995, P 4 TEXT RETR C TREC, P295
[4] CRASWELL N, P ACM DL 2000, P37
[5] FRENCH JC, P ACM SIGIR 1999, P238
[6] GlOSS: Text-source discovery over the Internet [J]. Gravano, L; García-Molina, H; Tomasic, A. ACM TRANSACTIONS ON DATABASE SYSTEMS, 1999, 24 (02): 229-264
[7] Methods for information server selection [J]. Hawking, D; Thistlewaite, P. ACM TRANSACTIONS ON INFORMATION SYSTEMS, 1999, 17 (01): 40-76
[8] LARKEY LS, P ACM CIKM 2000, P282
[9] Inquirus, the NECI meta search engine [J]. Lawrence, S; Giles, CL. COMPUTER NETWORKS AND ISDN SYSTEMS, 1998, 30 (1-7): 95-105
[10] MOFFAT A, 1995, P TREC 3, P85