Collection-integral source selection for uncooperative distributed information retrieval environments

Cited by: 14
Authors
Paltoglou, Georgios [1]
Salampasis, Michail [2]
Satratzemi, Maria [1]
Affiliations
[1] Univ Macedonia, Thessaloniki, Greece
[2] Alexander Technol Educ Inst Thessaloniki, Thessaloniki 57400, Greece
Keywords
Source selection; Distributed information retrieval; Federated search; SEARCH; WEB; PERFORMANCE
DOI
10.1016/j.ins.2010.03.020
Chinese Library Classification (CLC) number
TP [Automation technology, computer technology]
Subject classification code
0812
Abstract
We propose a new integral-based source selection algorithm for uncooperative distributed information retrieval environments. The algorithm models each source as a plot, using the relevance score and the intra-collection position of its sampled documents in reference to a centralized sample index. From this model, the algorithm locates the collections that contain the most relevant documents. A number of transformations are applied to the original plot in order to reward collections that have higher-scoring documents and to dampen the effect of collections returning an excessive number of documents. The family of linear interpolant functions that pass through the points of the modified plot is computed for each available source, and the area that they cover in the rank-relevance space is calculated. Information sources are ranked by the area that they cover. Using this novel metric for collection relevance, the algorithm is tested on a variety of testbeds in both recall-oriented and precision-oriented settings, and its performance is found to be better than or at least equal to that of previous state-of-the-art approaches, overall constituting a very effective and robust solution. © 2010 Elsevier Inc. All rights reserved.
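The area-based ranking described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the plot transformations mentioned in the abstract are not specified in this record and are omitted, and the names `trapezoid_area`, `rank_sources`, and `sample_results` are hypothetical. The sketch assumes each sampled document is given as a (source, score) pair taken from a query against the centralized sample index, places each source's scores at intra-collection positions 1, 2, 3, ..., and uses the trapezoidal rule to compute the area under the linear interpolant through those points.

```python
from collections import defaultdict


def trapezoid_area(scores):
    """Area under the piecewise-linear curve through (1, s1), (2, s2), ...

    Positions are one unit apart, so each segment contributes the mean
    of its two endpoint scores. A single point yields an area of 0.
    """
    return sum((a + b) / 2.0 for a, b in zip(scores, scores[1:]))


def rank_sources(sample_results):
    """Rank sources by the area covered in the rank-relevance space.

    sample_results: iterable of (source_id, score) pairs for the sampled
    documents retrieved from the centralized sample index (assumed input
    format, not taken from the paper).
    """
    scores_by_source = defaultdict(list)
    for source_id, score in sample_results:
        scores_by_source[source_id].append(score)

    areas = {}
    for source_id, scores in scores_by_source.items():
        # Intra-collection position: best-scoring sampled document first.
        ordered = sorted(scores, reverse=True)
        areas[source_id] = trapezoid_area(ordered)

    # Larger area means the source is ranked higher.
    return sorted(areas, key=areas.get, reverse=True)


sample_results = [
    ("A", 0.9), ("B", 0.8), ("A", 0.7),
    ("C", 0.5), ("B", 0.2), ("A", 0.1),
]
ranking = rank_sources(sample_results)
```

In this untransformed form a source with a single sampled document receives zero area, and a source with many low-scoring documents can accumulate a large area; the transformations the abstract mentions are presumably what address effects of this kind.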
Pages: 2763-2776
Page count: 14