Comparing rank and score combination methods for data fusion in information retrieval

Cited by: 112
Authors
Hsu D.F. [1 ,3 ]
Taksa I. [2 ]
Affiliations
[1] Dept. of Comp. and Info. Science, Fordham University, 113 West 60th Street, New York, NY 10023
[2] Dept. of Stat. and Comp. Info. Syst., Baruch College, One Bernard Baruch Way, New York, NY 10010
[3] DIMACS Center, Rutgers University, Piscataway, NJ 08854-8018
Source
Information Retrieval | 2005 / Vol. 8 / Issue 3
Funding
National Science Foundation (USA)
Keywords
Cayley graphs and digraphs; Data fusion (DF); Evidence combinations; Information retrieval (IR); Multiple evidences; Permutation; Rank combination; Rank/score function; Score combination; Symmetric group;
DOI
10.1007/s10791-005-6994-4
Abstract
Combination of multiple evidences (multiple query formulations, multiple retrieval schemes or systems) has been shown (mostly experimentally) to be effective in data fusion in information retrieval. However, the question of why and how combination should be done remains largely unanswered. In this paper, we provide a model for simulation and a framework for analysis in the study of data fusion in the information retrieval domain. A rank/score function is defined and the concept of a Cayley graph is used in the design and analysis of our framework. The model and framework have led us to a better understanding of the data fusion phenomena in information retrieval. In particular, by exploiting the graphical properties of the rank/score function, we have shown analytically and by simulation that combination using rank performs better than combination using score under certain conditions. Moreover, we demonstrated that the rank/score function might be used as a predictive variable for the effectiveness of combination of multiple evidences. © 2005 Springer Science + Business Media, Inc.
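To make the terms in the abstract concrete, here is a minimal Python sketch of the two fusion styles it contrasts. It is an illustration under assumptions, not the authors' model: the abstract does not give definitions, so the sketch assumes that score combination averages the two systems' scores, that rank combination averages their ranks, and that the rank/score function of a system maps rank i to the score of the document at that rank. The function names, the tie-breaking rule, and the toy scores for `system_a` and `system_b` are all hypothetical.

```python
def ranks_from_scores(scores):
    """Map each document to its rank (1 = best) under one system's scores.
    Ties are broken by document id (an assumption made for determinism)."""
    ordered = sorted(scores, key=lambda d: (-scores[d], d))
    return {doc: i + 1 for i, doc in enumerate(ordered)}


def rank_score_function(scores):
    """Assumed rank/score function: entry i-1 is the score held at rank i."""
    return sorted(scores.values(), reverse=True)


def score_combination(a, b):
    """Order documents by the average of their two scores (higher is better)."""
    avg = {d: (a[d] + b[d]) / 2 for d in a}
    return sorted(avg, key=lambda d: (-avg[d], d))


def rank_combination(a, b):
    """Order documents by the average of their two ranks (lower is better)."""
    ra, rb = ranks_from_scores(a), ranks_from_scores(b)
    avg = {d: (ra[d] + rb[d]) / 2 for d in a}
    return sorted(avg, key=lambda d: (avg[d], d))


# Hypothetical scores on a 0-100 scale for five documents from two systems.
system_a = {"d1": 100, "d2": 98, "d3": 40, "d4": 20, "d5": 10}
system_b = {"d1": 55, "d2": 45, "d3": 60, "d4": 5, "d5": 50}

print(rank_score_function(system_a))          # steep drop after rank 2
print(rank_score_function(system_b))          # nearly flat until rank 4
print(score_combination(system_a, system_b))  # d2 placed ahead of d3
print(rank_combination(system_a, system_b))   # d3 placed ahead of d2
```

In this toy data the two systems have differently shaped rank/score functions, and the two combination methods order d2 and d3 differently. The sketch only shows that the methods can disagree; it says nothing about which ordering is more effective, which is the question the paper studies.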
Pages: 449-480
Number of pages: 31