Building efficient and effective metasearch engines

Cited by: 160
Authors
Meng, WY [1]
Yu, C
Liu, KL
Affiliations
[1] SUNY Binghamton, Dept Comp Sci, Binghamton, NY 13902 USA
[2] Univ Illinois, Dept Comp Sci, Chicago, IL 60607 USA
[3] Depaul Univ, Sch Comp Sci Telecommun & Informat Syst, Chicago, IL 60604 USA
Keywords
design; experimentation; performance; collection fusion; distributed collection; distributed information retrieval; information resource discovery; metasearch
DOI
10.1145/505282.505284
Chinese Library Classification (CLC) number
TP301 [Theory, methods]
Discipline classification code
081202
Abstract
Frequently, the information a user needs is stored in the databases of multiple search engines. It is inconvenient and inefficient for an ordinary user to invoke multiple search engines and identify useful documents from the returned results. To support unified access to multiple search engines, a metasearch engine can be constructed. When a metasearch engine receives a query from a user, it invokes the underlying search engines to retrieve useful information for the user. Metasearch engines have other benefits as a search tool, such as increasing the search coverage of the Web and improving the scalability of the search. In this article, we survey techniques that have been proposed to tackle several underlying challenges for building a good metasearch engine. Among the main challenges, the database selection problem is to identify search engines that are likely to return useful documents for a given query. The document selection problem is to determine which documents to retrieve from each identified search engine. The result merging problem is to combine the documents returned from multiple search engines. We will also point out some problems that need further research.
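The three problems named in the abstract can be pictured as stages of one pipeline. The following is a minimal sketch under stated assumptions, not a technique taken from the article: the toy engines, the document-frequency representative, the term-count scoring, and all function names are hypothetical and chosen only for illustration.

```python
from collections import Counter

# Hypothetical toy engines (assumption): engine name -> {document id -> text}.
ENGINES = {
    "engine_a": {"a1": "metasearch engine design",
                 "a2": "database selection for metasearch"},
    "engine_b": {"b1": "cooking pasta at home",
                 "b2": "pasta sauce recipes"},
    "engine_c": {"c1": "result merging in distributed retrieval",
                 "c2": "metasearch result merging"},
}


def build_representative(docs):
    """Summarize an engine as term -> number of documents containing the term."""
    rep = Counter()
    for text in docs.values():
        rep.update(set(text.lower().split()))
    return rep


def score(query, text):
    """Crude similarity: total occurrences of the query terms in the text."""
    terms = Counter(text.lower().split())
    return sum(terms[t] for t in query.lower().split())


def select_databases(query, representatives, max_engines=2):
    """Database selection: keep the engines whose summaries best match the query."""
    terms = query.lower().split()
    ranked = sorted(
        ((sum(rep[t] for t in terms), name) for name, rep in representatives.items()),
        reverse=True,
    )
    return [name for usefulness, name in ranked[:max_engines] if usefulness > 0]


def select_documents(query, docs, k=1):
    """Document selection: retrieve at most k matching documents from one engine."""
    scored = [(score(query, text), doc_id) for doc_id, text in docs.items()]
    matching = [pair for pair in scored if pair[0] > 0]
    return sorted(matching, reverse=True)[:k]


def merge_results(per_engine):
    """Result merging: one list ordered by score, ties broken by document id.

    Assumes scores are comparable across engines, which holds here only
    because every toy engine uses the same scoring function.
    """
    merged = [(s, doc_id, name)
              for name, results in per_engine.items()
              for s, doc_id in results]
    merged.sort(key=lambda r: (-r[0], r[1]))
    return [(doc_id, name, s) for s, doc_id, name in merged]


def metasearch(query, engines=ENGINES, max_engines=2, k=1):
    """Run the three stages in order and return the merged result list."""
    reps = {name: build_representative(docs) for name, docs in engines.items()}
    chosen = select_databases(query, reps, max_engines)
    per_engine = {name: select_documents(query, engines[name], k) for name in chosen}
    return merge_results(per_engine)


if __name__ == "__main__":
    for doc_id, engine, s in metasearch("metasearch result merging"):
        print(doc_id, engine, s)
```

In this sketch the unrelated engine is never queried, which is the point of database selection. Real engines rarely expose comparable scores, so the merging step here is the most simplified of the three.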
Pages: 48-89
Page count: 42