Search engines and Web dynamics

Cited by: 39
Authors
Risvik, KM [1]
Michelsen, R [1]
Affiliation
[1] Fast Search & Transfer ASA, NO-0120 Oslo, Norway
Source
COMPUTER NETWORKS-THE INTERNATIONAL JOURNAL OF COMPUTER AND TELECOMMUNICATIONS NETWORKING | 2002 / Vol. 39 / Issue 03
Keywords
dynamic information retrieval; indexing; document crawling; scalable architecture; algorithms; scheduling;
DOI
10.1016/S1389-1286(02)00213-X
Chinese Library Classification number
TP3 [Computing technology, computer technology];
Discipline classification code
0812;
Abstract
In this paper we study several dimensions of Web dynamics in the context of large-scale Internet search engines. Both growth and update dynamics clearly represent big challenges for search engines. We show how the problems arise in all components of a reference search engine model. Furthermore, we use the FAST Search Engine architecture as a case study for showing some possible solutions for Web dynamics and search engines. The focus is to demonstrate solutions that work in practice for real systems. The service is running live at www.alltheweb.com and major portals worldwide with more than 30 million queries a day, about 700 million full-text documents, a crawl base of 1.8 billion documents, updated every 11 days, at a rate of 400 documents/second. We discuss future evolution of the Web, and some important issues for search engines will be scheduling and query execution as well as increasingly heterogeneous architectures to handle the dynamic Web. (C) 2002 Elsevier Science Ltd. All rights reserved.
Pages: 289-302
Number of pages: 14