Measuring index quality using random walks on the Web

Cited by: 29
Authors
Henzinger, MR [1]
Heydon, A [1]
Mitzenmacher, M [1]
Najork, M [1]
Affiliation
[1] Compaq Comp Corp Syst Res Ctr, Palo Alto, CA 94301 USA
Keywords
search engines; index quality; random walks; PageRank
DOI
10.1016/S1389-1286(99)00016-X
Chinese Library Classification (CLC) number
TP3 [Computing technology, computer technology]
Discipline classification code
0812
Abstract
Recent research has studied how to measure the size of a search engine, in terms of the number of pages indexed. In this paper, we consider a different measure for search engines, namely the quality of the pages in a search engine index. We provide a simple, effective algorithm for approximating the quality of an index by performing a random walk on the Web, and we use this methodology to compare the index quality of several major search engines. (C) 1999 Published by Elsevier Science B.V. All rights reserved.
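The abstract does not spell out the algorithm, so the following is only a minimal sketch of the general idea, under stated assumptions: a random walk with occasional random jumps visits pages roughly in proportion to a PageRank-like weight, and the quality of an index can then be approximated as the share of walk visits that land on indexed pages. The toy graph, the jump probability of 0.15, and the names `random_walk_visits` and `index_quality` are all hypothetical and are not taken from the paper.

```python
import random
from collections import Counter


def random_walk_visits(graph, steps, jump_prob=0.15, seed=0):
    """Walk `graph` (dict: page -> list of out-links) and count visits per page.

    Assumption: with probability `jump_prob`, or when a page has no out-links,
    the walk jumps to a page chosen uniformly at random.
    """
    rng = random.Random(seed)
    pages = sorted(graph)
    current = rng.choice(pages)
    visits = Counter()
    for _ in range(steps):
        visits[current] += 1
        links = graph[current]
        if not links or rng.random() < jump_prob:
            current = rng.choice(pages)   # random jump
        else:
            current = rng.choice(links)   # follow a random out-link
    return visits


def index_quality(visits, index):
    """Fraction of all walk visits that landed on pages contained in `index`."""
    total = sum(visits.values())
    covered = sum(n for page, n in visits.items() if page in index)
    return covered / total


# Hypothetical five-page web; "d" and "e" have no in-links.
toy_web = {
    "a": ["b", "c"],
    "b": ["c"],
    "c": ["a"],
    "d": ["c"],
    "e": [],
}

visits = random_walk_visits(toy_web, steps=20000)
quality_popular = index_quality(visits, {"a", "c"})  # index of well-linked pages
quality_obscure = index_quality(visits, {"d", "e"})  # index of unlinked pages
```

In this sketch two indexes of the same size receive different scores because the walk weights each page by how often it is reached, which is the sense in which such a measure differs from simply counting indexed pages.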
Pages: 1291-1303
Page count: 13