Finding related pages in the World Wide Web

Cited by: 107
Authors
Dean, J [1 ]
Henzinger, MR [1 ]
Affiliation
[1] Compaq Syst Res Ctr, Palo Alto, CA 94301 USA
Keywords
search engines; related pages; searching paradigms
DOI
10.1016/S1389-1286(99)00022-5
Chinese Library Classification number
TP3 [Computing technology, computer technology]
Discipline classification code
0812
Abstract
When using traditional search engines, users have to formulate queries to describe their information need. This paper discusses a different approach to Web searching where the input to the search process is not a set of query terms, but instead is the URL of a page, and the output is a set of related Web pages. A related Web page is one that addresses the same topic as the original page. For example, www.washingtonpost.com is a page related to www.nytimes.com, since both are online newspapers. We describe two algorithms to identify related Web pages. These algorithms use only the connectivity information in the Web (i.e., the links between pages) and not the content of pages or usage information. We have implemented both algorithms and measured their runtime performance. To evaluate the effectiveness of our algorithms, we performed a user study comparing our algorithms with Netscape's 'What's Related' service (http://home.netscape.com/escapes/related/). Our study showed that the precision at 10 of our two algorithms is 73% and 51% better, respectively, than that of Netscape, despite the fact that Netscape uses both content and usage pattern information in addition to connectivity information. © 1999 Published by Elsevier Science B.V. All rights reserved.
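The abstract does not spell out either algorithm, so the following Python sketch is only an illustration of the general idea, not the authors' method. It assumes a simple co-citation heuristic (pages that share many linking "parent" pages with the input URL are treated as related) and shows how precision at k could be computed for a ranked result list. The function names, the toy link list, and the relevance judgments are all hypothetical.

```python
from collections import Counter


def related_by_cocitation(links, query_url, top_n=10):
    """Rank pages by the number of parents they share with query_url.

    links is an iterable of (source_url, target_url) pairs. This is a
    hypothetical, connectivity-only heuristic used for illustration.
    """
    parents = {}   # target -> set of pages that link to it
    children = {}  # source -> set of pages it links to
    for src, dst in links:
        parents.setdefault(dst, set()).add(src)
        children.setdefault(src, set()).add(dst)

    counts = Counter()
    for parent in parents.get(query_url, ()):
        for sibling in children[parent]:
            if sibling != query_url:
                counts[sibling] += 1

    # Highest shared-parent count first; ties broken by URL for determinism.
    ranked = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
    return [url for url, _ in ranked[:top_n]]


def precision_at_k(ranked, relevant, k=10):
    """Fraction of the top k results judged relevant (always divides by k)."""
    return sum(1 for url in ranked[:k] if url in relevant) / k


# Toy link graph (made-up data, not from the paper).
toy_links = [
    ("hub1", "www.nytimes.com"),
    ("hub1", "www.washingtonpost.com"),
    ("hub1", "www.latimes.com"),
    ("hub2", "www.nytimes.com"),
    ("hub2", "www.washingtonpost.com"),
    ("hub3", "www.nytimes.com"),
    ("hub3", "www.weather.example"),
]

results = related_by_cocitation(toy_links, "www.nytimes.com")
judged_relevant = {"www.washingtonpost.com", "www.latimes.com"}
p_at_3 = precision_at_k(results, judged_relevant, k=3)
```

On this toy graph, www.washingtonpost.com ranks first because it shares two parent pages with www.nytimes.com, and two of the three returned pages are in the hypothetical relevant set.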
Pages: 1467-1479
Page count: 13