Topic-sensitive PageRank: A context-sensitive ranking algorithm for Web search

被引:547
作者
Haveliwala, TH [1 ]
机构
[1] Stanford Univ, Dept Comp Sci, Stanford, CA 94305 USA
关键词
Web search; web graph; link analysis; PageRank; search in context; personalized search; ranking algorithm;
D O I
10.1109/TKDE.2003.1208999
中图分类号
TP18 [人工智能理论];
学科分类号
081104 ; 0812 ; 0835 ; 1405 ;
摘要
The original PageRank algorithm for improving the ranking of search-query results computes a single vector, using the link structure of the Web, to capture the relative "importance" of Web pages, independent of any particular search query. To yield more accurate search results, we propose computing a set of PageRank vectors, biased using a set of representative topics, to capture more accurately the notion of importance with respect to a particular topic. For ordinary keyword search queries, we compute the topic-sensitive PageRank scores for pages satisfying the query using the topic of the query keywords. For searches done in context (e.g., when the search query is performed by highlighting words in a Web page), we compute the topic-sensitive PageRank scores using the topic of the context in which the query appeared. By using linear combinations of these (precomputed) biased PageRank vectors to generate context-specific importance scores for pages at query time, we show that we can generate more accurate rankings than with a single, generic PageRank vector. We describe techniques for efficiently implementing a large-scale search system based on the topic-sensitive PageRank scheme.
引用
收藏
页码:784 / 796
页数:13
相关论文
共 30 条
[1]  
[Anonymous], 2003, P 12 INT WORLD WID W
[2]  
Bell T. C., 1999, Managing Gigabytes, V2nd ed
[3]  
BHARAT K, 2001, P 10 INT WORLD WID W
[4]  
BHARAT K, 1998, P ACM SIGIR
[5]  
BRIN S, 1998, B IEEE CS TECHNICAL
[6]  
Brin S., 1998, 7 INT WORLD WIDE WEB
[7]  
Chakrabarti S., 1998, P 7 INT WORLD WID WE
[8]  
CHAKRABARTI S, 2002, P 11 INT WORLD WID W
[9]  
Chakrabarti Soumen., 2002, Mining the Web: Discovering Knowledge from Hypertext Data
[10]  
DILIGENTI M, 2002, P 11 INT WORLD WID W