A Web Aggregation Approach for Distributed Randomized PageRank Algorithms

Cited by: 49
Authors
Ishii, Hideaki [1]
Tempo, Roberto [2]
Bai, Er-Wei [3, 4]
Affiliations
[1] Tokyo Inst Technol, Dept Computat Intelligence & Syst Sci, Yokohama, Kanagawa 2268502, Japan
[2] Politecn Torino, CNR IEIIT, I-10129 Turin, Italy
[3] Univ Iowa, Dept Elect & Comp Engn, Iowa City, IA 52242 USA
[4] Queens Univ Belfast, Sch Elect Elect Engn & Comp Sci, Belfast BT7 1NN, Antrim, North Ireland
Keywords
Aggregation; distributed computation; multi-agent consensus; PageRank algorithm; randomization; search engines; stochastic matrices; MARKOV-CHAINS; RANDOM NETWORKS; MONTE-CARLO; COMPUTATION; CONSENSUS; EIGENVECTOR; SUFFICIENT; ITERATION; SYSTEMS; MATRIX;
DOI
10.1109/TAC.2012.2190161
CLC number
TP [Automation Technology, Computer Technology];
Discipline classification code
0812;
Abstract
The PageRank algorithm employed at Google assigns a measure of importance to each web page for rankings in search results. In our recent papers, we have proposed a distributed randomized approach for this algorithm, where web pages are treated as agents computing their own PageRank by communicating with linked pages. This paper builds upon this approach to reduce the computation and communication loads for the algorithms. In particular, we develop a method to systematically aggregate the web pages into groups by exploiting the sparsity inherent in the web. For each group, an aggregated PageRank value is computed, which can then be distributed among the group members. We provide a distributed update scheme for the aggregated PageRank along with an analysis on its convergence properties. The method is especially motivated by results on singular perturbation techniques for large-scale Markov chains and multi-agent consensus. A numerical example is provided to illustrate the level of reduction in computation while keeping the error in rankings small.
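As a rough illustration of the two ideas named in the abstract (a PageRank value per page, and one aggregated value per group of pages), the following minimal sketch runs a centralized power iteration on a toy graph and then sums the values within each group. It is not the distributed randomized scheme or the aggregation procedure of the paper. The toy link structure, the grouping, the function names `pagerank` and `aggregate`, and the parameter value `m = 0.15` are all assumptions made for illustration.

```python
def pagerank(links, m=0.15, iters=100):
    """Centralized power iteration for PageRank on a small link graph.

    links[i] lists the pages that page i links to. Every page is assumed
    to have at least one outgoing link (no dangling nodes).
    m is the assumed teleportation parameter.
    """
    n = len(links)
    x = [1.0 / n] * n  # start from the uniform distribution
    for _ in range(iters):
        new = [m / n] * n  # uniform teleportation share for every page
        for i, outs in enumerate(links):
            share = (1.0 - m) * x[i] / len(outs)
            for j in outs:
                new[j] += share  # page i passes its value to linked pages
        x = new
    return x


def aggregate(values, groups):
    """Sum the per-page values within each group (hypothetical grouping)."""
    return [sum(values[i] for i in g) for g in groups]


# Toy example: five pages, two assumed groups.
links = [[1, 2], [0, 2], [0], [4], [3, 0]]
groups = [[0, 1, 2], [3, 4]]

ranks = pagerank(links)
group_ranks = aggregate(ranks, groups)
print(ranks)
print(group_ranks)
```

Because the per-page values sum to one, the group values also sum to one, so each group value can be read as the total importance of that group's members.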
Pages: 2703-2717
Page count: 15
Related papers
52 in total
[1]   AGGREGATION OF THE POLICY ITERATION METHOD FOR NEARLY COMPLETELY DECOMPOSABLE MARKOV-CHAINS [J].
ALDHAHERI, RW ;
KHALIL, HK .
IEEE TRANSACTIONS ON AUTOMATIC CONTROL, 1991, 36 (02) :178-187
[2]  
Andersen R, 2007, LECT NOTES COMPUT SC, V4863, P150
[3]  
[Anonymous], 2007, IEEE CONTROL SYS AUG, V27
[4]  
[Anonymous], 2006, Google's PageRank and beyond: the science of search engine rankings
[5]  
[Anonymous], 2009, MARKOV CHAINS STOCHA
[6]  
[Anonymous], 2007, P IEEE
[7]   Monte Carlo methods in pagerank computation: When one iteration is sufficient [J].
Avrachenkov, K. ;
Litvak, N. ;
Nemirovsky, D. ;
Osipova, N. .
SIAM JOURNAL ON NUMERICAL ANALYSIS, 2007, 45 (02) :890-904
[8]  
Bertsekas D.P., 1989, PARALLEL DISTRIBUTED
[9]   Area aggregation and time-scale modeling for sparse nonlinear networks [J].
Biyik, Emrah ;
Arcak, Murat .
SYSTEMS & CONTROL LETTERS, 2008, 57 (02) :142-149
[10]   Randomized gossip algorithms [J].
Boyd, Stephen ;
Ghosh, Arpita ;
Prabhakar, Balaji ;
Shah, Devavrat .
IEEE TRANSACTIONS ON INFORMATION THEORY, 2006, 52 (06) :2508-2530