Can Google's PageRank be used to find the most important academic Web pages?

Cited by: 18
Author
Thelwall, M [1]
Affiliation
[1] Wolverhampton Univ, Sch Comp & Informat Technol, Wolverhampton, England
Keywords
Internet; universities; information retrieval; algorithms; effectiveness
DOI
10.1108/00220410310463491
Chinese Library Classification (CLC) number
TP [Automation technology, computer technology]
Subject classification code
0812
Abstract
Google's PageRank is an influential algorithm that uses a model of Web use that is dominated by its link structure in order to rank pages by their estimated value to the Web community. This paper reports on the outcome of applying the algorithm to the Web sites of three national university systems in order to test whether it is capable of identifying the most important Web pages. The results are also compared with simple inlink counts. It was discovered that the highest inlinked pages do not always have the highest PageRank, indicating that the two metrics are genuinely different, even for the top pages. More significantly, however, internal links dominated external links for the high ranks in either method and superficial reasons accounted for high scores in both cases. It is concluded that PageRank is not useful for identifying the top pages in a site and that it must be combined with powerful text matching techniques in order to get the quality of information retrieval results provided by Google.
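The contrast the abstract draws between PageRank and simple inlink counts can be illustrated with a minimal sketch. This is not the paper's code or data: the six-page graph, the function names, the damping factor of 0.85, and the fixed iteration count are all assumptions chosen for illustration, and the PageRank shown is the commonly described power-iteration form with dangling-page rank spread evenly. Every link target is assumed to appear as a key in the graph.

```python
def inlink_counts(links):
    """Count the distinct pages linking to each page (self-links ignored)."""
    counts = {page: 0 for page in links}
    for source, targets in links.items():
        for target in set(targets):
            if target != source:
                counts[target] += 1
    return counts


def pagerank(links, damping=0.85, iterations=100):
    """Power-iteration PageRank over a dict mapping page -> list of link targets."""
    pages = list(links)
    n = len(pages)
    rank = {page: 1.0 / n for page in pages}
    for _ in range(iterations):
        # Rank held by pages with no outlinks is shared evenly among all pages.
        dangling = sum(rank[p] for p in pages if not links[p])
        new_rank = {p: (1.0 - damping) / n + damping * dangling / n for p in pages}
        for source, targets in links.items():
            if targets:
                share = damping * rank[source] / len(targets)
                for target in targets:
                    new_rank[target] += share
        rank = new_rank
    return rank


# Hypothetical toy graph: A is linked from three otherwise unlinked pages,
# while B and F link to each other and B is also linked from A.
links = {
    "A": ["B"],
    "B": ["F"],
    "C": ["A"],
    "D": ["A"],
    "E": ["A"],
    "F": ["B"],
}

inlinks = inlink_counts(links)
ranks = pagerank(links)

print("Top page by inlinks: ", max(inlinks, key=inlinks.get))
print("Top page by PageRank:", max(ranks, key=ranks.get))
```

In this toy graph the page with the most inlinks is not the page with the highest PageRank, because the mutual links between B and F keep rank circulating between them. That is one simple way the two metrics can diverge; it says nothing about why they diverged in the university sites studied.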
Pages: 205-217
Page count: 13