The stochastic approach for link-structure analysis (SALSA) and the TKC effect

被引:177
作者
Lempel, R [1 ]
Moran, S [1 ]
机构
[1] Technion Israel Inst Technol, Dept Comp Sci, IL-32000 Haifa, Israel
来源
COMPUTER NETWORKS-THE INTERNATIONAL JOURNAL OF COMPUTER AND TELECOMMUNICATIONS NETWORKING | 2000年 / 33卷 / 1-6期
关键词
information retrieval; link structure analysis; hubs and authorities; random walks; SALSA;
D O I
10.1016/S1389-1286(00)00034-7
中图分类号
TP3 [计算技术、计算机技术];
学科分类号
0812 ;
摘要
Today, when searching for information on the World Wide Web, one usually performs a query through a term-based search engine. These engines return, as the query's result, a list of Web sites whose contents match the query. For broad topic queries, such searches often result in a huge set of retrieved documents, many of which are irrelevant to the user. However, much information is contained in the link-structure of the World Wide Web. Information such as which pages are linked to others can be used to augment search algorithms. In this context, Jon Kleinberg introduced the notion of two distinct types of Web sites: hubs and authorities. Kleinberg argued that hubs and authorities exhibit a mutually reinforcing relationship: a good hub will point to many authorities, and a good authority will be pointed at by many hubs. In light of this, he devised an algorithm aimed at finding authoritative sites. We present SALSA, a new stochastic approach for link structure analysis, which examines random walks on graphs derived from the link structure. We show that both SALSA and Kleinberg's mutual reinforcement approach employ the same meta-algorithm. We then prove that SALSA is equivalent to a weighted in-degree analysis of the link-structure of World Wide Web subgraphs, making it computationally more efficient than the mutual reinforcement approach. We compare the results of applying SALSA to the results derived through kleinberg's approach. These comparisons reveal a topological phenomenon called the TKC effect (Tightly Knit Community) which, in certain cases, prevents the mutual reinforcement approach from identifying meaningful authorities. (C) 2000 Published by Elsevier Science B.V. All rights reserved.
引用
收藏
页码:387 / 401
页数:15
相关论文
共 24 条
[1]  
[Anonymous], P ACM SIGCHI C HUM F
[2]  
[Anonymous], P 7 INT WWW C
[3]  
[Anonymous], P 9 ACM C HYP HYP
[4]  
[Anonymous], 1998, P ACM SIAM S DISCR A
[5]   AN ANALYSIS OF SOME GRAPH THEORETICAL CLUSTER TECHNIQUES [J].
AUGUSTSO.JG ;
MINKER, J .
JOURNAL OF THE ACM, 1970, 17 (04) :571-&
[6]  
CARRIERE J, 1997, P 6 INT WWW C
[7]  
CHAKRABARTI S, 1998, P 7 INT WWW C
[8]  
CHAKRABARTI S, 1998, ACM SIGIR WORKSH HYP
[9]  
CHAKRABARTI S, 1999, SCI AM JUN
[10]  
CHAKRABARTI S, 1999, IEEE COMP AUG