Higher-order web link analysis using multilinear algebra

被引:134
作者
Kolda, TG [1 ]
Bader, BW [1 ]
Kenny, JP [1 ]
机构
[1] Sandia Natl Labs, Livermore, CA 94550 USA
来源
FIFTH IEEE INTERNATIONAL CONFERENCE ON DATA MINING, PROCEEDINGS | 2005年
关键词
D O I
10.1109/ICDM.2005.77
中图分类号
TP18 [人工智能理论];
学科分类号
081104 ; 0812 ; 0835 ; 1405 ;
摘要
Linear algebra is a powerful and proven tool in web search. Techniques, such as the PageRank algorithm of Brin and Page and the HITS algorithm of Kleinberg, score web pages based on the principal eigenvector (or singular vector) of a particular non-negative matrix that captures the hyperlink structure of the web graph. We propose and test a new methodology that uses multilinear algebra to elicit more information from a higher-order representation of the hyperlink graph. We start by labeling the edges in our graph with the anchor text of the hyperlinks so that the associated linear algebra representation is a sparse, three-way tensor. The first two dimensions of the tensor represent the web pages while the third dimension adds the anchor text. We then use the rank-I factors of a multilinear PARAFAC tensor decomposition, which are akin to singular vectors of the SVD, to automatically identify topics in the collection along with the associated authoritative web pages.
引用
收藏
页码:242 / 249
页数:8
相关论文
共 40 条
  • [1] Acar E, 2005, LECT NOTES COMPUT SC, V3495, P256
  • [2] [Anonymous], 2000, ICML
  • [3] BADER BW, 2004, UNPUB ACM T MATH SOF
  • [4] Using linear algebra for intelligent information retrieval
    Berry, MW
    Dumais, ST
    OBrien, GW
    [J]. SIAM REVIEW, 1995, 37 (04) : 573 - 595
  • [5] Bharat K., 1998, Proceedings of the 21st Annual International ACM SIGIR Conference on Research and Development in Information Retrieval, P104, DOI 10.1145/290941.290972
  • [6] A measure of similarity between graph vertices: Applications to synonym extraction and web searching
    Blondel, VD
    Gajardo, A
    Heymans, M
    Senellart, P
    Van Dooren, P
    [J]. SIAM REVIEW, 2004, 46 (04) : 647 - 666
  • [7] The anatomy of a large-scale hypertextual Web search engine
    Brin, S
    Page, L
    [J]. COMPUTER NETWORKS AND ISDN SYSTEMS, 1998, 30 (1-7): : 107 - 117
  • [8] ANALYSIS OF INDIVIDUAL DIFFERENCES IN MULTIDIMENSIONAL SCALING VIA AN N-WAY GENERALIZATION OF ECKART-YOUNG DECOMPOSITION
    CARROLL, JD
    CHANG, JJ
    [J]. PSYCHOMETRIKA, 1970, 35 (03) : 283 - &
  • [9] Mining the web's link structure
    Chakrabarti, S
    Dom, BE
    Kumar, SR
    Raghavan, P
    Rajagopalan, S
    Tomkins, A
    Gibson, D
    Kleinberg, J
    [J]. COMPUTER, 1999, 32 (08) : 60 - +
  • [10] COHN D, 2001, ADV NEURAL INFORM PR, V13, P460