Document Categorization and Query Generation on the World Wide Web Using WebACE

Cited by: 7
Authors
Daniel Boley
Maria Gini
Robert Gross
Eui-Hong (Sam) Han
Kyle Hastings
George Karypis
Vipin Kumar
Bamshad Mobasher
Jerome Moore
Affiliation
[1] University of Minnesota, Department of Computer Science and Engineering
Source
Artificial Intelligence Review | 1999 / Vol. 13
Keywords
clustering; divisive partitioning; graph partitioning; principal component analysis; web documents
DOI
Not available
Abstract
We present WebACE, an agent for exploring and categorizing documents on the World Wide Web based on a user profile. The heart of the agent is an unsupervised categorization of a set of documents, combined with a process for generating new queries that is used to search for new related documents and for filtering the resulting documents to extract the ones most closely related to the starting set. The document categories are not given a priori. We present the overall architecture and describe two novel algorithms which provide significant improvement over Hierarchical Agglomeration Clustering and AutoClass algorithms and form the basis for the query generation and search component of the agent. We report on the results of our experiments comparing these new algorithms with more traditional clustering algorithms and we show that our algorithms are fast and scalable.
Pages: 365-391
Page count: 26