A Survey of Web Clustering Engines

Cited by: 184
Authors
Carpineto, Claudio [1]
Osinski, Stanislaw
Romano, Giovanni [1]
Weiss, Dawid [2]
Affiliations
[1] Fdn Ugo Bordoni, I-00142 Rome, Italy
[2] Poznan Univ Tech, PL-60965 Poznan, Poland
Keywords
Algorithms; Experimentation; Measurement; Performance; Information retrieval; meta search engines; text clustering; search results clustering; user interfaces; INFORMATION-RETRIEVAL; SEARCH
DOI
10.1145/1541880.1541884
Chinese Library Classification (CLC) number
TP301 [Theory, Methods]
Discipline classification code
081202
Abstract
Web clustering engines organize search results by topic, thus offering a complementary view to the flat-ranked list returned by conventional search engines. In this survey, we discuss the issues that must be addressed in the development of a Web clustering engine, including acquisition and preprocessing of search results, their clustering and visualization. Search results clustering, the core of the system, has specific requirements that cannot be addressed by classical clustering algorithms. We emphasize the role played by the quality of the cluster labels as opposed to optimizing only the clustering structure. We highlight the main characteristics of a number of existing Web clustering engines and also discuss how to evaluate their retrieval performance. Some directions for future research are finally presented.
Pages: 38