A concept-driven algorithm for clustering search results

Cited by: 136
Authors
Osinski, S [1]
Weiss, D [1]
Affiliation
[1] Poznan Univ Tech, Lab Intelligent Decis Support Syst, Poznan, Poland
DOI
10.1109/MIS.2005.38
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
The Lingo algorithm combines common phrase discovery and latent semantic indexing (LSI) techniques to separate search results into meaningful groups. It first looks for meaningful phrases to use as cluster labels and then assigns documents to those labels to form groups. The algorithm relies on the vector space model (VSM) and singular value decomposition (SVD). VSM is an information retrieval method that uses linear-algebra operations to compare textual data. Unlike VSM, LSI aims to represent the input collection using concepts found in the documents rather than the literal terms appearing in them.
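The contrast between VSM and LSI drawn in the abstract can be illustrated with a minimal sketch. It is not the Lingo algorithm: phrase discovery and label assignment are left out, and the toy documents, the plain term-frequency weighting, the choice of k = 2, and all function names are assumptions made for illustration only.

```python
import numpy as np

def term_document_matrix(docs):
    """Build a term-document matrix (rows: terms, columns: documents)
    from raw term counts, with each column scaled to unit length."""
    vocab = sorted({w for d in docs for w in d.lower().split()})
    index = {w: i for i, w in enumerate(vocab)}
    A = np.zeros((len(vocab), len(docs)))
    for j, d in enumerate(docs):
        for w in d.lower().split():
            A[index[w], j] += 1.0
    norms = np.linalg.norm(A, axis=0)
    norms[norms == 0] = 1.0  # avoid dividing by zero for empty documents
    return A / norms, vocab

def cosine_similarities(M):
    """Pairwise cosine similarity between the columns of M."""
    norms = np.linalg.norm(M, axis=0)
    norms[norms == 0] = 1.0
    unit = M / norms
    return unit.T @ unit

def concept_coordinates(A, k):
    """Project documents onto the k strongest SVD concepts.
    Returns a k x n matrix: one column of concept coordinates per document."""
    _, s, Vt = np.linalg.svd(A, full_matrices=False)
    return s[:k, None] * Vt[:k, :]

# Hypothetical toy collection: documents 0 and 1 share no literal term,
# but both co-occur with document 2.
docs = [
    "car",
    "automobile",
    "car automobile engine",
    "flower petal",
    "flower garden",
]

A, vocab = term_document_matrix(docs)
vsm = cosine_similarities(A)                          # literal-term comparison
lsi = cosine_similarities(concept_coordinates(A, 2))  # concept-space comparison

print("VSM similarity of docs 0 and 1:", vsm[0, 1])
print("LSI similarity of docs 0 and 1:", lsi[0, 1])
```

In term space the first two documents have nothing in common, while in the reduced concept space they fall on the same concept because both co-occur with the third document. This is the sense in which LSI compares documents by concepts rather than literal terms.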
Pages: 48-54
Page count: 7