Text retrieval using self-organized document maps

Cited by: 14
Authors
Lagus, K [1]
Affiliations
[1] Helsinki Univ Technol, Neural Networks Res Ctr, FIN-02015 Espoo, Finland
Keywords
document maps; information retrieval; LSI; SOM; text mining
DOI
10.1023/A:1013853012954
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Subject classification codes
081104; 0812; 0835; 1405
Abstract
A map of text documents arranged using the Self-Organizing Map (SOM) algorithm (1) is organized in a meaningful manner, so that items with similar content appear at nearby locations on the 2-dimensional map display, and (2) clusters the data, resulting in an approximate model of the data distribution in the high-dimensional document space. This article describes how a document map that is automatically organized for browsing and visualization can also be used to speed up document retrieval. Furthermore, experiments on the well-known CISI collection [3] show significantly improved performance compared to Salton's vector space model, measured by average precision (AP) when retrieving a small, fixed number of best documents. The results of the comparison with Latent Semantic Indexing are inconclusive.
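The idea of using the map's clustering to speed up retrieval can be illustrated with a minimal sketch. It is not the article's implementation: it assumes the map units' model vectors are already trained, uses cosine similarity as the matching criterion, and restricts the search to documents assigned to the units that best match the query. All names (`assign_to_units`, `map_retrieve`, `n_units`, `n_docs`) and the toy vectors are hypothetical.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length term vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def assign_to_units(doc_vectors, unit_vectors):
    """Assign each document index to its best-matching map unit."""
    units = {u: [] for u in range(len(unit_vectors))}
    for d, vec in enumerate(doc_vectors):
        best = max(range(len(unit_vectors)),
                   key=lambda u: cosine(vec, unit_vectors[u]))
        units[best].append(d)
    return units


def map_retrieve(query, doc_vectors, unit_vectors, units, n_units=1, n_docs=2):
    """Rank only the documents stored in the n_units best-matching units."""
    ranked_units = sorted(range(len(unit_vectors)),
                          key=lambda u: cosine(query, unit_vectors[u]),
                          reverse=True)
    candidates = [d for u in ranked_units[:n_units] for d in units[u]]
    candidates.sort(key=lambda d: cosine(query, doc_vectors[d]), reverse=True)
    return candidates[:n_docs]


# Toy data (assumed): three map units and five documents over three terms.
unit_vectors = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]
doc_vectors = [[0.9, 0.1, 0], [0.8, 0.2, 0], [0.1, 0.9, 0],
               [0, 0.2, 0.8], [0, 0.1, 0.9]]
units = assign_to_units(doc_vectors, unit_vectors)
top_docs = map_retrieve([1, 0.1, 0], doc_vectors, unit_vectors, units)
print(top_docs)
```

In this sketch the query is compared against the map units first and then only against the documents in the selected units, so the number of full document comparisons can be much smaller than in an exhaustive vector space search.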
Pages: 21-29
Page count: 9
Related papers
19 in total
[1] [Anonymous], 1997, SPRINGER SERIES INFO
[2] BAEZAYATES RA, 1999, MODERN INFORMATION R
[3] Chen HC, 1998, J AM SOC INFORM SCI, V49, P582, DOI 10.1002/(SICI)1097-4571(1998)49:7<582::AID-ASI2>3.0.CO;2-V
[4] *CISI COLL, 1981, CISI REF COLL INF RE
[5] DEERWESTER S, 1990, J AM SOC INFORM SCI, V41, P391, DOI 10.1002/(SICI)1097-4571(199009)41:6<391::AID-ASI1>3.0.CO;2-9
[6] Hearst M. A., 1999, Modern Information Retrieval, P257
[7] Honkela T., 1996, A32 HELS U TECHN LAB
[8] Kaski, S; Honkela, T; Lagus, K; Kohonen, T. WEBSOM - Self-organizing maps of document collections [J]. NEUROCOMPUTING, 1998, 21 (1-3): 101-117