WEBSOM - Self-organizing maps of document collections

Cited by: 194
Authors
Kaski, S [1 ]
Honkela, T [1 ]
Lagus, K [1 ]
Kohonen, T [1 ]
Affiliations
[1] Aalto Univ, Neural Networks Res Ctr, FIN-02015 Helsinki, Finland
Funding
Academy of Finland;
Keywords
data mining; information retrieval; self-organizing map; SOM; WEBSOM;
DOI
10.1016/S0925-2312(98)00039-3
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Subject classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
With the WEBSOM method, a textual document collection may be organized onto a graphical map display that provides an overview of the collection and facilitates interactive browsing. Interesting documents can be located on the map using a content-directed search. Each document is encoded as a histogram of word categories, which are formed by the self-organizing map (SOM) algorithm based on the similarities in the contexts of the words. The encoded documents are organized on another self-organizing map, a document map, on which nearby locations contain similar documents. Special consideration is given to the computation of very large document maps, which is possible with general-purpose computers if the dimensionality of the word category histograms is first reduced with a random mapping method and if computationally efficient algorithms are used in computing the SOMs. (C) 1998 Elsevier Science B.V. All rights reserved.
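The pipeline described in the abstract (histogram encoding, dimensionality reduction by random mapping, then organization on a document map) can be illustrated with a small sketch. This is not the authors' implementation. The histograms are synthetic toy data, the random mapping is assumed here to be a Gaussian matrix with unit-length columns (one common construction), and the SOM is a plain online version with a Gaussian neighborhood and linearly decaying learning rate. All function names and parameter values are hypothetical.

```python
import math
import random


def random_mapping(dim_in, dim_out, rng):
    """Return dim_in random vectors of length dim_out, each normalized to unit length.

    cols[i] is the low-dimensional image of input axis i (assumed construction).
    """
    cols = []
    for _ in range(dim_in):
        col = [rng.gauss(0.0, 1.0) for _ in range(dim_out)]
        norm = math.sqrt(sum(c * c for c in col))
        cols.append([c / norm for c in col])
    return cols


def project(histogram, cols):
    """Map a histogram to the reduced space as a weighted sum of the random vectors."""
    out = [0.0] * len(cols[0])
    for weight, col in zip(histogram, cols):
        if weight:
            for k, c in enumerate(col):
                out[k] += weight * c
    return out


def sq_dist(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))


def best_matching_unit(vec, grid):
    """Return the (row, col) position of the map unit closest to vec."""
    return min(grid, key=lambda pos: sq_dist(vec, grid[pos]))


def train_som(data, rows, cols_n, epochs, rng):
    """Train a small rectangular SOM; model vectors start as copies of random samples."""
    dim = len(data[0])
    grid = {(r, c): list(rng.choice(data)) for r in range(rows) for c in range(cols_n)}
    total = epochs * len(data)
    step = 0
    for _ in range(epochs):
        for vec in rng.sample(data, len(data)):
            frac = step / total
            alpha = 0.5 * (1.0 - frac)  # learning rate decays linearly
            sigma = max(0.5, (max(rows, cols_n) / 2.0) * (1.0 - frac))  # shrinking radius
            br, bc = best_matching_unit(vec, grid)
            for (r, c), model in grid.items():
                h = math.exp(-((r - br) ** 2 + (c - bc) ** 2) / (2.0 * sigma ** 2))
                for k in range(dim):
                    model[k] += alpha * h * (vec[k] - model[k])
            step += 1
    return grid


# Toy data: 30 synthetic "documents" over 200 word categories, in two topic groups.
rng = random.Random(0)
DIM_IN, DIM_OUT = 200, 20
docs = []
for i in range(30):
    lo = 0 if i < 15 else 100  # first 15 docs use categories 0-99, the rest 100-199
    hist = [0.0] * DIM_IN
    for _ in range(40):
        hist[rng.randrange(lo, lo + 100)] += 1.0
    docs.append(hist)

mapping = random_mapping(DIM_IN, DIM_OUT, rng)
reduced = [project(h, mapping) for h in docs]
som = train_som(reduced, rows=4, cols_n=4, epochs=5, rng=rng)

# Print the map position assigned to two documents from each topic group.
for i in (0, 1, 15, 16):
    print(i, best_matching_unit(reduced[i], som))
```

The projection step is what keeps the document map affordable: the SOM is trained on 20-dimensional vectors rather than 200-dimensional histograms, and distances between documents are only approximately preserved.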
Pages: 101-117
Number of pages: 17