WEBSOM for textual data mining

Cited by: 54
Authors
Lagus, K [1]
Honkela, T [1]
Kaski, S [1]
Kohonen, T [1]
Affiliation
[1] Aalto Univ, Neural Networks Res Ctr, Lab Comp & Informat Sci, FIN-02015 Espoo, Finland
Keywords
data mining; document filtering; exploratory data analysis; information retrieval; self-organizing map; SOM; text document collection; WEBSOM
DOI
10.1023/A:1006586221250
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification code
081104; 0812; 0835; 1405
Abstract
User-friendly, efficient methods are needed to guide users through the masses of textual information available on the Internet and the World Wide Web. We have developed a method and tool called WEBSOM, which uses the self-organizing map (SOM) algorithm to organize large collections of text documents onto visual document maps. The approach to processing text is statistically oriented, computationally feasible, and scalable: over a million text documents have been ordered on a single map. In this article we consider different kinds of information needs and tasks related to organizing, visualizing, searching, categorizing, and filtering textual data, and we discuss and illustrate with examples how document maps can aid in these situations. One example uses a document map as a tool for visualizing and filtering a stream of incoming electronic mail messages.
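To make the idea of a document map concrete, here is a minimal sketch of a self-organizing map trained on document vectors. It is not the WEBSOM pipeline itself: the plain bag-of-words encoding, the 3 x 3 grid, the linear decay schedules, and the function names `bag_of_words`, `best_matching_unit`, and `train_som` are all assumptions made for illustration.

```python
import math
import random
from collections import Counter


def bag_of_words(docs):
    """Turn each document into a length-normalized word-count vector.

    Assumption: whitespace tokenization and raw counts; the abstract
    does not specify the document encoding.
    """
    vocab = sorted({w for d in docs for w in d.lower().split()})
    vectors = []
    for d in docs:
        counts = Counter(d.lower().split())
        vec = [float(counts[w]) for w in vocab]
        norm = math.sqrt(sum(x * x for x in vec)) or 1.0
        vectors.append([x / norm for x in vec])
    return vocab, vectors


def best_matching_unit(grid, vec):
    """Return (row, col) of the map unit whose model vector is closest to vec."""
    best, best_dist = None, float("inf")
    for r, row in enumerate(grid):
        for c, model in enumerate(row):
            dist = sum((m - x) ** 2 for m, x in zip(model, vec))
            if dist < best_dist:
                best, best_dist = (r, c), dist
    return best


def train_som(vectors, rows=3, cols=3, epochs=50, seed=0):
    """Train a small rectangular SOM with a Gaussian neighborhood."""
    rng = random.Random(seed)
    dim = len(vectors[0])
    grid = [[[rng.random() for _ in range(dim)] for _ in range(cols)]
            for _ in range(rows)]
    for epoch in range(epochs):
        frac = epoch / epochs
        alpha = 0.5 * (1.0 - frac)  # learning rate decays linearly (assumed schedule)
        sigma = max(0.5, (max(rows, cols) / 2.0) * (1.0 - frac))  # shrinking radius
        for vec in rng.sample(vectors, len(vectors)):  # shuffled pass over the data
            br, bc = best_matching_unit(grid, vec)
            for r in range(rows):
                for c in range(cols):
                    d2 = (r - br) ** 2 + (c - bc) ** 2
                    h = math.exp(-d2 / (2.0 * sigma * sigma))
                    model = grid[r][c]
                    # Pull the unit's model vector toward the document vector.
                    for i in range(dim):
                        model[i] += alpha * h * (vec[i] - model[i])
    return grid


# Hypothetical toy collection, only to show the call sequence.
docs = [
    "self organizing map for text documents",
    "document map for text retrieval",
    "filtering incoming electronic mail messages",
    "mail messages filtering stream",
]
vocab, vectors = bag_of_words(docs)
grid = train_som(vectors)
for doc, vec in zip(docs, vectors):
    print(best_matching_unit(grid, vec), doc)
```

Each document lands on the grid unit whose model vector is nearest to it, so the printed coordinates can be read as positions on a small document map. A system at the scale the abstract describes would need a far more compact document encoding and a faster winner search than this brute-force loop.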
Pages: 345-364
Page count: 20