Information clustering based on fuzzy multisets

Cited by: 76
Authors
Miyamoto, S [1]
Affiliations
[1] Univ Tsukuba, Inst Engn Mech & Syst, Tsukuba, Ibaraki 3058573, Japan
Funding
Japan Society for the Promotion of Science
Keywords
information retrieval; data clustering; fuzzy multiset; cluster center; algorithm
DOI
10.1016/S0306-4573(02)00047-X
Chinese Library Classification number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
A fuzzy multiset model for information clustering is proposed with application to information retrieval on the World Wide Web. Noting that a search engine retrieves multiple occurrences of the same subjects with possibly different degrees of relevance, we observe that fuzzy multisets provide an appropriate model of information retrieval on the WWW. Information clustering, which means both term clustering and document clustering, is considered. Three methods are proposed: hard c-means, fuzzy c-means, and an agglomerative method using cluster centers. Two distances between fuzzy multisets and algorithms for calculating cluster centers are defined. Theoretical properties of the clustering algorithms are studied. Illustrative examples are given to show how the algorithms work. © 2002 Elsevier Science Ltd. All rights reserved.
Pages: 195-213
Page count: 19