Clustering transactions using large items

Cited by: 74
Authors
Wang, K [1]
Xu, C [1]
Liu, B [1]
Affiliation
[1] Natl Univ Singapore, Sch Comp, Singapore 117548, Singapore
Source
PROCEEDINGS OF THE EIGHTH INTERNATIONAL CONFERENCE ON INFORMATION KNOWLEDGE MANAGEMENT, CIKM'99 | 1999
DOI
10.1145/319950.320054
Chinese Library Classification number
TP [Automation technology, computer technology]
Subject classification code
0812
Abstract
In traditional data clustering, similarity of a cluster of objects is measured by pairwise similarity of objects in that cluster. We argue that such measures are not appropriate for transactions that are sets of items. We propose the notion of large items, i.e., items contained in some minimum fraction of transactions in a cluster, to measure the similarity of a cluster of transactions. The intuition of our clustering criterion is that there should be many large items within a cluster and little overlapping of such items across clusters. We discuss the rationale behind our approach and its implication on providing a better solution to the clustering problem. We present a clustering algorithm based on the new clustering criterion and evaluate its effectiveness.
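The abstract defines a large item as one contained in at least some minimum fraction of the transactions in a cluster. A minimal Python sketch of that definition is given here for illustration only. This record does not reproduce the paper's actual clustering criterion or algorithm, so the function names `large_items` and `overlap`, the `min_support` parameter, the sample data, and the simple overlap count are all assumptions rather than the authors' method.

```python
from collections import Counter


def large_items(cluster, min_support):
    """Items appearing in at least `min_support` (a fraction) of the transactions.

    `cluster` is a list of transactions, each an iterable of items.
    Hypothetical helper: it only mirrors the definition quoted in the abstract.
    """
    if not cluster:
        return set()
    # Count each item once per transaction, even if a transaction repeats it.
    counts = Counter(item for txn in cluster for item in set(txn))
    threshold = min_support * len(cluster)
    return {item for item, n in counts.items() if n >= threshold}


def overlap(clusters, min_support):
    """Assumed overlap measure: how many large-item slots are duplicated across clusters.

    Computed as (sum of per-cluster large-item counts) minus (size of their union).
    This is an illustrative stand-in, not the cost function from the paper.
    """
    per_cluster = [large_items(c, min_support) for c in clusters]
    return sum(len(s) for s in per_cluster) - len(set().union(*per_cluster))


# Made-up example data: two clusters of transactions.
clusters = [
    [{"a", "b", "c"}, {"a", "b"}, {"a", "d"}],
    [{"x", "y"}, {"x", "y", "a"}],
]

print(sorted(large_items(clusters[0], 0.6)))  # ['a', 'b']
print(sorted(large_items(clusters[1], 0.6)))  # ['x', 'y']
print(overlap(clusters, 0.6))                 # 0
```

Under the intuition stated in the abstract, a grouping like this one would be favourable: each cluster has several large items and none of them is shared with the other cluster.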
Pages: 483-490
Page count: 8