GHIC: A hierarchical pattern-based clustering algorithm for grouping Web transactions

Cited by: 19
Authors
Yang, YH
Padmanabhan, B
Affiliations
[1] Univ Calif Davis, Grad Sch Management, Davis, CA 95616 USA
[2] Univ Penn, Wharton Sch, OPIM Dept, Philadelphia, PA 19104 USA
Keywords
data mining; clustering; classification; association rules; Web mining
DOI
10.1109/TKDE.2005.145
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Grouping customer transactions into segments may help understand customers better. The marketing literature has concentrated on identifying important segmentation variables (e.g., customer loyalty) and on using cluster analysis and mixture models for segmentation. The data mining literature has provided various clustering algorithms for segmentation without focusing specifically on clustering customer transactions. Building on the notion that observable customer transactions are generated by latent behavioral traits, in this paper we investigate a pattern-based clustering approach to grouping customer transactions. We define an objective function that we maximize in order to achieve a good clustering of customer transactions, and we present an algorithm, GHIC, that groups customer transactions such that the itemsets generated from each cluster, while similar to each other, differ from those generated from other clusters. We present experimental results on user-centric Web usage data that demonstrate that GHIC generates a highly effective clustering of transactions.
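To make the idea of pattern-based splitting concrete, the following is a minimal Python sketch, not the GHIC algorithm itself. The abstract does not state the objective function, so the score used here (the summed absolute difference in itemset support between two candidate clusters) is an assumption, as are the function names `frequent_itemsets`, `support`, `difference`, and `best_split` and the toy `sessions` data. It shows a single greedy split of the kind a hierarchical method could apply repeatedly.

```python
from itertools import combinations


def frequent_itemsets(transactions, min_support=0.3, max_size=2):
    """Return {itemset: support} for itemsets of up to max_size items
    whose support (fraction of transactions containing them) is at
    least min_support."""
    n = len(transactions)
    if n == 0:
        return {}
    counts = {}
    for t in transactions:
        items = sorted(set(t))
        for k in range(1, max_size + 1):
            for combo in combinations(items, k):
                counts[combo] = counts.get(combo, 0) + 1
    return {s: c / n for s, c in counts.items() if c / n >= min_support}


def support(itemset, transactions):
    """Fraction of transactions that contain every item of itemset."""
    if not transactions:
        return 0.0
    wanted = set(itemset)
    return sum(1 for t in transactions if wanted <= set(t)) / len(transactions)


def difference(left, right, patterns):
    """Assumed score (not taken from the paper): total absolute gap in
    pattern support between the two candidate clusters."""
    return sum(abs(support(p, left) - support(p, right)) for p in patterns)


def best_split(transactions, min_support=0.3):
    """Try splitting on presence/absence of each frequent pattern and
    keep the split with the highest difference score."""
    patterns = list(frequent_itemsets(transactions, min_support))
    best = (0.0, None, transactions, [])
    for p in patterns:
        wanted = set(p)
        left = [t for t in transactions if wanted <= set(t)]
        right = [t for t in transactions if not wanted <= set(t)]
        if not left or not right:
            continue  # a split must produce two non-empty clusters
        score = difference(left, right, patterns)
        if score > best[0]:
            best = (score, p, left, right)
    return best


# Hypothetical Web sessions: each is the set of page categories visited.
sessions = [
    ["news", "sports"],
    ["news", "sports"],
    ["shop", "cart"],
    ["shop", "cart"],
]
score, pattern, left, right = best_split(sessions)
```

Applying `best_split` recursively to `left` and `right` until no split improves the score would give a hierarchy of clusters; the stopping rule and the score are design choices of this sketch rather than details confirmed by the record above.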
Pages: 1300-1304
Page count: 5