Mining navigation patterns using a sequence alignment method

Cited by: 29
Authors
Hay, B [1]
Wets, G [1]
Vanhoof, K [1]
Affiliations
[1] Limburgs Univ Ctr, Fac Appl Econ Sci, B-3590 Diepenbeek, Belgium
Keywords
clustering; sequence analysis; web usage mining
DOI
10.1007/s10115-003-0109-6
Chinese Library Classification (CLC)
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
In this article, a new method is illustrated for mining navigation patterns on a web site. Instead of clustering patterns by means of a Euclidean distance measure, in this approach users are partitioned into clusters using a non-Euclidean distance measure called the Sequence Alignment Method (SAM). This method partitions navigation patterns according to the order in which web pages are requested and handles the problem of clustering sequences of different lengths. The performance of the algorithm is compared with the results of a method based on Euclidean distance measures. SAM is validated by means of user-traffic data of two different web sites. Empirical results show that SAM identifies sequences with similar behavioral patterns not only with regard to content, but also considering the order of pages visited in a sequence.
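The contrast the abstract draws between order-aware and Euclidean distances can be illustrated with a minimal sketch. It is not the authors' SAM scoring, which the abstract does not specify: a plain unit-cost edit distance stands in for it, and the function names, page names, and sample sessions are all hypothetical.

```python
from collections import Counter
from math import sqrt


def alignment_distance(a, b):
    """Unit-cost edit distance between two page sequences.

    Assumption: this is a stand-in for SAM, not the paper's scoring scheme.
    """
    prev = list(range(len(b) + 1))  # distance from empty prefix of a
    for i, page_a in enumerate(a, start=1):
        curr = [i]
        for j, page_b in enumerate(b, start=1):
            cost = 0 if page_a == page_b else 1
            curr.append(min(prev[j] + 1,          # delete page_a
                            curr[j - 1] + 1,      # insert page_b
                            prev[j - 1] + cost))  # match or substitute
        prev = curr
    return prev[-1]


def euclidean_distance(a, b):
    """Euclidean distance between page-count vectors; ignores visit order."""
    counts_a, counts_b = Counter(a), Counter(b)
    pages = set(counts_a) | set(counts_b)
    return sqrt(sum((counts_a[p] - counts_b[p]) ** 2 for p in pages))


# Hypothetical sessions: same pages in reverse order, and a shorter session.
s1 = ["home", "products", "cart", "checkout"]
s2 = ["checkout", "cart", "products", "home"]
s3 = ["home", "products", "cart"]

print(euclidean_distance(s1, s2), alignment_distance(s1, s2))
print(euclidean_distance(s1, s3), alignment_distance(s1, s3))
```

In this sketch the two sessions that visit the same pages in reverse order are indistinguishable under the count-based Euclidean distance but far apart under the alignment distance, and sessions of different lengths are compared without padding.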
Pages: 150-163
Number of pages: 14
Related papers
31 in total
[1]  
[Anonymous], P 8 INT FUZZ SYST AS
[2]  
[Anonymous], 1980, CLUSTER ANAL
[3]  
[Anonymous], P 5 INT C EXT DAT TE
[4]  
Borges J, 2000, LECT NOTES COMPUT SC, V1836, P92
[5]  
Buchner AG, 1998, ACM SIGMOD RECORD, V27, P54, DOI DOI 10.1145/306101.306124
[6]  
BUCHNER AG, 1999, ACM WORKSH WEB US AN
[7]  
CADEZ I, 2000, VISUALIZATION NAVIGA
[8]  
CAPRI, 2001, GENERIC SEQUENCE DIS
[9]   CHARACTERIZING BROWSING STRATEGIES IN THE WORLD-WIDE-WEB [J].
CATLEDGE, LD ;
PITKOW, JE .
COMPUTER NETWORKS AND ISDN SYSTEMS, 1995, 27 (06) :1065-1073
[10]  
Cooley R., 1999, Knowledge and Information Systems, V1, P5