Web log data warehousing and mining for intelligent web caching

Cited by: 36
Authors
Bonchi, F
Giannotti, F
Gozzi, C
Manco, G
Nanni, M
Pedreschi, D
Renso, C
Ruggieri, S
Affiliations
[1] Univ Pisa, Dipartimento Informat, I-56125 Pisa, Italy
Keywords
web caching; log data warehousing; data mining; frequent patterns; association rules; decision trees
DOI
10.1016/S0169-023X(01)00038-6
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
We introduce intelligent web caching algorithms that employ predictive models of web requests. The general idea is to extend the least recently used (LRU) policy of web and proxy servers by making it sensitive to web access models extracted from web log data using data mining techniques. Two approaches are studied in particular: frequent patterns and decision trees. In experiments, the new algorithms show a substantial improvement in hit rate over existing LRU-based caching techniques. We designed and developed a prototype system that supports data warehousing of web log data, extraction of data mining models, and simulation of the web caching algorithms. (C) 2001 Elsevier Science B.V. All rights reserved.
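The abstract does not spell out how the LRU policy is made sensitive to a mined model, so the following is only a minimal illustrative sketch, not the authors' algorithm. It assumes a hypothetical predictor `predict_reuse(url)` that stands in for a model mined from web logs (for example a decision tree) and returns whether a cached page is expected to be requested again. The class name `ModelAwareLRU` and its methods are likewise invented for illustration.

```python
from collections import OrderedDict


class ModelAwareLRU:
    """LRU cache whose eviction choice consults a predictor.

    Hypothetical sketch: `predict_reuse` is an assumed stand-in for a
    model extracted from web log data; it is not taken from the paper.
    """

    def __init__(self, capacity, predict_reuse):
        self.capacity = capacity
        self.predict_reuse = predict_reuse  # callable: url -> bool
        self.entries = OrderedDict()        # least recently used first
        self.hits = 0
        self.requests = 0

    def request(self, url):
        """Serve one request; return True on a cache hit."""
        self.requests += 1
        if url in self.entries:
            self.hits += 1
            self.entries.move_to_end(url)   # mark as most recently used
            return True
        if len(self.entries) >= self.capacity:
            self._evict()
        self.entries[url] = True
        return False

    def _evict(self):
        # Scan from least to most recently used and pick the first entry
        # the model does not expect to be requested again.
        victim = next(
            (u for u in self.entries if not self.predict_reuse(u)), None
        )
        if victim is None:
            # Every entry is predicted to be reused: fall back to plain LRU.
            self.entries.popitem(last=False)
        else:
            del self.entries[victim]

    def hit_rate(self):
        return self.hits / self.requests if self.requests else 0.0


if __name__ == "__main__":
    trace = ["a", "b", "c", "a"]
    aware = ModelAwareLRU(2, lambda url: url == "a")
    plain = ModelAwareLRU(2, lambda url: False)  # behaves as plain LRU
    for url in trace:
        aware.request(url)
        plain.request(url)
    print(aware.hit_rate(), plain.hit_rate())
```

With a predictor that always returns `False`, the class reduces to ordinary LRU, which gives a baseline for comparing hit rates on the same request trace in a simulation.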
Pages: 165-189
Number of pages: 25