A hierarchical and multiscale approach to analyze E-business workloads

Cited by: 30
Authors
Menascé, DA
Almeida, VAF
Riedi, R
Ribeiro, F
Fonseca, R
Meira, W
Affiliations
[1] George Mason Univ, Dept Comp Sci, Fairfax, VA 22030 USA
[2] Univ Fed Minas Gerais, Dept Comp Sci, BR-31270 Belo Horizonte, MG, Brazil
[3] Rice Univ, Dept Elect & Comp Engn, Houston, TX 77251 USA
Funding
National Science Foundation (USA);
Keywords
E-business; WWW; workload characterization; performance modeling; heavy-tailed distribution;
DOI
10.1016/S0166-5316(02)00228-6
Chinese Library Classification
TP3 [Computing technology, computer technology];
Subject classification code
0812;
Abstract
Understanding the characteristics of electronic business (E-business) workloads is a crucial step toward improving the quality of service offered to customers in E-business environments. This paper proposes a hierarchical and multiple time scale approach to characterize E-business workloads. The three levels of the hierarchy are user, application, and protocol, and are associated with customer sessions, functions requested, and HTTP requests, respectively. Within each layer, an analysis across several time scales is conducted. The approach is illustrated by presenting a detailed characterization of two actual E-business sites: an online bookstore and an electronic auction site. Our analysis of the workloads showed that the session length, measured in number of requests to execute E-business functions, is heavy-tailed, especially for sites subject to requests generated by robots. An overwhelming majority of the sessions consist of only a handful of requests, which seems to suggest that most customers are human (as opposed to robots). A significant fraction of the functions requested by customers were found to be product selection functions as opposed to product ordering. An analysis of the popularity of search terms revealed that it follows a Zipf distribution. However, Zipf's law as applied to E-business is time scale dependent due to the shift in popularity of search terms. We also found that requests to execute frequent E-business functions exhibit a pattern similar to the HTTP request arrival process. Finally, we demonstrated that there is a strong correlation in the arrival process at the HTTP request level. These correlations are particularly strong at intermediate time scales of a few minutes. © 2002 Elsevier Science B.V. All rights reserved.
Pages: 33-57
Page count: 25