The Grid2003 Production Grid: Principles and practice

Cited by: 19
Authors
Foster, I [1]
Gieraltowski, J [1]
Gose, S [1]
Maltsev, N [1]
May, E [1]
Rodriguez, A [1]
Sulakhe, D [1]
Vaniachine, A [1]
Shank, J [1]
Youssef, S [1]
Adams, D [1]
Baker, R [1]
Deng, W [1]
Smith, J [1]
Yu, D [1]
Legrand, I [1]
Singh, S [1]
Steenberg, C [1]
Xia, Y [1]
Afaq, A [1]
Berman, E [1]
Annis, J [1]
Bauerdick, LAT [1]
Ernst, M [1]
Fisk, I [1]
Giacchetti, L [1]
Graham, G [1]
Heavey, A [1]
Kaiser, J [1]
Kuropatkin, N [1]
Pordes, R [1]
Sekhri, V [1]
Weigand, J [1]
Wu, Y [1]
Baker, K [1]
Sorrillo, L [1]
Huth, J [1]
Allen, M [1]
Grundhoefer, L [1]
Hicks, J [1]
Luehring, F [1]
Peck, S [1]
Quick, R [1]
Simms, S [1]
Fekete, G [1]
vandenBerg, J [1]
Cho, K [1]
Kwon, K [1]
Son, D [1]
Park, H [1]
Affiliations
[1] Argonne Natl Lab, Argonne, IL 60439 USA
Source
13TH IEEE INTERNATIONAL SYMPOSIUM ON HIGH PERFORMANCE DISTRIBUTED COMPUTING, PROCEEDINGS | 2004
DOI
10.1109/HPDC.2004.1323544
Chinese Library Classification number
TP301 [Theory and methods];
Discipline classification code
081202;
Abstract
The Grid2003 Project has deployed a multi-virtual organization, application-driven grid laboratory ("Grid3") that has sustained for several months the production-level services required by physics experiments of the Large Hadron Collider at CERN (ATLAS and CMS), the Sloan Digital Sky Survey project, the gravitational wave search experiment LIGO, the BTeV experiment at Fermilab, as well as applications in molecular structure analysis and genome analysis, and computer science research projects in such areas as job and data scheduling. The deployed infrastructure has been operating since November 2003 with 27 sites, a peak of 2800 processors, workloads from 10 different applications exceeding 1300 simultaneous jobs, and data transfers among sites of greater than 2 TB/day. We describe the principles that have guided the development of this unique infrastructure and the practical experiences that have resulted from its creation and use. We discuss application requirements for grid services deployment and configuration, monitoring infrastructure, application performance, metrics, and operational experiences. We also summarize lessons learned.
Pages: 236-245
Number of pages: 10