Data management and transfer in high-performance computational grid environments

Cited by: 312
Authors
Allcock, B
Bester, J
Bresnahan, J
Chervenak, AL
Foster, I
Kesselman, C
Meder, S
Nefedova, V
Quesnel, D
Tuecke, S
Affiliations
[1] Argonne Natl Lab, Div Math & Comp Sci, Argonne, IL 60439 USA
[2] Univ So Calif, Inst Informat Sci, Marina Del Rey, CA 90292 USA
[3] Univ Chicago, Dept Comp Sci, Computat Inst, Chicago, IL 60637 USA
Funding
U.S. National Science Foundation
Keywords
Globus; data grid; GridFTP; replica management
DOI
10.1016/S0167-8191(02)00094-7
Chinese Library Classification number
TP301 [Theory, methods]
Discipline classification code
081202
Abstract
An emerging class of data-intensive applications involves the geographically dispersed extraction of complex scientific information from very large collections of measured or computed data. Such applications arise, for example, in experimental physics, where the data in question is generated by accelerators, and in simulation science, where the data is generated by supercomputers. So-called Data Grids provide essential infrastructure for such applications, much as the Internet provides essential services for applications such as e-mail and the Web. We describe here two services that we believe are fundamental to any Data Grid: reliable, high-speed transport and replica management. Our high-speed transport service, GridFTP, extends the popular FTP protocol with new features required for Data Grid applications, such as striping and partial file access. Our replica management service integrates a replica catalog with GridFTP transfers to provide for the creation, registration, location, and management of dataset replicas. We present the design of both services and also preliminary performance results. Our implementations exploit security and other services provided by the Globus Toolkit. © 2002 Published by Elsevier Science B.V.
Pages: 749-771
Page count: 23
Related papers
16 in total
  • [1] BARU C, 1998, P CASCON 98 C
  • [2] BARU C, 1998, P CASCON 98 C NOV 30
  • [3] Beynon M., 2000, P 8 GODD C MASS STOR, P119
  • [4] The data grid: Towards an architecture for the distributed management and analysis of large scientific datasets
    Chervenak, A
    Foster, I
    Kesselman, C
    Salisbury, C
    Tuecke, S
    [J]. JOURNAL OF NETWORK AND COMPUTER APPLICATIONS, 2000, 23 (03) : 187 - 200
  • [5] The anatomy of the grid: Enabling scalable virtual organizations
    Foster, I
    Kesselman, C
    Tuecke, S
    [J]. INTERNATIONAL JOURNAL OF HIGH PERFORMANCE COMPUTING APPLICATIONS, 2001, 15 (03) : 200 - 222
  • [6] Globus: A metacomputing infrastructure toolkit
    Foster, I
    Kesselman, C
    [J]. INTERNATIONAL JOURNAL OF SUPERCOMPUTER APPLICATIONS AND HIGH PERFORMANCE COMPUTING, 1997, 11 (02) : 115 - 128
  • [7] Foster I, 1999, GRID BLUEPRINT NEW C
  • [8] GUNTER D, 2000, P IEEE MASC 2000 C M
  • [9] HOLTMAN K, 2000, P 4 ANN GLOB RETR PI
  • [10] HOSCHEK W, 2000, 2000 INT WORKSH GRID