Complete and fragmented replica selection and retrieval in Data Grids

Cited by: 33
Authors
Chang, Ruay-Shiung [1]
Chen, Po-Hung [1]
Affiliation
[1] Natl Dong Hwa Univ, Dept Comp Sci & Informat Engn, Hualien 974, Taiwan
Source
FUTURE GENERATION COMPUTER SYSTEMS-THE INTERNATIONAL JOURNAL OF ESCIENCE | 2007, Vol. 23, No. 4
Keywords
Data Grid; replication; dynamic self-adaptive replica location; fragmented replication
DOI
10.1016/j.future.2006.09.006
Chinese Library Classification
TP301 [Theory, Methods]
Discipline classification code
081202
Abstract
Data Grids support data-intensive applications in wide-area Grid systems. They use local storage systems as distributed data stores by replicating datasets. Replication is widely used in distributed environments because it can improve data availability, data access performance, and load balancing. Usually a complete file is copied to many Grid sites for local access. However, a site may need only parts of a replica, so to use storage efficiently a Grid site should be able to store only those parts. In this paper, we propose a concept called fragmented replicas: when replicating, a site can store only the partial contents it needs locally, which can greatly reduce the storage space wasted on unused data. We also propose a block mapping procedure that determines the distribution of blocks across the available servers for later replica retrieval. With this procedure, a server can offer its available partial replica contents to other members of the Grid system, and a client can retrieve a fragmented replica directly by using the block mapping procedure. After block mapping, co-allocation schemes can be used to retrieve data sets from the available servers. Simulation shows that the co-allocation schemes also improve download performance in a fragmented replication system. (c) 2006 Elsevier B.V. All rights reserved.
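The abstract does not spell out the block mapping procedure or the co-allocation schemes, so the Python below is only an illustrative sketch of the general idea, not the authors' algorithm. It assumes that a file is split into equal-sized, numbered blocks, that each server advertises the set of block numbers it holds, and that each server has a known bandwidth in blocks per unit time. The names map_blocks and assign_blocks, and the greedy earliest-finish rule, are hypothetical choices made for this example.

```python
from typing import Dict, List, Set


def map_blocks(num_blocks: int, holdings: Dict[str, Set[int]]) -> Dict[int, List[str]]:
    """Build a block -> servers map from each server's advertised fragment.

    `holdings` maps a server name to the set of block numbers it stores
    (an assumed representation of a fragmented replica).
    """
    mapping: Dict[int, List[str]] = {b: [] for b in range(num_blocks)}
    for server in sorted(holdings):
        for block in holdings[server]:
            if 0 <= block < num_blocks:
                mapping[block].append(server)
    return mapping


def assign_blocks(mapping: Dict[int, List[str]], bandwidth: Dict[str, float]) -> Dict[int, str]:
    """Greedy co-allocation sketch: pick one source server per block.

    Blocks with the fewest holders are placed first; each block goes to the
    holder that would finish it earliest given the work already assigned.
    `bandwidth` (blocks per unit time) must have an entry for every holder.
    """
    busy_until = {server: 0.0 for server in bandwidth}
    plan: Dict[int, str] = {}
    for block in sorted(mapping, key=lambda b: (len(mapping[b]), b)):
        holders = mapping[block]
        if not holders:
            raise ValueError(f"block {block} is not held by any server")
        best = min(holders, key=lambda s: (busy_until[s] + 1.0 / bandwidth[s], s))
        busy_until[best] += 1.0 / bandwidth[best]
        plan[block] = best
    return plan


if __name__ == "__main__":
    # Server A holds the complete file; B and C hold fragments only.
    holdings = {"A": {0, 1, 2, 3}, "B": {2, 3}, "C": {0}}
    bandwidth = {"A": 1.0, "B": 2.0, "C": 1.0}
    block_map = map_blocks(4, holdings)
    print(assign_blocks(block_map, bandwidth))
```

In this toy run, block 1 can only come from the complete replica on A, while the other blocks are spread over the fragment holders B and C so that the three servers download in parallel. A real scheme would also have to handle unequal block sizes and bandwidth that changes during the transfer.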
Pages: 536-546
Page count: 11