A VP-accordant checkpointing protocol preventing useless checkpoints

Cited by: 20
Authors
Baldoni, R [1]
Quaglia, F [1]
Ciciani, B [1]
Affiliations
[1] Univ Roma La Sapienza, Dipartimento Informat & Sistemist, I-00198 Rome, Italy
Source
SEVENTEENTH IEEE SYMPOSIUM ON RELIABLE DISTRIBUTED SYSTEMS, PROCEEDINGS | 1998
DOI
10.1109/RELDIS.1998.740475
Chinese Library Classification (CLC) number
TP3 [Computing technology, computer technology]
Discipline classification code
0812
Abstract
A useless checkpoint corresponds to the occurrence of a checkpoint and communication pattern called a Z-cycle. A recent result shows that ensuring a computation without Z-cycles is a particular application of a property, namely Virtual Precedence (VP), defined on an interval-based abstraction of a computation. In this paper we first propose a taxonomy of communication-induced checkpointing protocols based on the way they ensure the VP property. Then we derive a sufficient condition ensuring no Z-cycles in a distributed computation. This condition defines a checkpoint and communication pattern, namely a suspect Z-cycle, such that if no suspect Z-cycle exists in a distributed computation then no Z-cycle exists. Finally, we present a communication-induced checkpointing protocol that avoids useless checkpoints by preventing the formation of suspect Z-cycles on the fly, and we discuss its performance with respect to other protocols.
Pages: 61-67
Page count: 7
Related papers
18 in total
[1]   An index-based checkpointing algorithm for autonomous distributed systems [J].
Baldoni, R ;
Quaglia, F ;
Fornara, P .
SIXTEENTH SYMPOSIUM ON RELIABLE DISTRIBUTED SYSTEMS, PROCEEDINGS, 1997, :27-34
[2]  
BALDONI R, 1998, 1173 IRISA
[3]  
BALDONI R, 1997, P IEEE INT S FAULT T, P68
[4]  
BRIATICO D, 1984, P IEEE INT S REL DIS, P307
[5]   DISTRIBUTED SNAPSHOTS - DETERMINING GLOBAL STATES OF DISTRIBUTED SYSTEMS [J].
CHANDY, KM ;
LAMPORT, L .
ACM TRANSACTIONS ON COMPUTER SYSTEMS, 1985, 3 (01) :63-75
[6]  
Elnozahy E., 1996, CMUCS96181 SCH COMP
[7]  
Fidge C., 1991, IEEE COMPUTER AUG, P28
[8]   Preventing useless checkpoints in distributed computations [J].
Helary, JM ;
Mostefaoui, A ;
Netzer, RHB ;
Raynal, M .
SIXTEENTH SYMPOSIUM ON RELIABLE DISTRIBUTED SYSTEMS, PROCEEDINGS, 1997, :183-190
[9]  
HELARY JM, 1998, TECHNIQUE SCI INFORM, V17
[10]  
HELARY JM, 1997, P 11 WORKSH DISTR AL