A higher order estimate of the optimum checkpoint interval for restart dumps

Cited by: 259
Author
Daly, JT [1]
Affiliation
[1] Los Alamos Natl Lab, Los Alamos, NM 87545 USA
Source
FUTURE GENERATION COMPUTER SYSTEMS-THE INTERNATIONAL JOURNAL OF GRID COMPUTING THEORY METHODS AND APPLICATIONS | 2006 / Vol. 22 / Issue 03
Keywords
optimal checkpointing; Poisson failures; perturbation series; Lambert function
DOI
10.1016/j.future.2004.11.016
Chinese Library Classification (CLC) number
TP301 [Theory, Methods];
Discipline classification code
081202 [Computer Software and Theory];
Abstract
This paper examines methods of approximating the optimum checkpoint restart strategy for minimizing application run time on a system exhibiting Poisson single component failures. Two different models will be developed and compared. We will begin with a simplified cost function that yields a first-order model. Then we will derive a more complete cost function and demonstrate a perturbation solution that provides accurate high order approximations to the optimum checkpoint interval. © 2004 Elsevier B.V. All rights reserved.
Pages: 303-312
Page count: 10
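
The abstract contrasts a first-order model with a higher-order perturbation solution but does not state either formula. As a rough illustration only, the sketch below assumes the closed forms commonly quoted for Young's 1974 first-order estimate and for the higher-order estimate attributed to this paper. The function names, the symbol `delta` (time to write one checkpoint), and `mtbf` (mean time between failures, in the same units) are hypothetical choices made for this example and do not come from the record.

```python
import math


def young_interval(delta: float, mtbf: float) -> float:
    """First-order estimate (assumed form): tau = sqrt(2 * delta * M)."""
    return math.sqrt(2.0 * delta * mtbf)


def daly_interval(delta: float, mtbf: float) -> float:
    """Higher-order estimate (assumed form, as commonly quoted):

    tau = sqrt(2*delta*M) * (1 + (1/3)*sqrt(x) + (1/9)*x) - delta,
    with x = delta / (2*M), valid for delta < 2*M; otherwise tau = M.
    """
    if delta >= 2.0 * mtbf:
        # Checkpoints are too expensive relative to the failure rate;
        # the quoted formula falls back to tau = M in this regime.
        return mtbf
    x = delta / (2.0 * mtbf)
    series = 1.0 + math.sqrt(x) / 3.0 + x / 9.0
    return math.sqrt(2.0 * delta * mtbf) * series - delta


if __name__ == "__main__":
    # Hypothetical inputs: 2-minute checkpoint write, 100-minute MTBF.
    print(young_interval(2.0, 100.0))
    print(daly_interval(2.0, 100.0))
```

With these assumed forms the two estimates stay close when `delta` is small relative to `mtbf` and diverge as the checkpoint cost grows, which is the regime where a higher-order correction would matter.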