OPTIMAL STRATEGIES FOR SCHEDULING CHECKPOINTS AND PREVENTIVE MAINTENANCE

Cited by: 25
Authors
COFFMAN, EG
GILBERT, EN
Affiliations
[1] AT&T Bell Laboratories, Murray Hill, New Jersey 07974-2070
[2] AT&T Bell Laboratories, Murray Hill, New Jersey 07974-2070
Keywords
Optimization; Preventive maintenance; Scheduling
DOI
10.1109/24.52636
Chinese Library Classification number
TP3 [Computing technology, computer technology]
Discipline classification code
0812
Abstract
At checkpoints during the operation of a computer, the state of the system is saved. Whenever the machine fails, it is repaired and then reset to the state saved at the latest checkpoint. In this paper, save times are known constants and repair times are random variables; failures are the epochs of a given renewal process. In scheduling the checkpoints, the cost of saves must be traded off against the cost of work lost when the computer fails. We show how to schedule checkpoints to minimize the mean total time to finish a given job. We obtain similar optimization results for the tails of the distribution of the finishing time and for certain variants of the basic model. © 1990, IEEE.
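
The trade-off between save cost and lost work can be illustrated with a minimal sketch. It is not the paper's method: it assumes the special case of exponentially distributed times between failures (a Poisson process, one instance of a renewal process), equally spaced checkpoints, a save at the end of every segment including the last, and no failures during repair. Under those assumptions, a segment needing `a` units of uninterrupted time has mean completion time `(1/failure_rate + mean_repair) * (exp(failure_rate * a) - 1)`. All function and parameter names are hypothetical.

```python
import math


def expected_finish_time(job_length, n_segments, save_time, failure_rate, mean_repair):
    """Mean time to finish a job split into n equal segments, each followed by a save.

    Assumes exponential inter-failure times with the given rate (an assumption
    of this sketch, not of the paper) and a restart from the latest checkpoint
    after each repair.
    """
    # Uninterrupted time needed per segment: useful work plus one save.
    segment = job_length / n_segments + save_time
    # Mean time to complete one segment under Poisson failures.
    per_segment = (1.0 / failure_rate + mean_repair) * math.expm1(failure_rate * segment)
    return n_segments * per_segment


def best_segment_count(job_length, save_time, failure_rate, mean_repair, max_segments=1000):
    """Scan segment counts 1..max_segments and return the one with the smallest mean time."""
    return min(
        range(1, max_segments + 1),
        key=lambda n: expected_finish_time(
            job_length, n, save_time, failure_rate, mean_repair
        ),
    )


if __name__ == "__main__":
    n = best_segment_count(job_length=100.0, save_time=1.0, failure_rate=0.01, mean_repair=5.0)
    print(n, expected_finish_time(100.0, n, 1.0, 0.01, 5.0))
```

Too few segments lose a lot of work at each failure; too many spend most of the time saving. The scan finds the balance point for this simplified model only. General renewal processes and the tail-of-distribution criteria mentioned in the abstract are outside its scope.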
Pages: 9-18
Page count: 10