SYSTEM RELIABILITY-ANALYSIS OF AN N-VERSION PROGRAMMING APPLICATION

Cited by: 22
Authors
DUGAN, JB [1 ]
LYU, MR [1 ]
Affiliation
[1] BELL COMMUN RES INC,MORRISTOWN,NJ 07960
Keywords
N-VERSION PROGRAMMING (NVP); SOFTWARE FAULT TOLERANCE; FAULT TREE; MARKOV MODEL
DOI
10.1109/24.370232
Chinese Library Classification (CLC) number
TP3 [Computing technology, computer technology]
Discipline classification code
0812
Abstract
This paper presents a quantitative reliability analysis of a system designed to tolerate both hardware and software faults. The system achieves integrated fault tolerance by implementing N-version programming (NVP) on redundant hardware. The system analysis considers unrelated software faults, related software faults, transient hardware faults, permanent hardware faults, and imperfect coverage. The overall model is a Markov model in which the states of the Markov chain represent the long-term evolution of the system structure. For each operational configuration, a fault-tree model captures the effects of software faults and transient hardware faults on the task computation. The software fault model is parameterized using experimental data associated with a recent implementation of an NVP system using the current design paradigm. The hardware model is parameterized by considering typical failure rates associated with hardware faults and coverage parameters. Our results show that it is important to consider both hardware and software faults in the reliability analysis of an NVP system, since these estimates vary with time. Moreover, the function for error detection and recovery is extremely important to fault-tolerant software. A reduction of several orders of magnitude in system unreliability can be observed if this function is provided promptly.
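As a rough illustration of the two-level structure described in the abstract, the following Python sketch combines a small Markov chain over hardware configurations with a per-configuration fault tree. It is not the authors' model: the three-state chain (three hosts, two hosts, failed), the closed-form state probabilities, the gate structure, the way the two levels are combined, and every name and parameter (`lam`, `c`, `q_version`, `q_related`) are assumptions chosen only to make the idea concrete.

```python
import math


def state_probabilities(t, lam, c):
    """Transient state probabilities of an assumed 3-state Markov chain.

    Assumed chain: 3 hosts --(3*lam*c)--> 2 hosts --(2*lam)--> failed,
    plus an uncovered transition 3 hosts --(3*lam*(1-c))--> failed.
    lam is a hypothetical permanent hardware fault rate, c the coverage.
    """
    p3 = math.exp(-3 * lam * t)
    p2 = 3 * c * (math.exp(-2 * lam * t) - math.exp(-3 * lam * t))
    pf = 1.0 - p3 - p2
    return p3, p2, pf


def or_gate(*probs):
    """Fault-tree OR gate over independent basic events."""
    all_ok = 1.0
    for p in probs:
        all_ok *= 1.0 - p
    return 1.0 - all_ok


def at_least_two_of_three(q):
    """2-out-of-3 gate: at least two of three independent events occur."""
    return 3 * q**2 * (1 - q) + q**3


def task_failure_three_versions(q_version, q_related):
    """Assumed tree for the 3-host configuration: majority voting fails if a
    related fault hits all versions or two or more versions fail on their own."""
    return or_gate(q_related, at_least_two_of_three(q_version))


def task_failure_two_versions(q_version, q_related):
    """Assumed tree for the 2-host configuration: the task is lost if a
    related fault occurs or either remaining version fails."""
    return or_gate(q_related, q_version, q_version)


def system_unreliability(t, lam, c, q_version, q_related):
    """Weight each configuration's task-failure probability by the probability
    of being in that configuration, and add the failed-state probability."""
    p3, p2, pf = state_probabilities(t, lam, c)
    q3 = task_failure_three_versions(q_version, q_related)
    q2 = task_failure_two_versions(q_version, q_related)
    return pf + p3 * q3 + p2 * q2


if __name__ == "__main__":
    # Illustrative numbers only; they are not taken from the paper.
    for coverage in (0.9, 0.999):
        u = system_unreliability(1000.0, 1e-4, coverage, 1e-3, 1e-5)
        print(f"coverage={coverage}: unreliability={u:.3e}")
```

Under these assumptions, lowering the coverage `c` moves probability mass from the degraded two-host state directly into the failed state, which is one simple way to see why the abstract stresses imperfect coverage.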
Pages: 513-519
Number of pages: 7