Approximate reliability and availability models for high availability and fault-tolerant systems with repair

Cited by: 12
Authors
Bowles, JB [1]
Dobbins, JG [1]
Affiliation
[1] Univ S Carolina, Columbia, SC 29208 USA
Keywords
approximate models; Markov model; reliability evaluation; availability evaluation; redundant system; repairable system; fault-tolerant system
DOI
10.1002/qre.577
Chinese Library Classification
T [Industrial Technology];
Subject classification code
08 [Engineering];
Abstract
Systems designed for high availability and fault tolerance are often configured as a series combination of redundant subsystems. When a unit of a subsystem fails, the system remains operational while the failed unit is repaired; however, if too many units in a subsystem fail concurrently, the system fails. Under conditions usually met in practical situations, we show that the reliability and availability of such systems can be accurately modeled by representing each redundant subsystem with a constant, 'effective' failure rate equal to the inverse of the subsystem mean-time-to-failure (MTTF). The approximation model is surprisingly accurate, with an error on the order of the square of the ratio of mean-time-to-repair to mean-time-to-failure (MTTR/MTTF), and it has wide applicability for commercial, high-availability and fault-tolerant computer systems. The effective subsystem failure rates can be used to: (1) evaluate the system and subsystem reliability and availability; (2) estimate the system MTTF; and (3) provide a basis for the iterative analysis of large complex systems. Some observations from renewal theory suggest that the approximate models can be used even when the unit failure rates are not constant and when the redundant units are not homogeneous. Copyright © 2004 John Wiley & Sons, Ltd.
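As a rough illustration of the 'effective' failure rate idea, the sketch below works through one textbook case that is not taken from the paper itself: a subsystem of two identical, active-redundant units with constant failure rate `lam`, constant repair rate `mu`, and a single repair crew. Under those assumptions the Markov model gives MTTF = (3·lam + mu) / (2·lam²), and the subsystem is then approximated by a constant rate 1/MTTF. All function names are hypothetical, and the parameter values are invented for the example.

```python
import math


def subsystem_mttf(lam, mu):
    """MTTF of an assumed 1-out-of-2 active-redundant pair with one repair crew.

    Derived from the three-state Markov chain (2 up -> 1 up -> failed)
    with transition rates 2*lam, lam, and repair rate mu.
    """
    return (3.0 * lam + mu) / (2.0 * lam ** 2)


def effective_rate(lam, mu):
    """Constant 'effective' failure rate: the inverse of the subsystem MTTF."""
    return 1.0 / subsystem_mttf(lam, mu)


def approx_reliability(lam, mu, t):
    """Approximate reliability: a single exponential with the effective rate."""
    return math.exp(-effective_rate(lam, mu) * t)


def exact_reliability(lam, mu, t):
    """Exact reliability of the same Markov model, for comparison.

    The decay rates s1 > s2 > 0 solve s^2 - (3*lam + mu)*s + 2*lam^2 = 0.
    """
    b = 3.0 * lam + mu
    c = 2.0 * lam ** 2
    s1 = (b + math.sqrt(b * b - 4.0 * c)) / 2.0
    s2 = c / s1  # product of roots equals c; avoids cancellation error
    return (s1 * math.exp(-s2 * t) - s2 * math.exp(-s1 * t)) / (s1 - s2)


def series_effective_rate(subsystems):
    """Effective rate of a series system: the sum of the subsystem rates."""
    return sum(effective_rate(lam, mu) for lam, mu in subsystems)


if __name__ == "__main__":
    lam, mu = 1e-4, 0.1  # assumed: failures per hour, repairs per hour
    t = 10_000.0         # mission time in hours
    print("subsystem MTTF (h):", subsystem_mttf(lam, mu))
    print("exact R(t):        ", exact_reliability(lam, mu, t))
    print("approximate R(t):  ", approx_reliability(lam, mu, t))
    system_rate = series_effective_rate([(lam, mu), (lam, mu)])
    print("series system MTTF estimate (h):", 1.0 / system_rate)
```

Summing the effective rates for subsystems in series mirrors the usual treatment of constant-rate components, which is what makes the approximation convenient for estimating a system MTTF. How close the single exponential comes to the exact curve depends on how small the repair time is relative to the time to failure.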
Pages: 679-697
Page count: 19