A theory of fault-tolerant routing in wormhole networks

Cited by: 53
Authors
Duato, J
Affiliation
[1] Facultad de Informática, Universidad Politécnica de Valencia, 46071, Valencia
Keywords
adaptive routing; channel redundancy; fault-tolerant routing; interconnection networks; network redundancy; wormhole switching
DOI
10.1109/71.605766
Chinese Library Classification number
TP301 [Theory, methods]
Discipline classification code
081202
Abstract
Fault-tolerant systems aim at providing continuous operation in the presence of faults. Multicomputers rely on an interconnection network between processors to support the message-passing mechanism. Therefore, the reliability of the interconnection network is very important for the reliability of the whole system. This paper analyzes the effective redundancy available in a wormhole network by combining connectivity and deadlock freedom. Redundancy is defined at the channel level. We propose a sufficient condition for channel redundancy, also computing the set of redundant channels. The redundancy level of the network is also defined, proposing a theorem that supplies its value. This theory is developed on top of our necessary and sufficient condition for deadlock-free adaptive routing. The new theory also considers the failure of physical channels when virtual channels are used. Finally, we propose a methodology for the design of fault-tolerant routing algorithms, showing its application to n-dimensional meshes.
Pages: 790-802
Number of pages: 13