Optimal task allocation and hardware redundancy policies in distributed computing systems

Cited by: 35
Authors
Hsieh, CC [1]
Affiliations
[1] Natl Cheng Kung Univ, Dept Ind Management Sci, Tainan 70101, Taiwan
Keywords
distributed computing system; hardware redundancy; genetic algorithms; system reliability
DOI
10.1016/S0377-2217(02)00456-3
Chinese Library Classification (CLC) number
C93 [Management Science]
Subject classification codes
12; 1201; 1202; 120202
Abstract
A distributed computing system (DCS) in general consists of processing nodes, communication channels, and tasks. Achieving a reliable DCS thus comprises three parts: the realization of reliable task processing, reliable communication among processing nodes, and a good task allocation strategy. In this study, we examine the relationship between system cost and system reliability in a cycle-free hardware-redundant DCS where multiple processors are available at each processing node and multiple communication links are available at each communication channel. Intuitively, higher hardware redundancy leads to higher system reliability, which in turn reduces communication cost. Such an endowment of hardware redundancy, however, incurs higher hardware operating cost. A unified model of system cost is therefore developed in this study as a complex function of the task allocation and hardware redundancy policies, and a hybrid genetic algorithm (HGA), which combines a genetic algorithm with a local search procedure, is proposed to seek the optimal task allocation and hardware redundancy policies. The proposed algorithm is tested on randomly generated DCSs and compared with a simple genetic algorithm (SGA). The simulation results show that the HGA gives higher solution quality in less computational time than the SGA. © 2002 Elsevier Science B.V. All rights reserved.
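The hybrid idea summarized in the abstract (a genetic algorithm whose offspring are refined by local search) can be illustrated with a minimal sketch. This is not the paper's model or algorithm: the cost function `system_cost`, the parameters `node_rel`, `unit_cost`, `penalty`, and `max_red`, and the choice of tournament selection, one-point crossover, and first-improvement hill climbing are all assumptions made for illustration. Link redundancy is omitted for brevity.

```python
import random

def system_cost(alloc, redund, comm, node_rel, unit_cost, penalty):
    """Hypothetical toy cost: hardware + inter-node traffic + expected failure penalty."""
    hardware = sum(k * unit_cost for k in redund)
    traffic = sum(comm[i][j]
                  for i in range(len(alloc))
                  for j in range(i + 1, len(alloc))
                  if alloc[i] != alloc[j])
    reliability = 1.0
    for node in set(alloc):
        # A node with k redundant processors fails only if all k fail.
        reliability *= 1.0 - (1.0 - node_rel) ** redund[node]
    return hardware + traffic + penalty * (1.0 - reliability)

def local_search(sol, cost_fn, n_nodes, max_red):
    """Hill climbing over single-gene changes until no change lowers the cost."""
    alloc, redund = list(sol[0]), list(sol[1])
    best = cost_fn(alloc, redund)
    improved = True
    while improved:
        improved = False
        for genes, values in ((alloc, range(n_nodes)),
                              (redund, range(1, max_red + 1))):
            for idx in range(len(genes)):
                keep = genes[idx]
                for v in values:
                    genes[idx] = v
                    c = cost_fn(alloc, redund)
                    if c < best - 1e-12:
                        best, keep, improved = c, v, True
                genes[idx] = keep  # retain the best value found for this gene
    return (alloc, redund), best

def hybrid_ga(n_tasks, n_nodes, comm, node_rel=0.9, unit_cost=1.0, penalty=50.0,
              max_red=3, pop_size=20, generations=30, p_mut=0.1, seed=0):
    rng = random.Random(seed)

    def cost_fn(a, r):
        return system_cost(a, r, comm, node_rel, unit_cost, penalty)

    def random_solution():
        return ([rng.randrange(n_nodes) for _ in range(n_tasks)],
                [rng.randint(1, max_red) for _ in range(n_nodes)])

    def tournament(pop):
        a, b = rng.sample(pop, 2)
        return a if cost_fn(*a) <= cost_fn(*b) else b

    def crossover(p, q):
        # One-point crossover applied separately to each part of the chromosome.
        child = []
        for gp, gq in zip(p, q):
            cut = rng.randint(0, len(gp))
            child.append(gp[:cut] + gq[cut:])
        return tuple(child)

    def mutate(sol):
        alloc = [rng.randrange(n_nodes) if rng.random() < p_mut else g for g in sol[0]]
        redund = [rng.randint(1, max_red) if rng.random() < p_mut else g for g in sol[1]]
        return (alloc, redund)

    pop = [random_solution() for _ in range(pop_size)]
    best, best_cost = None, float("inf")
    for _ in range(generations):
        pop = [mutate(crossover(tournament(pop), tournament(pop)))
               for _ in range(pop_size)]
        # Hybrid step: refine the best offspring of this generation by local search.
        idx = min(range(pop_size), key=lambda i: cost_fn(*pop[i]))
        pop[idx], c = local_search(pop[idx], cost_fn, n_nodes, max_red)
        if c < best_cost:
            best, best_cost = (list(pop[idx][0]), list(pop[idx][1])), c
    return best, best_cost

if __name__ == "__main__":
    # Hypothetical symmetric communication costs between 4 tasks.
    comm = [[0, 3, 1, 0], [3, 0, 2, 4], [1, 2, 0, 5], [0, 4, 5, 0]]
    print(hybrid_ga(n_tasks=4, n_nodes=3, comm=comm))
```

Dropping the `local_search` call from the loop leaves a plain genetic algorithm, which is one way to set up the kind of HGA versus SGA comparison the abstract describes.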
Pages: 430-447
Page count: 18
Related papers
27 in total
[1]  
[Anonymous], 1975, Adaptation in Natural and Artificial Systems
[2]  
[Anonymous], NEURAL FUZZY SYSTEMS
[3]  
[Anonymous], 1991, Handbook of genetic algorithms
[4]   The distributed program reliability analysis on star topologies [J].
Chang, MS ;
Chen, DJ ;
Lin, MS ;
Ku, KL .
COMPUTERS & OPERATIONS RESEARCH, 2000, 27 (02) :129-142
[5]   Optimal routing for distributed computing systems with data replication [J].
Chang, PY ;
Chen, DJ .
IEEE INTERNATIONAL COMPUTER PERFORMANCE AND DEPENDABILITY SYMPOSIUM - IPDS'96, PROCEEDINGS, 1996, :42-51
[6]   RELIABILITY ISSUES WITH MULTIPROCESSOR DISTRIBUTED DATABASE-SYSTEMS - A CASE-STUDY [J].
CHEN, CM ;
ORTIZ, JD .
IEEE TRANSACTIONS ON RELIABILITY, 1989, 38 (01) :153-158
[7]  
CHIU GM, 1988, P 7 ANN JOINT C IEEE, P1032
[8]   ESTIMATION OF INTERMODULE COMMUNICATION (IMC) AND ITS APPLICATIONS IN DISTRIBUTED-PROCESSING SYSTEMS [J].
CHU, WW ;
LAN, MT ;
HELLERSTEIN, J .
IEEE TRANSACTIONS ON COMPUTERS, 1984, 33 (08) :691-699
[9]  
Elsayed E.A., 1996, RELIABILITY ENG
[10]  
GEN M, 1997, GENETIC ALGORITHMS E