Massively parallel computing using commodity components

Cited by: 52
Authors
Brightwell, R
Fisk, LA
Greenberg, DS
Hudson, T
Levenhagen, M
Maccabe, AB
Riesen, R
Affiliations
[1] Sandia Natl Labs, Scalable Comp Syst, Albuquerque, NM 87185 USA
[2] CCS, IDA, Bowie, MD USA
[3] Univ New Mexico, Dept Comp Sci, Albuquerque, NM 87131 USA
Keywords
massively parallel; workstation cluster; Beowulf; distributed memory; message passing; MPI
DOI
10.1016/S0167-8191(99)00104-0
Chinese Library Classification (CLC) number
TP301 [Theory, Methods]
Discipline classification code
081202
Abstract
The Computational Plant (Cplant) project at Sandia National Laboratories is developing a large-scale, massively parallel computing resource from a cluster of commodity computing and networking components. We are combining the benefits of commodity cluster computing with our expertise in designing, developing, using, and maintaining large-scale, massively parallel processing (MPP) machines. In this paper, we present the design goals of the cluster and an approach to developing a commodity-based computational resource capable of delivering performance comparable to production-level MPP machines. We provide a description of the hardware components of a 96-node Phase I prototype machine and discuss the experiences with the prototype that led to the hardware choices for a 400-node Phase II production machine. We give a detailed description of the management and runtime software components of the cluster and offer computational performance data as well as performance measurements of functions that are critical to the management of large systems. © 2000 Elsevier Science B.V. All rights reserved.
Pages: 243-266
Page count: 24