Scalable parallel FFT for spectral simulations on a Beowulf cluster

被引:37
作者
Dmitruk, P [1 ]
Wang, LP
Matthaeus, WH
Zhang, R
Seckel, D
机构
[1] Univ Delaware, Bartol Res Inst, Sharp Lab 217, Newark, DE 19716 USA
[2] Univ Delaware, Dept Mech Engn, Newark, DE 19716 USA
[3] EXA Corp, Lexington, MA USA
基金
美国国家科学基金会; 美国国家航空航天局;
关键词
scalability; parallel FFT; spectral simulation; Beowulf clusters; message passing;
D O I
10.1016/S0167-8191(01)00120-X
中图分类号
TP301 [理论、方法];
学科分类号
081202 ;
摘要
The implementation and performance of the multidimensional Fast Fourier Transform (FFT) on a distributed memory Beowulf cluster is examined. We focus on the three-dimensional (3D) real transform, an essential computational component of Galerkin and pseudo-spectral codes. The approach studied is a 1D domain decomposition algorithm that relies on communication-intensive transpose operation involving P processors. Communication is based upon the standard portable message passing interface (MPI). We show that 1/P scaling for execution time at fixed problem size N3 (i.e., linear speedup) can be obtained provided that (1) the transpose algorithm is optimized for simultaneous block communication by all processors; and (2) communication is arranged for non-overlapping pairwise communication between processors, thus eliminating blocking when standard fast ethernet interconnects are employed. This method provides the basis for implementation of scalable and efficient spectral method computations of hydrodynamic and magneto-hydrodynamic turbulence on Beowulf clusters assembled from standard commodity components. An example is presented using a 3D passive scalar code. © 2001 Elsevier Science B.V. All rights reserved.
引用
收藏
页码:1921 / 1936
页数:16
相关论文
共 25 条
[1]  
[Anonymous], 1995, Designing and Building Parallel Programs: Concepts and Tools for Parallel Software Engineering
[2]  
[Anonymous], 1986, NUMERICAL RECIPES C
[3]   Parallel computing on clusters of workstations [J].
Atiquzzaman, M ;
Srimani, PK .
PARALLEL COMPUTING, 2000, 26 (2-3) :175-177
[4]  
Averbuch A, 1998, CONCURRENCY-PRACT EX, V10, P583, DOI 10.1002/(SICI)1096-9128(199807)10:8<583::AID-CPE327>3.0.CO
[5]  
2-3
[6]   THE DYNAMICS OF FREELY DECAYING TWO-DIMENSIONAL TURBULENCE [J].
BRACHET, ME ;
MENEGUZZI, M ;
POLITANO, H ;
SULEM, PL .
JOURNAL OF FLUID MECHANICS, 1988, 194 :333-349
[7]   Massively parallel computing using commodity components [J].
Brightwell, R ;
Fisk, LA ;
Greenberg, DS ;
Hudson, T ;
Levenhagen, M ;
Maccabe, AB ;
Riesen, R .
PARALLEL COMPUTING, 2000, 26 (2-3) :243-266
[8]   Implementation of parallel FFT algorithms on distributed memory machines with a minimum overhead of communication [J].
Calvin, C .
PARALLEL COMPUTING, 1996, 22 (09) :1255-1279
[9]  
Canuto C., 2012, Spectral Methods: Fundamentals in Single Domains
[10]   REGULATORY MUTANTS AND TRANSCRIPTIONAL CONTROL OF THE SERRATIA-MARCESCENS EXTRACELLULAR NUCLEASE GENE [J].
CHEN, YC ;
SHIPLEY, GL ;
BALL, TK ;
BENEDIK, MJ .
MOLECULAR MICROBIOLOGY, 1992, 6 (05) :643-651