THE SCALABILITY OF FFT ON PARALLEL COMPUTERS

被引:54
作者
GUPTA, A
KUMAR, V
机构
[1] Department of Computer Science, University of Minnesota, Minneapolis., MN
关键词
COMMUNICATION OVERHEAD; EFFICIENCY; FAST FOURIER TRANSFORM; ISOEFFICIENCY FUNCTION; PARALLEL ARCHITECTURES; PARALLEL PROCESSING; SCALABILITY;
D O I
10.1109/71.238626
中图分类号
TP301 [理论、方法];
学科分类号
081202 ;
摘要
In this paper we present the scalability analysis of parallel Fast Fourier Transform algorithm on mesh and hypercube connected multicomputers using the isoefficiency metric. The isoefficiency function of an algorithm architecture combination is defined as the rate at which the problem size should grow with the number of processors to maintain a fixed efficiency. On the hypercube architecture. a commonly used parallel FFT algorithm can obtain linearly increasing speedup with respect to the number of processors with only a moderate increase in problem size. But there is a limit on the achievable efficiency and this limit is determined by the ratio of CPU speed and communication bandwidth of the hypercube channels. Efficiencies higher than this threshold value can be obtained if the problem size is increased very rapidly. If the hardware supports cut-through routing, then this threshold can also be overcome by using an alternate less scalable parallel formulation. The scalability analysis for the mesh connected multicomputers reveals that FFT cannot make efficient use of large-scale mesh architectures unless the bandwidth of the communication channels is increased as a function of the number of processors. We also show that it is more cost-effective to implement the FFT algorithm on a hypercube rather than a mesh despite the fact that large scale meshes are cheaper to construct than large hypercubes. Although the scope of this paper is limited to the Cooley-Tukey FFT algorithm on a few classes of architectures, the methodology can be used to study the performance of various FFT algorithms on a variety of architectures such as SIMD hypercube and mesh architectures and shared memory architecture.
引用
收藏
页码:922 / 932
页数:11
相关论文
共 46 条
[1]  
AGGARWAL A, 1989, COMMUNICATION COMPLE
[2]  
Aho A. V., 1974, DESIGN ANAL COMPUTER
[3]  
Akl S. G., 1989, DESIGN ANAL PARALLEL
[4]   A PARALLEL FFT ON AN MIMD MACHINE [J].
AVERBUCH, A ;
GABBER, E ;
GORDISSKY, B ;
MEDAN, Y .
PARALLEL COMPUTING, 1990, 15 (1-3) :61-74
[5]   FFTS IN EXTERNAL OR HIERARCHICAL MEMORY [J].
BAILEY, DH .
JOURNAL OF SUPERCOMPUTING, 1990, 4 (01) :23-35
[6]  
Bronson E. C., 1990, IEEE Transactions on Parallel and Distributed Systems, V1, P195, DOI 10.1109/71.80147
[7]  
BRSHADER S, 1989, 4TH P C HYP CONC COM, V1, P387
[9]  
DALLY WJ, 1987, VLSI ARCHITECTURE CO
[10]  
DALLY WJ, 1987, MAR P STANF C ADV RE, P391