EVALUATING COMPILER OPTIMIZATIONS FOR FORTRAN-D

Cited by: 29
Authors
HIRANANDANI, S [1]
KENNEDY, K [1]
TSENG, CW [1]
Affiliations
[1] STANFORD UNIV, COMP SYST LAB, STANFORD, CA 94305 USA
DOI
10.1006/jpdc.1994.1040
Chinese Library Classification number
TP301 [Theory, Methods]
Discipline classification code
081202
Abstract
The Fortran D compiler uses data decomposition specifications to automatically translate Fortran programs for execution on MIMD distributed-memory machines. This paper introduces and classifies a number of advanced optimizations needed to achieve acceptable performance; they are analyzed and empirically evaluated for stencil computations. Communication optimizations reduce communication overhead by decreasing the number of messages and hide communication overhead by overlapping the cost of remaining messages with local computation. Parallelism optimizations exploit parallel and pipelined computations and may need to restructure the computation to increase parallelism. Profitability formulas are derived for each optimization. Empirical results show that exploiting parallelism for pipelined computations, reductions, and scans is vital. Message vectorization, collective communication, and efficient coarse-grain pipelining also significantly affect performance. The scalability of communication and parallelism optimizations is analyzed. The effectiveness of communication optimizations is dictated by the ratio of communication to computation in the program. An optimization strategy is developed based on these analyses. (C) 1994 Academic Press, Inc.
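As a rough illustration of why decreasing the number of messages matters, the sketch below compares sending n boundary elements one per message against sending them in a single vectorized message. It assumes a simple linear cost model (a fixed startup cost per message plus a per-element transfer cost); that model, the function names, and the parameter values are assumptions made for illustration and are not the profitability formulas derived in the paper.

```python
def message_cost(num_messages, total_elements, startup, per_element):
    """Assumed linear model: each message pays a fixed startup cost,
    and each element pays a transfer cost regardless of grouping."""
    return num_messages * startup + total_elements * per_element


def vectorization_saving(n, startup, per_element):
    """Cost saved by sending n elements in one message instead of n messages."""
    unvectorized = message_cost(n, n, startup, per_element)  # one element per message
    vectorized = message_cost(1, n, startup, per_element)    # one combined message
    return unvectorized - vectorized


if __name__ == "__main__":
    # Hypothetical values: 100 elements, startup 50.0 and transfer 1.0 time units.
    print(vectorization_saving(100, 50.0, 1.0))
```

Under this assumed model the saving is (n - 1) times the startup cost, so the benefit grows with the number of messages combined and with the per-message startup cost of the machine.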
Pages: 27-45
Page count: 19