On effective execution of nonuniform DOACROSS loops

Cited by: 21
Authors
Chen, DK [1]
Yew, PC [1]
Affiliation
[1] UNIV MINNESOTA, DEPT COMP SCI, ST PAUL, MN 55455
Funding
U.S. National Science Foundation
Keywords
compiler transformation; data dependence; loop parallelization; parallelism; scheduling; synchronization
DOI
10.1109/71.503771
Chinese Library Classification number
TP301 [Theory, methods]
Subject classification code
081202
Abstract
It is extremely difficult to parallelize DOACROSS loops with nonuniform loop-carried dependences. In this paper, we present a static scheduling scheme with an accompanying synchronization strategy that can execute such DOACROSS loops effectively and efficiently. Our approach uses a parallelization technique called Dependence Uniformization, which finds a small set of uniform dependence vectors to cover all possible nonuniform dependences in a DOACROSS loop. It differs from previous schemes in that we demonstrate a better way to select the uniform dependence vectors. When used with the Static Strip Scheduling scheme, the proposed uniform dependence vector set allows us to enforce dependences with more locality, which considerably reduces the need for explicit synchronization while retaining most of the parallelism. This paper describes the uniform dependence vector selection strategy and the static strip scheduling scheme. A performance analysis and examples are also presented.
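To illustrate the notion of covering mentioned in the abstract, the sketch below brute-forces the dependence distance vectors of a small hypothetical doubly nested loop and tests whether a candidate set of uniform vectors covers them. It assumes that "covers" means every distance vector is a nonnegative integer combination of the candidate vectors. The loop subscripts, the function names `dependence_distances` and `covers`, and the bounded coefficient search are all assumptions made for illustration; this is not the paper's selection strategy or its Static Strip Scheduling scheme.

```python
from itertools import product


def dependence_distances(n, write_idx, read_idx):
    """Brute-force the loop-carried distance vectors between writes and reads.

    The (hypothetical) loop nest is: for i in 1..n, for j in 1..n,
    write A[write_idx(i, j)] and read A[read_idx(i, j)].
    Only write/read pairs are examined; write/write pairs are ignored.
    """
    iters = list(product(range(1, n + 1), repeat=2))
    writers = {}
    for it in iters:
        writers.setdefault(write_idx(*it), []).append(it)

    distances = set()
    for r in iters:
        for w in writers.get(read_idx(*r), []):
            if w == r:
                continue  # same iteration: not loop-carried
            # Tuples compare lexicographically, i.e. in execution order.
            src, dst = (w, r) if w < r else (r, w)
            distances.add((dst[0] - src[0], dst[1] - src[1]))
    return distances


def covers(basis, distances, max_coeff):
    """True if every distance is a nonnegative integer combination of basis.

    Coefficients are searched exhaustively in 0..max_coeff, which is an
    illustrative bound suitable only for tiny examples.
    """
    def reachable(d):
        for coeffs in product(range(max_coeff + 1), repeat=len(basis)):
            combo = tuple(
                sum(c * b[k] for c, b in zip(coeffs, basis)) for k in range(2)
            )
            if combo == d:
                return True
        return False

    return all(reachable(d) for d in distances)


# Hypothetical loop body: A[2*i][2*j] = ... A[i][j] ...
dists = dependence_distances(4, lambda i, j: (2 * i, 2 * j), lambda i, j: (i, j))
print(sorted(dists))
print(covers([(1, 0), (0, 1)], dists, max_coeff=4))
print(covers([(1, 1)], dists, max_coeff=4))
```

In this toy loop the distance between the writing and the reading iteration changes from one iteration to the next, which is what makes the dependences nonuniform. A fixed pair of vectors such as (1, 0) and (0, 1) covers every distance, so enforcing only those two uniform dependences would also preserve the original ones.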
Pages: 463-476
Page count: 14