AN INTEGRATED RUNTIME AND COMPILE-TIME APPROACH FOR PARALLELIZING STRUCTURED AND BLOCK STRUCTURED APPLICATIONS

Cited by: 25
Authors
AGRAWAL, G [1 ]
SUSSMAN, A [1 ]
SALTZ, J [1 ]
Institutions
[1] UNIV MARYLAND,UMIACS,COLLEGE PK,MD 20742
Funding
U.S. National Science Foundation; National Aeronautics and Space Administration;
Keywords
COMPILER SUPPORT; DISTRIBUTED MEMORY PARALLEL MACHINES; HIGH-PERFORMANCE FORTRAN; MULTIBLOCK CODES; MULTIGRID CODES; RUNTIME SUPPORT;
DOI
10.1109/71.395403
Chinese Library Classification (CLC)
TP301 [Theory, Methods];
Discipline Code
081202;
Abstract
In compiling applications for distributed memory machines, runtime analysis is required when the data to be communicated cannot be determined at compile-time. One such class of applications requiring runtime analysis is block structured codes. These codes employ multiple structured meshes, which may be nested (for multigrid codes) and/or irregularly coupled (called multiblock or irregularly coupled regular mesh problems). In this paper, we present runtime and compile-time analysis for compiling such applications on distributed memory parallel machines in an efficient and machine-independent fashion. We have designed and implemented a runtime library which supports the required runtime analysis; the library is currently implemented on several different systems. We have also developed compiler analysis for determining data access patterns at compile-time and inserting calls to the appropriate runtime routines. Our methods can be used by compilers for HPF-like parallel programming languages in compiling codes in which data distribution, loop bounds, and/or strides are unknown at compile-time. To demonstrate the efficacy of our approach, we have implemented our compiler analysis in the Fortran 90D/HPF compiler developed at Syracuse University. We have experimented with a multiblock Navier-Stokes solver template and a multigrid code. Our experimental results show that our primitives have low runtime communication overheads and that the compiler-parallelized codes perform within 20% of codes parallelized by manually inserting calls to the runtime library.
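To make the abstract's key idea concrete, the sketch below illustrates the general schedule-based (inspector/executor) pattern behind such runtime libraries: communication sets for a block-distributed mesh are computed once at runtime, then reused on every iteration. This is a minimal single-process simulation of a 1-D ghost-cell exchange; the function names (`build_schedule`, `exchange`) and the data layout are hypothetical illustrations, not the API of the paper's library.

```python
def build_schedule(n, nblocks, ghost):
    """Inspector phase: partition indices [0, n) into blocks and record,
    for each block, which neighbor elements fill its ghost regions.
    (Hypothetical names; simplified stand-in for a runtime library call.)"""
    bounds = [i * n // nblocks for i in range(nblocks + 1)]
    sched = []
    for b in range(nblocks):
        lo, hi = bounds[b], bounds[b + 1]
        left = list(range(max(lo - ghost, 0), lo))    # indices owned by left neighbor
        right = list(range(hi, min(hi + ghost, n)))   # indices owned by right neighbor
        sched.append((lo, hi, left, right))
    return sched

def exchange(data, sched):
    """Executor phase: gather ghost values for each block using the
    precomputed schedule; returns (owned slice, left ghosts, right ghosts)."""
    return [(data[lo:hi], [data[i] for i in left], [data[i] for i in right])
            for lo, hi, left, right in sched]

data = list(range(10))
sched = build_schedule(len(data), nblocks=2, ghost=1)  # built once at runtime...
blocks = exchange(data, sched)                         # ...reused every time step
```

The separation shown here is what allows a compiler to insert the (potentially expensive) schedule-building call outside a time-step loop while the cheap executor call runs inside it, which is one reason such primitives can keep runtime communication overheads low.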
Pages: 747-754
Page count: 8