A comparison of three programming models for adaptive applications on the Origin2000

Cited by: 22
Authors
Shan, HZ
Singh, JP
Oliker, L
Biswas, R
Affiliations
[1] Princeton Univ, Dept Comp Sci, Princeton, NJ 08544 USA
[2] Univ Calif Berkeley, Lawrence Berkeley Lab, Natl Energy Res Sci Comp Ctr, Berkeley, CA 94720 USA
[3] NASA, Adv Supercomp Div, Ames Res Ctr, Moffett Field, CA 94035 USA
Funding
U.S. National Science Foundation
Keywords
parallel programming; shared address space; message passing; dynamic mesh adaptation; N-body problem
DOI
10.1006/jpdc.2001.1777
Chinese Library Classification
TP301 [Theory, Methods]
Discipline classification code
081202 [Computer Software and Theory]
Abstract
Adaptive applications have computational workloads and communication patterns that change unpredictably at runtime, requiring dynamic load balancing to achieve scalable performance on parallel machines. Efficient parallel implementation of such adaptive applications is therefore a challenging task. In this paper, we compare the performance of, and the programming effort required for, two major classes of adaptive applications under three leading parallel programming models on an SGI Origin2000 system, a machine that supports all three models efficiently. Results indicate that the three models deliver comparable performance; however, the implementations differ significantly beyond merely using explicit messages versus implicit loads/stores, even though the basic parallel algorithms are similar. Compared with the message-passing (using MPI) and SHMEM programming models, the cache-coherent shared address space (CC-SAS) model provides substantial ease of programming at both the conceptual and program orchestration levels, often accompanied by performance gains. However, CC-SAS currently has portability limitations and may suffer from poor spatial locality of physically distributed shared data on large numbers of processors. © 2002 Elsevier Science (USA).
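The contrast the abstract draws between explicit messages and implicit loads/stores can be illustrated with a small sketch. The Python code below is not taken from the paper and uses neither MPI, SHMEM, nor CC-SAS on an Origin2000; it only mimics the two styles with threads, under the assumption that a `queue.Queue` stands in for a message channel and a plain shared list stands in for a shared address space. The function names `shared_address_space_sum` and `message_passing_sum` are hypothetical.

```python
import queue
import threading


def shared_address_space_sum(data, nworkers):
    """Shared-address-space style: workers read `data` and write `partial`
    directly through ordinary loads and stores; no data is copied."""
    partial = [0] * nworkers
    chunk = (len(data) + nworkers - 1) // nworkers  # ceiling division

    def work(rank):
        lo = rank * chunk
        hi = min(lo + chunk, len(data))
        partial[rank] = sum(data[lo:hi])  # each rank writes only its own slot

    threads = [threading.Thread(target=work, args=(r,)) for r in range(nworkers)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return sum(partial)


def message_passing_sum(data, nworkers):
    """Message-passing style: each worker sees only what it explicitly
    receives, and returns its result with an explicit send."""
    inboxes = [queue.Queue() for _ in range(nworkers)]
    results = queue.Queue()
    chunk = (len(data) + nworkers - 1) // nworkers

    def work(rank):
        local = inboxes[rank].get()        # explicit receive of a private copy
        results.put((rank, sum(local)))    # explicit send of the partial result

    threads = [threading.Thread(target=work, args=(r,)) for r in range(nworkers)]
    for t in threads:
        t.start()
    for r in range(nworkers):
        # The "root" must decide what each worker gets and ship a copy of it.
        inboxes[r].put(list(data[r * chunk:(r + 1) * chunk]))
    total = 0
    for _ in range(nworkers):
        _, value = results.get()
        total += value
    for t in threads:
        t.join()
    return total
```

Both functions use the same partitioning, yet the message-passing version has to orchestrate data distribution and result collection by hand. That extra orchestration is a toy-scale analogue of the difference in programming effort the abstract describes; it says nothing about relative performance.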
Pages: 241-266
Page count: 26