IMPLEMENTATION AND PERFORMANCE ISSUES OF A MASSIVELY-PARALLEL ATMOSPHERIC MODEL

Cited by: 18
Authors
HAMMOND, SW
LOFT, RD
DENNIS, JM
SATO, RK
Affiliation
[1] Scientific Computing Division, National Center for Atmospheric Research, Boulder, CO 80307
Funding
National Science Foundation (USA)
Keywords
ATMOSPHERIC GENERAL CIRCULATION MODELING; CLIMATE MODELING; DATA PARALLELISM; SPECTRAL TRANSFORM; SEMI-LAGRANGIAN TRANSPORT;
DOI
10.1016/0167-8191(95)01017-9
Chinese Library Classification (CLC) number
TP301 [Theory, Methods]
Discipline classification code
081202
Abstract
We present implementation and performance issues of a data parallel version of the National Center for Atmospheric Research (NCAR) Community Climate Model (CCM2). We describe automatic conversion tools used to aid in converting a production code written for a traditional vector architecture to data parallel code suitable for the Thinking Machines Corporation CM-5. We also describe the 3-D transposition method used to parallelize the spherical harmonic transforms in CCM2. This method employs dynamic data mapping techniques to improve the data locality and parallel efficiency of these computations. We present performance data for the 3-D transposition method on the CM-5 for machine sizes of up to 512 processors. We conclude that the parallel performance of the 3-D transposition method is adversely affected on the CM-5 by short vector lengths and array padding. We also find that the CM-5 spherical harmonic transforms spend about 70% of their execution time in communication. We detail a transposition-based data parallel implementation of the semi-Lagrangian Transport (SLT) algorithm used in CCM2. We analyze two approaches to parallelizing the SLT, called the departure point and arrival point based methods, and develop a performance model for choosing between them. We present SLT performance data showing that the localized horizontal interpolation in the SLT takes 70% of the time, while the data remapping itself requires only approximately 16%. We discuss the importance of scalable I/O to CCM2 and present the I/O rates measured on the CM-5. We compare the performance of the data parallel version of CCM2 on a 32-processor CM-5 with the optimized vector code running on a single-processor Cray Y-MP, and show that the CM-5 code is 75% faster. We also give the overall performance of CCM2 running at higher resolutions on different numbers of CM-5 processors. We conclude by discussing the significance of these results and their implications for data parallel climate models.
Pages: 1593-1619
Page count: 27