Renumbering unstructured grids to improve the performance of codes on hierarchical memory machines

Cited by: 37
Authors
Burgess, DA [1]
Giles, MB [1]
Affiliations
[1] UNIV OXFORD, COMP LAB, NUMER ANAL GRP, OXFORD OX1 3QD, ENGLAND
Funding
UK Engineering and Physical Sciences Research Council (EPSRC);
DOI
10.1016/S0965-9978(96)00039-7
Chinese Library Classification (CLC) number
TP39 [Computer applications];
Discipline classification code
081203 ; 0835 ;
Abstract
The performance of unstructured grid codes on workstations and distributed memory parallel computers is substantially affected by the efficiency of the memory hierarchy. This efficiency essentially depends on the order of computation and the numbering of the grid. Most grid generators do not take into account the effect of the memory hierarchy when producing grids, so application programmers must renumber grids to improve the performance of their codes. To design a good renumbering scheme, a detailed runtime analysis of the data movement in an application code is needed. Thus, a memory hierarchy simulator has been developed to analyse the effect of existing renumbering schemes such as bandwidth reduction, the Greedy method, colouring, random numbering and the original numbering produced by the grid generator. The renumbering is applied to either vertices, edges, faces or cells, and two algorithms are proposed to consistently renumber the other entities used in the solver. The simulated and actual timings show that bandwidth reduction and Greedy methods give the best performance on IBM RS/6000, SGI Indy, SGI Indigo and SGI Power Challenge machines for three-dimensional Poisson's, Maxwell's and Euler equation solvers. The improvement in performance is over a factor of two for applications with large grids and a high ratio of memory accesses to computation. This factor is even higher for memory hierarchies with small caches. (C) 1997 Elsevier Science Limited.
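As an illustration of the bandwidth-reduction idea mentioned in the abstract, the following is a minimal sketch, not the authors' implementation. It assumes a Cuthill-McKee style breadth-first ordering of vertices, followed by a simple edge reordering that sorts edges by their new vertex numbers. The function names (`cuthill_mckee`, `renumber_edges`, `bandwidth`) are hypothetical, and the edge-sorting rule is an assumption made for illustration rather than one of the two consistent-renumbering algorithms the paper proposes.

```python
from collections import deque


def cuthill_mckee(num_vertices, edges):
    """Return new_index, where new_index[old_vertex] is the new number.

    Vertices are numbered in breadth-first order, visiting lower-degree
    neighbours first (a Cuthill-McKee style ordering).
    """
    adj = [[] for _ in range(num_vertices)]
    for a, b in edges:
        adj[a].append(b)
        adj[b].append(a)

    order = []
    visited = [False] * num_vertices
    # Start each connected component from one of its lowest-degree vertices.
    for start in sorted(range(num_vertices), key=lambda v: len(adj[v])):
        if visited[start]:
            continue
        visited[start] = True
        queue = deque([start])
        while queue:
            v = queue.popleft()
            order.append(v)
            for w in sorted(adj[v], key=lambda u: len(adj[u])):
                if not visited[w]:
                    visited[w] = True
                    queue.append(w)

    new_index = [0] * num_vertices
    for new, old in enumerate(order):
        new_index[old] = new
    return new_index


def renumber_edges(edges, new_index):
    """Relabel edge endpoints, then sort edges by their new vertex numbers.

    Assumption for illustration: sorting keeps edges that touch nearby
    vertices close together in the edge array.
    """
    relabelled = [tuple(sorted((new_index[a], new_index[b]))) for a, b in edges]
    return sorted(relabelled)


def bandwidth(edges):
    """Largest difference between the two vertex numbers of any edge."""
    return max((abs(a - b) for a, b in edges), default=0)


if __name__ == "__main__":
    # A six-vertex path whose vertices were numbered in a scrambled order.
    edges = [(0, 3), (3, 5), (5, 1), (1, 4), (4, 2)]
    new_index = cuthill_mckee(6, edges)
    new_edges = renumber_edges(edges, new_index)
    print("bandwidth before:", bandwidth(edges))
    print("bandwidth after: ", bandwidth(new_edges))
```

A smaller bandwidth means that the two vertices of each edge sit closer together in memory, so an edge-based loop is more likely to find its vertex data already in cache. How much this helps on a given machine depends on cache size and the access pattern of the solver.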
Pages: 189-201
Page count: 13
Related papers
24 in total
[1]  
BACON DF, 1993, UCBCSD93891
[2]   MICROPROCESSORS - FROM DESKTOPS TO SUPERCOMPUTERS [J].
BASKETT, F ;
HENNESSY, JL .
SCIENCE, 1993, 261 (5123) :864-871
[3]  
BELL R, 1991, GG24361101 IBM
[4]  
BELL R, 1994, COMMUNICATION FEB
[5]  
Cuthill E, 1969, P 1969 24 NAT C, P157, DOI 10.1145/800195.805928
[6]  
DAS R, 1992, 9212 ICASE
[7]   THE EFFECT OF ORDERING ON PRECONDITIONED CONJUGATE GRADIENTS [J].
DUFF, IS ;
MEURANT, GA .
BIT, 1989, 29 (04) :635-657
[8]   A SIMPLE AND EFFICIENT AUTOMATIC FEM DOMAIN DECOMPOSER [J].
FARHAT, C .
COMPUTERS & STRUCTURES, 1988, 28 (05) :579-602
[9]  
GEORGE A, 1971, STANCS71208
[10]   ALGORITHM FOR REDUCING BANDWIDTH AND PROFILE OF A SPARSE MATRIX [J].
GIBBS, NE ;
POOLE, WG ;
STOCKMEYER, PK .
SIAM JOURNAL ON NUMERICAL ANALYSIS, 1976, 13 (02) :236-250