An asynchronous parallel supernodal algorithm for sparse Gaussian elimination

被引:163
作者
Demmel, JW [1 ]
Gilbert, JR
Li, XYS
机构
[1] Univ Calif Berkeley, Div Comp Sci, Berkeley, CA 94720 USA
[2] Xerox Corp, Palo Alto Res Ctr, Palo Alto, CA 94304 USA
[3] Univ Calif Berkeley, Lawrence Berkeley Lab, Natl Energy Res Sci Comp, Berkeley, CA 94720 USA
关键词
sparse Gaussian elimination; unsymmetric linear systems; supernodes; parallelism; dynamic scheduling and load balancing;
D O I
10.1137/S0895479897317685
中图分类号
O29 [应用数学];
学科分类号
070104 ;
摘要
Although Gaussian elimination with partial pivoting is a robust algorithm to solve unsymmetric sparse linear systems of equations, it is difficult to implement efficiently on parallel machines because of its dynamic and somewhat unpredictable way of generating work and intermediate results at run time. In this paper, we present an efficient parallel algorithm that overcomes this difficulty. The high performance of our algorithm is achieved through (1) using a graph reduction technique and a supernode-panel computational kernel for high single processor utilization, and (2) scheduling two types of parallel tasks for a high level of concurrency. One such task is factoring the independent panels in the disjoint subtrees of the column elimination tree of A. Another task is updating a panel by previously computed supernodes. A scheduler assigns tasks to free processors dynamically and facilitates the smooth transition between the two types of parallel tasks. No global synchronization is used in the algorithm. The algorithm is well suited for shared memory machines (SMP) with a modest number of processors. We demonstrate 4- to 7-fold speedups on a range of 8 processor SMPs, and more on larger SMPs. One realistic problem arising from a 3-D flow calculation achieves factorization rates of 1.0, 2.5, 0.8, and 0.8 gigaflops on the 12 processor Power Challenge, 8 processor Cray C90, 16 processor Cray J90, and 8 processor AlphaServer 8400.
引用
收藏
页码:915 / 952
页数:38
相关论文
共 34 条
[1]  
AMESTOY P, 1991, THPA912 CERFACS
[2]  
AMESTOY PR, 1994, MUPS PARALLEL PACKAG
[3]   THE INFLUENCE OF RELAXED SUPERNODE PARTITIONS ON THE MULTIFRONTAL METHOD [J].
ASHCRAFT, C ;
GRIMES, R .
ACM TRANSACTIONS ON MATHEMATICAL SOFTWARE, 1989, 15 (04) :291-309
[4]  
BILMES J, 1996, CS96326 U TENN COMP
[5]  
*CRAY RES INC, CRAY C90 SER
[6]  
*CRAY RES INC, CRAY J90 SER
[7]   A NONDETERMINISTIC PARALLEL ALGORITHM FOR GENERAL UNSYMMETRIC SPARSE LU FACTORIZATION [J].
DAVIS, TA ;
YEW, PC .
SIAM JOURNAL ON MATRIX ANALYSIS AND APPLICATIONS, 1990, 11 (03) :383-402
[8]   A supernodal approach to sparse partial pivoting [J].
Demmel, JW ;
Eisenstat, SC ;
Gilbert, JR ;
Li, XYS ;
Liu, JWH .
SIAM JOURNAL ON MATRIX ANALYSIS AND APPLICATIONS, 1999, 20 (03) :720-755
[9]   AN EXTENDED SET OF FORTRAN BASIC LINEAR ALGEBRA SUBPROGRAMS [J].
DONGARRA, JJ ;
DUCROZ, J ;
HAMMARLING, S ;
HANSON, RJ .
ACM TRANSACTIONS ON MATHEMATICAL SOFTWARE, 1988, 14 (01) :1-17
[10]  
DONGARRA JJ, 1990, ACM T MATH SOFTWARE, V16, P1, DOI 10.1145/77626.79170