Recursion leads to automatic variable blocking for dense linear-algebra algorithms

被引:129
作者
Gustavson, FG [1 ]
机构
[1] IBM Corp, Div Res, TJ Watson Res Ctr, Yorktown Heights, NY 10598 USA
关键词
D O I
10.1147/rd.416.0737
中图分类号
TP3 [计算技术、计算机技术];
学科分类号
0812 ;
摘要
We describe some modifications of the LAPACK dense linear-algebra algorithms using recursion. Recursion leads to automatic variable blocking. LAPACK's level-2 versions transform into level-3 codes by using recursion. The new recursive codes are written in FORTRAN 77, which does not support recursion as a language feature. Gaussian elimination with partial pivoting and Cholesky factorization are considered. Very clear algorithms emerge with the use of recursion. The recursive codes do exactly the same computation as the LAPACK codes, and a single recursive code replaces both the level-2 and level-3 versions of the corresponding LAPACK codes. We present an analysis of the recursive algorithm in terms of both FLOP count and storage usage. The matrix operands are more "squarish" using recursion. The total area of the submatrices used in the recursive algorithm is less than the total area used by the LAPACK level-3 right-/left-looking algorithms. We quantify the difference; we also quantify how the FLOPS are computed. Also, we show that the algorithms exhibit high performance on RISC-type processors. In fact, except for small matrices, the recursive version outperforms the level-3 LAPACK versions of DGETRF and DPOTRF on an RS/6000(TM) workstation. For the level-2 versions,the performance gain approaches a factor of 3. We also demonstrate that a change to the LAPACK DLASWP routine can improve the performance of both the recursive version and DGETRF by more than 15 percent.
引用
收藏
页码:737 / 755
页数:19
相关论文
共 10 条
[1]   EXPLOITING FUNCTIONAL PARALLELISM OF POWER2 TO DESIGN HIGH-PERFORMANCE NUMERICAL ALGORITHMS [J].
AGARWAL, RC ;
GUSTAVSON, FG ;
ZUBAIR, M .
IBM JOURNAL OF RESEARCH AND DEVELOPMENT, 1994, 38 (05) :563-576
[2]  
ANDERSON E, 1990, 20 LAPACK
[3]  
Dongarra Jack J., 1991, SOLVING LINEAR SYSTE
[4]   IMPLEMENTING LINEAR ALGEBRA ALGORITHMS FOR DENSE MATRICES ON A VECTOR PIPELINE MACHINE [J].
DONGARRA, JJ ;
GUSTAVSON, FG ;
KARP, A .
SIAM REVIEW, 1984, 26 (01) :91-112
[5]  
DONGARRA JJ, 1997, CS8985 U TENN COMP C
[6]  
DONGARRA JJ, 1979, LINPACK USERS GUIDE
[7]  
Golub GH, 1989, MATRIX COMPUTATIONS
[8]  
*IBM BRANCH OFF, 1994, ENG SCI SUBR LIBR VE, V1
[9]  
Toledo S., 1996, RC20344 IBM TJ WATS
[10]  
TOLEDO S, IN PRESS SIAM J MATR