A HIGH-PERFORMANCE MATRIX-MULTIPLICATION ALGORITHM ON A DISTRIBUTED-MEMORY PARALLEL COMPUTER, USING OVERLAPPED COMMUNICATION

Cited by: 42
Authors
AGARWAL, RC
GUSTAVSON, FG
ZUBAIR, M
DOI
10.1147/rd.386.0673
Chinese Library Classification
TP3 [Computing technology, computer technology]
Subject classification code
0812
Abstract
In this paper, we propose a scheme for matrix-matrix multiplication on a distributed-memory parallel computer. The scheme hides almost all of the communication cost by overlapping it with computation, and it uses the standard, optimized Level-3 BLAS operation on each node. As a result, the overall performance of the scheme is nearly equal to the performance of the optimized Level-3 BLAS operation times the number of nodes in the computer, which is the peak performance obtainable for parallel BLAS. Another feature of our algorithm is that it can attain this peak performance for larger matrices even if the underlying communication network of the computer is slow.
Pages: 673-681
Page count: 9
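
The overlap of communication and computation described in the abstract can be illustrated with a small single-process simulation. This is a minimal sketch under stated assumptions, not the scheme from the paper: the row-block data layout, the ring order in which blocks of B are visited, and the names `matmul_acc`, `fetch_block`, `node_multiply`, and `distributed_matmul` are all hypothetical. A background thread stands in for a nonblocking receive, and a plain triple loop stands in for the local Level-3 BLAS multiply. Because of Python's global interpreter lock, the sketch shows only the double-buffering structure, not a real speedup.

```python
from concurrent.futures import ThreadPoolExecutor


def matmul_acc(C, A, B):
    """C += A * B on list-of-lists matrices (stand-in for a local GEMM call)."""
    for i, a_row in enumerate(A):
        c_row = C[i]
        for k, a_ik in enumerate(a_row):
            b_row = B[k]
            for j, b_kj in enumerate(b_row):
                c_row[j] += a_ik * b_kj


def fetch_block(B_blocks, owner):
    """Stand-in for a nonblocking receive: copy the B block held by `owner`."""
    return [row[:] for row in B_blocks[owner]]


def node_multiply(rank, nodes, A_rows, B_blocks, bs_k):
    """Compute the block row of C owned by `rank`.

    While the node multiplies with the current block of B, the next block
    is already being fetched in the background (double buffering).
    """
    n_cols = len(B_blocks[0][0])
    C = [[0.0] * n_cols for _ in A_rows]
    with ThreadPoolExecutor(max_workers=1) as comm:
        pending = comm.submit(fetch_block, B_blocks, rank)
        for step in range(nodes):
            owner = (rank + step) % nodes
            B_cur = pending.result()  # wait for the block needed now
            if step + 1 < nodes:
                # Start the next transfer before computing with B_cur.
                nxt = (rank + step + 1) % nodes
                pending = comm.submit(fetch_block, B_blocks, nxt)
            # Columns of the local A rows that pair with the rows in B_cur.
            A_slice = [row[owner * bs_k:(owner + 1) * bs_k] for row in A_rows]
            matmul_acc(C, A_slice, B_cur)
    return C


def distributed_matmul(A, B, nodes):
    """Simulate `nodes` nodes, each owning one row block of A, B, and C."""
    assert len(A) % nodes == 0 and len(B) % nodes == 0
    bs_a = len(A) // nodes  # rows of A (and C) per node
    bs_k = len(B) // nodes  # rows of B per node
    B_blocks = [B[p * bs_k:(p + 1) * bs_k] for p in range(nodes)]
    C = []
    for rank in range(nodes):
        A_rows = A[rank * bs_a:(rank + 1) * bs_a]
        C.extend(node_multiply(rank, nodes, A_rows, B_blocks, bs_k))
    return C
```

For n x n matrices the arithmetic grows as n^3 while the data that must be moved grows only as n^2, so each fetched block carries more local work as n increases. This is one plausible reading of why overlap can hide a slow network for larger matrices; the paper's own analysis is not reproduced here.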