INVESTIGATING THE PERFORMANCE OF PARALLEL EIGENSOLVERS FOR LARGE PROCESSOR COUNTS

被引:12
作者
LITTLEFIELD, RJ [1 ]
MASCHHOFF, KJ [1 ]
机构
[1] UNIV WASHINGTON, SEATTLE, WA 98195 USA
来源
THEORETICA CHIMICA ACTA | 1993年 / 84卷 / 4-5期
关键词
EIGENSOLVING; MASSIVELY PARALLEL COMPUTERS; SMALL DENSE MATRICES;
D O I
10.1007/BF01113282
中图分类号
O64 [物理化学(理论化学)、化学物理学];
学科分类号
070304 ; 081704 ;
摘要
Eigensolving (diagonalizing) small dense matrices threatens to become a bottleneck in the application of massively parallel computers to electronic structure methods. Because the computational cost of electronic structure methods typically scales as O(N3) or worse, even teraflop computer systems with thousands of processors will often confront problems with N much less than 10,000. At present, diagonalizing an N x N matrix on P processors is not efficient when P is large compared to N. The loss of efficiency can make diagonalization a bottleneck on a massively parallel computer, even though it is typically a minor operation on conventional serial machines. This situation motivates a search for both improved methods and identification of the computer characteristics that would be most productive to improve. In this paper, we compare the performance of several parallel and serial methods for solving dense real symmetric eigensystems on a distributed memory message passing parallel computer. We focus on matrices of size N = 200 and processor counts P = 1 to P = 512, with execution on the Intel Touchstone DELTA computer. The best eigensolver method is found to depend on the number of available processors. Of the methods tested, a recently developed Blocked Factored Jacobi (BFJ) method is the slowest for small P, but the fastest for large P. Its speed is a complicated non-monotonic function of the number of processors used. A detailed performance analysis of the BFJ method shows that: (1) the factor most responsible for limited speedup is communication startup cost; (2) with current communication costs, the maximum achievable parallel speedup is modest (one order of magnitude) compared to the best serial method; and (3) the fastest solution is often achieved by using less than the maximum number of available processors.
引用
收藏
页码:457 / 473
页数:17
相关论文
共 10 条
[1]  
[Anonymous], 1986, NUMERICAL RECIPES
[2]  
EBERLEIN PJ, 1987, 2ND P C HYP MULT, P605
[3]  
EGGERS WS, 1988, THESIS SUNY BUFFALO
[4]  
Golub G.H., 1996, MATH GAZ, VThird
[5]   SOLVING THE SYMMETRIC TRIDIAGONAL EIGENVALUE PROBLEM ON THE HYPERCUBE [J].
IPSEN, ICF ;
JESSUP, ER .
SIAM JOURNAL ON SCIENTIFIC AND STATISTICAL COMPUTING, 1990, 11 (02) :203-229
[6]  
JESSUP ER, 1989, YALEUDCSRR728 YAL U
[7]  
LITTLEFIELD RJ, 1991, 6TH P DISTR MEM COMP, P600
[8]  
MOYER SA, 1991, IPCTR91007 U VIRG TE
[9]  
1990, IPSC2 IPSC860 USERS
[10]  
1991, TOUCHSTONE DELTA SYS