Triangular Matrix Inversion on Heterogeneous Multicore Systems

Cited by: 13
Authors
Ries, Florian [1]
De Marco, Tommaso [1]
Guerrieri, Roberto [1]
Affiliations
[1] E De Castro ARCES, Adv Res Ctr Elect Syst Informat & Commun Technol, I-40123 Bologna, Italy
Keywords
Matrix inversion; parallel processing; graphics
DOI
10.1109/TPDS.2011.103
Chinese Library Classification (CLC) number
TP301 [Theory, methods]
Discipline classification code
081202
Abstract
Dense matrix inversion is a basic procedure in many linear algebra algorithms. Any factorization-based dense matrix inversion algorithm involves the inversion of one or two triangular matrices. In this work, we present an improved implementation of a parallel triangular matrix inversion for heterogeneous multicore CPU/dual-GPU systems.
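As an illustration of the triangular inversion step the abstract refers to, the following is a minimal sequential sketch in Python. It is not the authors' CPU/dual-GPU implementation, which this record does not describe; it uses the standard recursive block identity for a lower-triangular matrix split into blocks L11, L21, L22, whose inverse has diagonal blocks inv(L11) and inv(L22) and off-diagonal block -inv(L22) * L21 * inv(L11). The function names `matmul` and `invert_lower` are hypothetical and chosen only for this example.

```python
def matmul(a, b):
    """Multiply two dense matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)]
            for row in a]


def invert_lower(l):
    """Invert a nonsingular lower-triangular matrix by recursive 2x2 blocking.

    Illustrative only: assumes a square list-of-rows input with nonzero
    diagonal entries and performs no pivoting or error checking.
    """
    n = len(l)
    if n == 1:
        return [[1.0 / l[0][0]]]

    k = n // 2
    l11 = [row[:k] for row in l[:k]]   # top-left block (lower triangular)
    l21 = [row[:k] for row in l[k:]]   # bottom-left block (dense)
    l22 = [row[k:] for row in l[k:]]   # bottom-right block (lower triangular)

    # The two diagonal blocks are inverted independently of each other.
    inv11 = invert_lower(l11)
    inv22 = invert_lower(l22)

    # Off-diagonal block of the inverse, before negation: inv22 * l21 * inv11
    off = matmul(matmul(inv22, l21), inv11)

    top = [row + [0.0] * (n - k) for row in inv11]
    bottom = [[-x for x in off_row] + inv_row
              for off_row, inv_row in zip(off, inv22)]
    return top + bottom


if __name__ == "__main__":
    L = [[2.0, 0.0, 0.0],
         [1.0, 4.0, 0.0],
         [3.0, 5.0, 8.0]]
    for row in invert_lower(L):
        print(row)
```

In this formulation the two recursive calls do not depend on each other and the remaining work is dense matrix multiplication, which is one reason block-recursive schemes are commonly considered for parallel hardware. How the paper itself partitions work between the CPU cores and the GPUs is not stated in the abstract.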
Pages: 177-184
Number of pages: 8