The Combinatorial BLAS: design, implementation, and applications

被引:205
作者
Buluc, Aydin [1 ]
Gilbert, John R. [1 ]
机构
[1] Univ Calif Berkeley, Lawrence Berkeley Lab, High Performance Comp Res, Berkeley, CA 94720 USA
基金
美国国家科学基金会;
关键词
Betweenness centrality; combinatorial BLAS; combinatorial scientific computing; graph analysis; Markov clustering; mathematical software; parallel graph library; software framework; sparse matrices; ALGORITHMS; MAPREDUCE;
D O I
10.1177/1094342011403516
中图分类号
TP3 [计算技术、计算机技术];
学科分类号
0812 ;
摘要
This paper presents a scalable high-performance software library to be used for graph analysis and data mining. Large combinatorial graphs appear in many applications of high-performance computing, including computational biology, informatics, analytics, web search, dynamical systems, and sparse matrix methods. Graph computations are difficult to parallelize using traditional approaches due to their irregular nature and low operational intensity. Many graph computations, however, contain sufficient coarse-grained parallelism for thousands of processors, which can be uncovered by using the right primitives. We describe the parallel Combinatorial BLAS, which consists of a small but powerful set of linear algebra primitives specifically targeting graph and data mining applications. We provide an extensible library interface and some guiding principles for future development. The library is evaluated using two important graph algorithms, in terms of both performance and ease-of-use. The scalability and raw performance of the example applications, using the Combinatorial BLAS, are unprecedented on distributed memory clusters.
引用
收藏
页码:496 / 509
页数:14
相关论文
共 57 条
[41]  
JARVI J, 2003, P 2 INT C GEN PROGR, P228, DOI DOI 10.1007/978-3-540-39815-8_14
[42]  
Kepner J, 2009, SOFTW ENVIRON TOOLS, V21, P3
[43]  
KIRSCH C, 2010, 201007 U SALZB DEP C
[44]  
Lawson C. L., 1979, ACM Transactions on Mathematical Software, V5, P308, DOI 10.1145/355841.355848
[45]   CHALLENGES IN PARALLEL GRAPH PROCESSING [J].
Lumsdaine, Andrew ;
Gregor, Douglas ;
Hendrickson, Bruce ;
Berry, Jonathan .
PARALLEL PROCESSING LETTERS, 2007, 17 (01) :5-20
[46]  
Madduri K., 2009, P 3 WORKSH MULT ARCH
[47]   High performance remote memory access communication: The ARMCI approach [J].
Nieplocha, J. ;
Tipparaju, V. ;
Krishnan, M. ;
Panda, D. K. .
INTERNATIONAL JOURNAL OF HIGH PERFORMANCE COMPUTING APPLICATIONS, 2006, 20 (02) :233-253
[48]   Advances, applications and performance of the Global Arrays shared memory programming toolkit [J].
Nieplocha, Jarek ;
Palmer, Bruce ;
Tipparaju, Vinod ;
Krishnan, Manojkumar ;
Trease, Harold ;
Apra, Edoardo .
INTERNATIONAL JOURNAL OF HIGH PERFORMANCE COMPUTING APPLICATIONS, 2006, 20 (02) :203-231
[49]  
Pakin S., 2003, C SUP SC 03, DOI DOI 10.1109/SC.2003.10010
[50]  
ROBINSON E, 2011, GRAPH ALGORITHMS LAN