Runtime scheduling of dynamic parallelism on accelerator-based multi-core systems

Cited by: 9
Authors
Blagojevic, Filip [1, 2]
Nikolopoulos, Dimitrios S. [1, 2]
Stamatakis, Alexandros [3]
Antonopoulos, Christos D. [4]
Curtis-Maury, Matthew [1, 2]
Affiliations
[1] Virginia Tech, Dept Comp Sci, Blacksburg, VA 24061 USA
[2] Virginia Tech, Ctr High End Comp Syst, Blacksburg, VA 24061 USA
[3] Ecole Polytech Fed Lausanne, Sch Comp & Commun Sci, CH-1015 Lausanne, Switzerland
[4] Univ Thessaly, Dept Comp & Commun Engn, Volos 38221, Greece
Funding
U.S. National Science Foundation
Keywords
heterogeneous multi-core processors; accelerator-based parallel architectures; runtime systems for parallel programming; Cell Broadband Engine
DOI
10.1016/j.parco.2007.09.004
Chinese Library Classification
TP301 [Theory, Methods]
Discipline classification code
081202
Abstract
We explore runtime mechanisms and policies for scheduling dynamic multi-grain parallelism on heterogeneous multi-core processors. Heterogeneous multi-core processors integrate conventional cores that run legacy codes with specialized cores that serve as computational accelerators. The term multi-grain parallelism refers to the exposure of multiple dimensions of parallelism from within the runtime system, so as to best exploit a parallel architecture with heterogeneous computational capabilities between its cores and execution units. We investigate user-level schedulers that dynamically "rightsize" the dimensions and degrees of parallelism on the Cell Broadband Engine. The schedulers address the problem of mapping application-specific concurrency to an architecture with multiple hardware layers of parallelism, without requiring programmer intervention or sophisticated compiler support. We evaluate recently introduced schedulers for event-driven execution and utilization-driven dynamic multi-grain parallelization on Cell. We also present a new scheduling scheme for dynamic multi-grain parallelism, S-MGPS, which uses sampling of dominant execution phases to converge to the optimal scheduling algorithm. We evaluate S-MGPS on an IBM Cell BladeCenter with two realistic bioinformatics applications that infer large phylogenies. S-MGPS performs within 2-10% of the optimal scheduling algorithm in these applications, while exhibiting low overhead and little sensitivity to application-dependent parameters. © 2007 Elsevier B.V. All rights reserved.
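The sampling idea mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' S-MGPS implementation, which the abstract does not describe in detail; it only shows the general technique of timing a representative phase under each candidate split between task-level and loop-level parallelism and keeping the fastest. The names `candidate_configs`, `pick_config`, and `run_phase`, and the use of 8 accelerator cores, are assumptions made for illustration.

```python
import time
from typing import Callable, Iterable, List, Tuple

# Hypothetical configuration: (concurrent tasks, accelerator cores per task).
Config = Tuple[int, int]


def candidate_configs(num_accelerators: int) -> List[Config]:
    """Enumerate (tasks, cores_per_task) pairs that use every accelerator core."""
    return [(num_accelerators // width, width)
            for width in range(1, num_accelerators + 1)
            if num_accelerators % width == 0]


def pick_config(run_phase: Callable[[Config], None],
                configs: Iterable[Config],
                samples: int = 3,
                clock: Callable[[], float] = time.perf_counter) -> Config:
    """Time a few runs of the phase under each config and return the fastest."""
    best, best_time = None, float("inf")
    for cfg in configs:
        start = clock()
        for _ in range(samples):
            run_phase(cfg)  # assumed to execute one sampled phase under cfg
        elapsed = (clock() - start) / samples
        if elapsed < best_time:
            best, best_time = cfg, elapsed
    if best is None:
        raise ValueError("no candidate configurations")
    return best
```

In such a sketch, `run_phase` would be supplied by the runtime and would execute one instance of the dominant phase with the given degree of task-level and loop-level parallelism; the injectable `clock` only exists to make the selection logic testable without real hardware.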
Pages: 700-719
Page count: 20