A multigrain Delaunay mesh generation method for multicore SMT-based architectures

被引:23
作者
Antonopoulos, Christos D. [2 ]
Blagojevic, Filip [4 ]
Chernikov, Andrey N. [1 ]
Chrisochoides, Nikos P. [1 ]
Nikolopoulos, Dimitrios S. [3 ]
机构
[1] Coll William & Mary, Dept Comp Sci, Williamsburg, VA 23187 USA
[2] Univ Thessaly, Dept Comp & Commun Engn, Volos, Greece
[3] Virginia Tech, Dept Comp Sci, Blacksburg, VA 24061 USA
[4] Lawrence Berkeley Natl Lab, Berkeley, CA 94720 USA
基金
美国国家科学基金会;
关键词
Parallel; Mesh Generation; Delaunay; Multigrain; Multicore; SMT; REFINEMENT; PROCESSOR; ALGORITHM;
D O I
10.1016/j.jpdc.2009.03.009
中图分类号
TP301 [理论、方法];
学科分类号
080201 [机械制造及其自动化];
摘要
Given the proliferation of layered, multicore- and SMT-based architectures, it is imperative to deploy and evaluate important, multi-level, scientific computing codes, such as meshing algorithms, on these systems. We focus on Parallel Constrained Delaunay Mesh (PCDM) generation. We exploit coarse-grain parallelism at the subdomain level, medium-grain at the cavity level and fine-grain at the element level. This multi-grain data parallel approach targets clusters built from commercially available SMTs and multicore processors. The exploitation of the coarser degree of granularity facilitates scalability both in terms of execution time and problem size on loosely-coupled clusters. The exploitation of medium-grain parallelism allows performance improvement at the single node level. Our experimental evaluation shows that the first generation of SMT cores is not capable of taking advantage of fine-grain parallelism in PCDM. Many of our experimental findings with PCDM extend to other adaptive and irregular multigrain parallel algorithms as well. Published by Elsevier Inc.
引用
收藏
页码:589 / 600
页数:12
相关论文
共 58 条
[1]
Almasi G., 2003, SIGARCH COMPUT ARCHI, V31, P26
[2]
[Anonymous], 1998, Delaunay Triangulation and Meshing
[3]
[Anonymous], 2002, INTEL TECHNOLOGY J
[4]
Antonopoulos C.D., 2005, Proc. 19th Annual International Conference on Supercomputing, P367
[5]
ANTONOPOULOS CD, 2009, J PARA IN PRESS MAR
[6]
A load balancing framework for adaptive and asynchronous applications [J].
Barker, K ;
Chernikov, A ;
Chrisochoides, N ;
Pingali, K .
IEEE TRANSACTIONS ON PARALLEL AND DISTRIBUTED SYSTEMS, 2004, 15 (02) :183-192
[7]
Barroso LA, 2000, PROCEEDING OF THE 27TH INTERNATIONAL SYMPOSIUM ON COMPUTER ARCHITECTURE, P282, DOI [10.1109/ISCA.2000.854398, 10.1145/342001.339696]
[8]
Blelloch G. E., 1996, Proceedings of the Twelfth Annual Symposium on Computational Geometry, FCRC '96, P186, DOI 10.1145/237218.237357
[9]
Design and implementation of a practical parallel Delaunay algorithm [J].
Blelloch, GE ;
Hardwick, JC ;
Miller, GL ;
Talmor, D .
ALGORITHMICA, 1999, 24 (3-4) :243-269
[10]
COMPUTING DIRICHLET TESSELLATIONS [J].
BOWYER, A .
COMPUTER JOURNAL, 1981, 24 (02) :162-166