VISPARK: GPU-ACCELERATED DISTRIBUTED VISUAL COMPUTING USING SPARK

Cited by: 7
Authors
Choi, Woohyuk [1]
Hong, Sumin [1]
Jeong, Won-Ki [1]
Affiliations
[1] Ulsan National Institute of Science and Technology, School of Electrical and Computer Engineering, Ulsan, South Korea
Funding
National Research Foundation, Singapore
Keywords
MapReduce; GPU; distributed computing; visualization; domain-specific language; MapReduce framework
DOI
10.1137/15M1026407
Chinese Library Classification (CLC) number
O29 [Applied Mathematics]
Discipline classification code
070104
Abstract
With the growing need for big-data processing in diverse application domains, MapReduce (e.g., Hadoop) has become one of the standard computing paradigms for large-scale computing on a cluster system. Despite its popularity, the current MapReduce framework suffers from inflexibility and inefficiency inherent to its programming model and system architecture. To address these problems, we propose Vispark, a novel extension of Spark for GPU-accelerated MapReduce processing of array-based scientific computing and image processing tasks. Vispark provides an easy-to-use, Python-like high-level language syntax and a novel data abstraction for MapReduce programming on a GPU cluster system. Vispark introduces a programming abstraction for accessing neighbor data in the mapper function, which greatly simplifies many image processing tasks using MapReduce by reducing memory footprints and bypassing the reduce stage. Vispark provides socket-based halo communication that synchronizes data partitions transparently to the user, which is necessary for many scientific computing problems in distributed systems. Vispark also provides domain-specific functions and language support specifically designed for high-performance computing and image processing applications. We demonstrate the performance of our prototype system on several visual computing tasks, such as image processing, volume rendering, K-means clustering, and heat transfer simulation.
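The idea of reading neighbor data inside the mapper, with halo cells keeping partitions consistent, can be illustrated with a minimal pure-Python sketch. This is not Vispark code and does not use its syntax or API: the function names `split_with_halo`, `blur_mapper`, and `blur_global` are hypothetical, the data is a 1D list rather than a GPU array, and the halo is filled by direct copying instead of socket communication. It only shows why a stencil can run as a map-only step once each partition carries its neighbors' boundary cells.

```python
def split_with_halo(data, num_parts, halo=1):
    """Split a 1D list into partitions, each padded with `halo` neighbor cells.

    Hypothetical helper: stands in for the halo exchange a distributed
    system would perform between neighboring partitions.
    """
    size = len(data) // num_parts
    parts = []
    for p in range(num_parts):
        start = p * size
        # The last partition absorbs any remainder.
        end = len(data) if p == num_parts - 1 else start + size
        # At the global boundary, replicate the edge value (clamp).
        left = [data[max(i, 0)] for i in range(start - halo, start)]
        right = [data[min(i, len(data) - 1)] for i in range(end, end + halo)]
        parts.append(left + data[start:end] + right)
    return parts


def blur_mapper(padded, halo=1):
    """Map-only 3-point average over the interior cells of one padded partition."""
    return [
        (padded[i - 1] + padded[i] + padded[i + 1]) / 3.0
        for i in range(halo, len(padded) - halo)
    ]


def blur_global(data):
    """Reference result computed on the unpartitioned data."""
    n = len(data)
    return [
        (data[max(i - 1, 0)] + data[i] + data[min(i + 1, n - 1)]) / 3.0
        for i in range(n)
    ]


signal = [0.0, 0.0, 9.0, 0.0, 0.0, 3.0, 0.0, 0.0]

# Each partition is processed independently; no reduce stage is needed.
partitioned = [
    value
    for part in split_with_halo(signal, num_parts=3)
    for value in blur_mapper(part)
]
```

Because every partition already holds the boundary cells it needs, concatenating the per-partition mapper outputs gives the same values as `blur_global(signal)` on the whole array. An iterative computation such as heat transfer would refresh the halo cells between iterations.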
Pages: S700-S719
Page count: 20