SHadoop: Improving MapReduce performance by optimizing job execution mechanism in Hadoop clusters

被引：96

作者：

Gu, Rong ^{[1
]}

Yang, Xiaoliang ^{[1
]}

Yan, Jinshuang ^{[1
]}

Sun, Yuanhao ^{[2
]}

Wang, Bing ^{[2
]}

Yuan, Chunfeng ^{[1
]}

Huang, Yihua ^{[1
]}

机构：

[1] Nanjing Univ, Natl Key Lab Novel Software Technol, Nanjing 210023, Jiangsu, Peoples R China

[2] Intel Asia Pacific Res & Dev Ltd, Shanghai 200241, Peoples R China

来源：

JOURNAL OF PARALLEL AND DISTRIBUTED COMPUTING | 2014年 / 74卷 / 03期

基金：

国家高技术研究发展计划(863计划);

关键词：

Parallel computing; MapReduce; Performance optimization; Distributed processing; Cloud computing;

D O I：

10.1016/j.jpdc.2013.10.003

中图分类号：

TP301 [理论、方法];

学科分类号：

080201 [机械制造及其自动化];

摘要：

As a widely-used parallel computing framework for big data processing today, the Hadoop MapReduce framework puts more emphasis on high-throughput of data than on low-latency of job execution. However, today more and more big data applications developed with MapReduce require quick response time. As a result, improving the performance of MapReduce jobs, especially for short jobs,. is of great significance in practice and has attracted more and more attentions from both academia and industry. A lot of efforts have been made to improve the performance of Hadoop from job scheduling or job parameter optimization level. In this paper, we explore an approach to improve the performance of the Hadoop MapReduce framework by optimizing the job and task execution mechanism. First of all, by analyzing the job and task execution mechanism in MapReduce framework we reveal two critical limitations to job execution performance. Then we propose two major optimizations to the MapReduce job and task execution mechanisms: first, we optimize the setup and cleanup tasks of a MapReduce job to reduce the time cost during the initialization and termination stages of the job; second, instead of adopting the loose heartbeat-based communication mechanism to transmit all messages between the JobTracker and TaskTrackers, we introduce an instant messaging communication mechanism for accelerating performance-sensitive task scheduling and execution. Finally, we implement SHadoop, an optimized and fully compatible version of Hadoop that aims at shortening the execution time cost of MapReduce jobs, especially for short jobs. Experimental results show that compared to the standard Hadoop, SHadoop can achieve stable performance improvement by around 25% on average for comprehensive benchmarks without losing scalability and speedup. Our optimization work has passed a production-level test in Intel and has been integrated into the Intel Distributed Hadoop (IDH). To the best of our knowledge, this work is the first effort that explores on optimizing the execution mechanism inside map/reduce tasks of a job. The advantage is that it can complement job scheduling optimizations to further improve the job execution performance. (C) 2013 Elsevier Inc. All rights reserved.

引用

页码：2166 / 2179

页数：14

共 24 条

[1]

[Anonymous], BENCHM STRESS TEST H

[2]

[Anonymous], 2011, P 2011 ACM SIGMOD IN

[3]

[Anonymous], 2008, 8 USENIX S OP SYST D

[4]

Babu S., 2011, P 1 ACM S CLOUD COMP, P137

[5]

Becerra Yolanda, 2009, Proceedings of the 2009 International Conference on Parallel Processing (ICPP 2009), P42, DOI 10.1109/ICPP.2009.59

[6]

Dean J, 2004, USENIX ASSOCIATION PROCEEDINGS OF THE SIXTH SYMPOSIUM ON OPERATING SYSTEMS DESIGN AND IMPLEMENTATION (OSDE '04), P137

[7]

Hammoud M., 2011, Proceedings of the 2011 IEEE 3rd International Conference on Cloud Computing Technology and Science (CloudCom 2011), P570, DOI 10.1109/CloudCom.2011.87

[8]

He C, 2011, INT CONF ACOUST SPEE, P3540

[9]

Hong Mao, 2011, 2011 IEEE/ACM International Conference on Green Computing and Communications, P28, DOI 10.1109/GreenCom.2011.13

[10]

Huang SS, 2010, I C DATA ENGIN WORKS, P41, DOI 10.1109/ICDEW.2010.5452747

← 1 2 3 →