Scheduling divisible MapReduce computations

Cited by: 50
Authors
Berlinska, J. [2]
Drozdowski, M. [1]
Affiliations
[1] Poznan Univ Tech, Inst Comp Sci, PL-60965 Poznan, Poland
[2] Adam Mickiewicz Univ, Fac Math & Comp Sci, PL-61614 Poznan, Poland
Keywords
Parallel processing; MapReduce; Scheduling; Divisible loads; Performance evaluation; DISTRIBUTED COMPUTATION; TREE NETWORKS; LOADS;
DOI
10.1016/j.jpdc.2010.12.004
Chinese Library Classification number
TP301 [Theory, methods]
Discipline classification code
081202
Abstract
In this paper we analyze MapReduce distributed computations as a divisible load scheduling problem. The two operations of mapping and reducing can be understood as two divisible applications with precedence constraints. A divisible load model of the computation and two load partitioning algorithms are proposed. Performance limits of MapReduce computations are investigated. To the best of our knowledge, this is the first time that processing applications with precedence constraints has been considered on the grounds of divisible load theory. (C) 2010 Elsevier Inc. All rights reserved.
Pages: 450-459
Page count: 10