Ephemeral Materialization Points in Stratosphere Data Management on the Cloud

Cited by: 2
Authors
Hoeger, Mareike [1]
Kao, Odej [1]
Richter, Philipp [1]
Warneke, Daniel [1]
Affiliations
[1] Tech Univ Berlin, Berlin, Germany
Source
CLOUD COMPUTING AND BIG DATA | 2013 / Vol. 23
Keywords
Fault tolerance; Map Reduce; Cloud Computing; MAPREDUCE; RECOVERY;
DOI
10.3233/978-1-61499-322-3-163
Chinese Library Classification
TP3 [Computing technology, computer technology];
Discipline classification code
0812;
Abstract
Data streaming frameworks like Stratosphere [1] are designed to run in the cloud on a large number of nodes working in parallel. The growing number of nodes, together with the expected long run-time of data processing tasks, increases the probability of failure. Fault tolerance therefore becomes an important issue in these systems. Existing fault tolerance strategies for data streaming systems usually accept full restarts or work in a blocking manner. In this paper we introduce ephemeral materialization points, a non-blocking materialization strategy for data streaming systems. This strategy selects materialization positions at run-time without coordination. The materialization decision depends on the resource usage and the execution graph, and aims to minimize the expected recovery time in case of a failure. We show how and when to decide whether to materialize, and which information could influence the decision.
Pages: 163-181
Page count: 19
Related papers
23 in total
  • [1] Battre D., 2010, SoCC, P119, DOI 10.1145/1807128.1807148
  • [2] Bhargava B., 1988, Proceedings. Seventh Symposium on Reliable Distributed Systems (IEEE Cat. No.88CH2612-0), P3, DOI 10.1109/RELDIS.1988.25775
  • [3] Borkar V, 2011, PROC INT CONF DATA, P1151, DOI 10.1109/ICDE.2011.5767921
  • [4] SCOPE: Easy and Efficient Parallel Processing of Massive Data Sets
    Chaiken, Ronnie
    Jenkins, Bob
    Larson, Per-Ake
    Ramsey, Bill
    Shakib, Darren
    Weaver, Simon
    Zhou, Jingren
    [J]. PROCEEDINGS OF THE VLDB ENDOWMENT, 2008, 1 (02) : 1265 - 1276
  • [5] Graph Twiddling in a MapReduce World
    Cohen, Jonathan
    [J]. COMPUTING IN SCIENCE & ENGINEERING, 2009, 11 (04) : 29 - 41
  • [6] Condie T., 2010, P 2010 ACM SIGMOD IN, P1115, DOI 10.1145/1807167.1807295
  • [7] Dean J, 2004, USENIX ASSOCIATION PROCEEDINGS OF THE SIXTH SYMPOSIUM ON OPERATING SYSTEMS DESIGN AND IMPLEMENTATION (OSDI '04), P137
  • [8] Dean J, 2006, EXPERIENCES MAPREDUC
  • [9] A survey of rollback-recovery protocols in message-passing systems
    Elnozahy, EN
    Alvisi, L
    Wang, YM
    Johnson, DB
    [J]. ACM COMPUTING SURVEYS, 2002, 34 (03) : 375 - 408
  • [10] Ghazal A, 2003, LECT NOTES COMPUT SC, V2736, P782