Runtime-aware adaptive scheduling in stream processing

Cited by: 12
Authors
Liu, Yuan [1 ]
Shi, Xuanhua [1 ]
Jin, Hai [1 ]
Affiliations
[1] Huazhong Univ Sci & Technol, Sch Comp Sci & Technol, Serv Comp Technol & Syst Lab, Cluster & Grid Comp Lab, Wuhan 430074, Peoples R China
Keywords
stream processing; load balancing; schedule; real time; runtime; dynamic;
DOI
10.1002/cpe.3661
CLC classification number
TP31 [Computer Software];
Discipline classification codes
081202; 0835;
Abstract
Long-running stream applications usually share the same underlying computational infrastructure. To improve the efficiency of data processing in stream processing systems, a data analysis operator can be partitioned into n parallel tasks, which are typically deployed across m nodes that coexist with the operators of other applications. Because node performance can vary unpredictably (i.e., (1) stream input rates may fluctuate and (2) available computational resources change as other applications are affected), the nodes process data at different speeds, and the slowest node determines the operator's performance. Hence, tasks should be redistributed at runtime so that stream applications can meet their strict latency requirements. Our key idea is to redistribute tasks to the best-suited nodes dynamically, adapting to resource or load fluctuations. In this paper, we present a runtime-aware adaptive scheduling mechanism that aims to minimize both the operator's processing latency and the latency differences between tasks on different nodes. We propose a new abstraction called the performance cost ratio (PCR) to evaluate node performance: the higher a node's PCR, the less cost the node pays to process one tuple, and the more tasks should be deployed on it. In each scheduling round, we first sort tasks in descending order of load and sort nodes by their PCR. We then reassign computation so that each node's input rate is proportional to its share of the total PCR. The PCR-based quantization algorithm matches task loads to the processing capacities of nodes, moves the minimum number of tasks, and preserves task locality. We have implemented a runtime-aware adaptive scheduler as an extension to Storm and evaluated this strategy; it achieves the optimization goal using fewer computational resources. Copyright (C) 2015 John Wiley & Sons, Ltd.
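The PCR-proportional redistribution described in the abstract can be sketched as follows. This is a minimal illustration, not the authors' Storm implementation: the node PCR values, task loads, and the greedy largest-remaining-capacity assignment are hypothetical stand-ins for the paper's quantization algorithm.

```python
def pcr_schedule(task_loads, node_pcrs):
    """Assign tasks to nodes so that each node's total load is roughly
    proportional to its PCR (performance cost ratio).

    task_loads: dict mapping task id -> load (e.g., tuples/sec it consumes)
    node_pcrs:  dict mapping node id -> PCR (higher = cheaper per tuple)
    Returns a dict mapping task id -> node id.
    """
    total_pcr = sum(node_pcrs.values())
    total_load = sum(task_loads.values())
    # Target load per node, proportional to its share of the total PCR.
    target = {n: total_load * pcr / total_pcr for n, pcr in node_pcrs.items()}
    assigned = {n: 0.0 for n in node_pcrs}
    placement = {}
    # Greedy pass: heaviest tasks first (as in the abstract's descending
    # sort), each placed on the node with the most remaining capacity.
    for task, load in sorted(task_loads.items(), key=lambda kv: -kv[1]):
        node = max(node_pcrs, key=lambda n: target[n] - assigned[n])
        placement[task] = node
        assigned[node] += load
    return placement

# Example: node A has twice the PCR of node B, so it should absorb
# roughly two thirds of the total load.
tasks = {"t1": 5, "t2": 3, "t3": 2, "t4": 1}
pcrs = {"A": 2.0, "B": 1.0}
print(pcr_schedule(tasks, pcrs))
```

Placing the heaviest tasks first keeps the per-node totals close to their PCR-proportional targets; a production scheduler would additionally minimize the number of tasks moved relative to the current placement, which this sketch does not model.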
Pages: 3830-3843
Page count: 14