MillWheel: Fault-Tolerant Stream Processing at Internet Scale

Cited by: 291
Authors
Akidau, Tyler [1]
Balikov, Alex [1]
Bekiroglu, Kaya [1]
Chernyak, Slava [1]
Haberman, Josh [1]
Lax, Reuven [1]
McVeety, Sam [1]
Mills, Daniel [1]
Nordstrom, Paul [1]
Whittle, Sam [1]
Affiliations
[1] Google, Mountain View, CA 94043 USA
Source
PROCEEDINGS OF THE VLDB ENDOWMENT | 2013 / Vol. 6 / No. 11
DOI
10.14778/2536222.2536229
Chinese Library Classification number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
MillWheel is a framework for building low-latency data-processing applications that is widely used at Google. Users specify a directed computation graph and application code for individual nodes, and the system manages persistent state and the continuous flow of records, all within the envelope of the framework's fault-tolerance guarantees. This paper describes MillWheel's programming model as well as its implementation. The case study of a continuous anomaly detector in use at Google serves to motivate how many of MillWheel's features are used. MillWheel's programming model provides a notion of logical time, making it simple to write time-based aggregations. MillWheel was designed from the outset with fault tolerance and scalability in mind. In practice, we find that MillWheel's unique combination of scalability, fault tolerance, and a versatile programming model lends itself to a wide variety of problems at Google.
Pages: 1033-1044
Page count: 12