Semantics and Implementation of Continuous Sliding Window Queries over Data Streams

Cited by: 57
Authors
Kraemer, Juergen [1]
Seeger, Bernhard [1]
Affiliation
[1] Univ Marburg, Dept Math & Comp Sci, D-35032 Marburg, Germany
Source
ACM Transactions on Database Systems | 2009, Vol. 34, No. 1
Keywords
Algorithms; Semantics; data streams; continuous queries; query optimization
DOI
10.1145/1508857.1508861
Chinese Library Classification number
TP [Automation Technology, Computer Technology]
Discipline classification code
0812
Abstract
In recent years the processing of continuous queries over potentially infinite data streams has attracted a lot of research attention. We observed that the majority of work addresses individual stream operations and system-related issues rather than the development of a general-purpose basis for stream processing systems. Furthermore, example continuous queries are often formulated in some declarative query language without specifying the underlying semantics precisely enough. To overcome these deficiencies, this article presents a consistent and powerful operator algebra for data streams which ensures that continuous queries have well-defined, deterministic results. In analogy to traditional database systems, we distinguish between a logical and a physical operator algebra. While the logical algebra specifies the semantics of the individual operators in a descriptive but concrete way over temporal multisets, the physical algebra provides efficient implementations in the form of stream-to-stream operators. By adapting and enhancing research from temporal databases to meet the challenging requirements in streaming applications, we are able to carry over the conventional transformation rules from relational databases to stream processing. For this reason, our approach not only makes it possible to express continuous queries with a sound semantics, but also provides a solid foundation for query optimization, one of the major research topics in the stream community. Since this article seamlessly explains the steps from query formulation to query execution, it outlines the innovative features and operational functionality implemented in our state-of-the-art stream processing infrastructure.
Pages: 49