Event-based lossy compression for effective and efficient OLAP over data streams

Cited by: 24
Authors
Cuzzocrea, Alfredo [1, 2]
Chakravarthy, Sharma [3]
Affiliations
[1] CNR, ICAR, I-87036 Cosenza, Italy
[2] Univ Calabria, I-87036 Cosenza, Italy
[3] Univ Texas Arlington, Arlington, TX 76019 USA
Keywords
Data stream query processing; Data stream compression methodologies and techniques; Knowledge discovery from data streams; OLAP over data streams; Event-based data stream processing; Event-based data stream compression; ACTIVE DATABASE; ODE
DOI
10.1016/j.datak.2010.02.006
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
An innovative event-based lossy compression model for effective and efficient OLAP over data streams, called ECM-DS, is presented and experimentally assessed in this paper. The main novelty of our compression approach with respect to traditional data stream compression techniques lies in exploiting the semantics of the reference application scenario to drive the compression process by means of the "degree of interestingness" of events occurring in the target stream. This improves the quality of retrieved approximate answers to OLAP queries over data streams and, in turn, the quality of complex knowledge discovery tasks over data streams developed on top of ECM-DS and implemented via ad-hoc data stream mining algorithms. Overall, the compression strategy we propose in this research lays the basis for a novel class of intelligent applications over data streams where the knowledge on actual streams is integrated with and correlated to the knowledge related to expired events that are considered critical for the target OLAP analysis scenario. Finally, a comprehensive experimental evaluation over several classes of data stream sets confirms the benefits deriving from the event-based data stream compression approach proposed in ECM-DS. © 2010 Elsevier B.V. All rights reserved.
Pages: 678-708
Page count: 31