Continuously monitoring top-k uncertain data streams: a probabilistic threshold method

被引:17
作者
Hua, Ming [1 ]
Pei, Jian [1 ]
机构
[1] Simon Fraser Univ, Burnaby, BC V5A 1S6, Canada
基金
加拿大自然科学与工程研究理事会;
关键词
Uncertain streams; Probabilistic threshold top-k queries; Query processing; SELECTION; QUERIES;
D O I
10.1007/s10619-009-7043-x
中图分类号
TP [自动化技术、计算机技术];
学科分类号
080201 [机械制造及其自动化];
摘要
Recently, uncertain data processing has become more and more important. Although a significant amount of previous research explores various continuous queries on data streams, continuous queries on uncertain data streams have seldom been investigated. In this paper, we formulate a novel and challenging problem of continuously monitoring top-k uncertain data streams, and propose a probabilistic threshold method. We develop four algorithms systematically: a deterministic exact algorithm, a randomized method, and their space-efficient versions using quantile summaries. An extensive empirical study using real data sets and synthetic data sets is reported to verify the effectiveness and the efficiency of our methods.
引用
收藏
页码:29 / 65
页数:37
相关论文
共 57 条
[1]
Abiteboul S., 1987, SIGMOD Record, V16, P34, DOI 10.1145/38714.38724
[2]
Angluin D., 1977, Proceedings of the ninth annual ACM symposium on Theory of computing (STOC '77), P30
[3]
[Anonymous], 2007, P 26 ACM SIGACT SIGM, DOI DOI 10.1145/1265530.1265571
[4]
[Anonymous], 2005, P 31 INT C VERY LARG
[5]
[Anonymous], SIGMOD
[6]
[Anonymous], 2004, Proceedings of the Thirtieth international conference on Very large data bases-Volume
[7]
[Anonymous], 2003, Proc. of the ACM SIGMOD International Conference on Management of Data, DOI DOI 10.1145/872757
[8]
[Anonymous], 2008, P ACM SIGMOD INT C M, DOI DOI 10.1145/1376616.1376744
[9]
ANTOVA L, 2007, P 2007 IEEE 23 INT C
[10]
ARASU A, 2004, P 23 ACM SIGACT SIGM