An improved data stream summary: the count-min sketch and its applications

Cited by: 1227
Authors
Cormode, G [1]
Muthukrishnan, S
Affiliations
[1] Rutgers State Univ, Ctr Discrete Math & Comp Sci, DIMACS, Piscataway, NJ USA
[2] Rutgers State Univ, Div Comp & Informat Syst, Piscataway, NJ USA
Source
JOURNAL OF ALGORITHMS-COGNITION INFORMATICS AND LOGIC | 2005 / Vol. 55 / No. 01
Funding
National Science Foundation (USA);
Keywords
DOI
10.1016/j.jalgor.2003.12.001
Chinese Library Classification number
TP301 [Theory, methods];
Discipline classification code
081202;
Abstract
We introduce a new sublinear space data structure, the count-min sketch, for summarizing data streams. Our sketch allows fundamental queries in data stream summarization, such as point, range, and inner product queries, to be approximately answered very quickly; in addition, it can be applied to solve several important problems in data streams, such as finding quantiles, frequent items, etc. The time and space bounds we show for using the CM sketch to solve these problems significantly improve those previously known, typically from a factor of $1/\varepsilon^2$ to $1/\varepsilon$. © 2003 Elsevier Inc. All rights reserved.
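As a rough illustration of the point query mentioned in the abstract, the following is a minimal sketch of a count-min style counter array. It is not taken from the paper's text: the class name `CountMinSketch`, its method names, the seeded linear hash family, and the restriction to non-negative integer keys and non-negative updates are all assumptions made for this example. The sizing (width ⌈e/ε⌉, depth ⌈ln(1/δ)⌉, with δ a failure probability) is the commonly quoted parameterization and is not stated in the abstract itself.

```python
import math
import random


class CountMinSketch:
    """Illustrative count-min sketch for non-negative integer keys (hypothetical API)."""

    _PRIME = (1 << 61) - 1  # Mersenne prime; assumed larger than any key used

    def __init__(self, epsilon: float, delta: float, seed: int = 0):
        # Commonly quoted sizing: width = ceil(e / epsilon), depth = ceil(ln(1 / delta)).
        self.width = math.ceil(math.e / epsilon)
        self.depth = math.ceil(math.log(1.0 / delta))
        self.table = [[0] * self.width for _ in range(self.depth)]
        self.total = 0  # sum of all update counts seen so far
        rng = random.Random(seed)
        # One (a, b) pair per row for h(x) = ((a * x + b) mod p) mod width.
        self.params = [
            (rng.randrange(1, self._PRIME), rng.randrange(0, self._PRIME))
            for _ in range(self.depth)
        ]

    def _columns(self, key: int):
        # Yield the column index of `key` in each row, in row order.
        for a, b in self.params:
            yield ((a * key + b) % self._PRIME) % self.width

    def update(self, key: int, count: int = 1) -> None:
        # Add `count` to one counter per row (assumes count >= 0).
        for row, col in zip(self.table, self._columns(key)):
            row[col] += count
        self.total += count

    def point_query(self, key: int) -> int:
        # Collisions can only inflate a counter, so the minimum over rows
        # is the tightest available overestimate of the true count.
        return min(row[col] for row, col in zip(self.table, self._columns(key)))


sketch = CountMinSketch(epsilon=0.01, delta=0.01)
stream = [7] * 50 + [3] * 20 + list(range(100, 200))
for item in stream:
    sketch.update(item)
estimate = sketch.point_query(7)  # never below the true count of 50
```

With non-negative updates the estimate never undercounts, and memory is fixed at `width * depth` counters regardless of stream length, which is where the $1/\varepsilon$ dependence in the space bound shows up in this sketch.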
Pages: 58-75
Page count: 18
Related papers
31 in total
[1] The space complexity of approximating the frequency moments [J]. Alon, N; Matias, Y; Szegedy, M. JOURNAL OF COMPUTER AND SYSTEM SCIENCES, 1999, 58 (01): 137-147
[2] Alon N., 1999, Proceedings of the Eighteenth ACM SIGMOD-SIGACT-SIGART Symposium on Principles of Database Systems, P10, DOI 10.1145/303976.303978
[3] [Anonymous], P 1998 INT C VER LAR
[4] Babcock B., 2002, Proceedings of the Twenty-First ACM SIGMOD-SIGACT-SIGART Symposium on Principles of Database Systems (PODS), P1, DOI 10.1145/543613.543615
[5] Bar-Yossef Z., 2002, P 6 INT WORKSH RAND, P1
[6] Charikar M., 2002, P 29 INT C AUT LANG, P693, DOI 10.1007/3-540-45465-9_59
[7] Comparing data streams using Hamming norms (How to zero in) [J]. Cormode, G; Datar, M; Indyk, P; Muthukrishnan, S. IEEE TRANSACTIONS ON KNOWLEDGE AND DATA ENGINEERING, 2003, 15 (03): 529-540
[8] Cormode G., 2004, P IEEE INFOCOM
[9] Cormode G., 2003, ACM Transactions on Database Systems (TODS), P296, DOI 10.1145/1061318.1061325
[10] Cormode Graham, 2003, P VLDB, P464