Compressed bloom filters

被引:365
作者
Mitzenmacher, M [1 ]
机构
[1] Harvard Univ, Cambridge, MA 02138 USA
基金
美国国家科学基金会;
关键词
algorithms; computer networks; information theory; distributed computing; distributed information systems;
D O I
10.1109/TNET.2002.803864
中图分类号
TP3 [计算技术、计算机技术];
学科分类号
0812 ;
摘要
A Bloom filter is a simple space-efficient randomized data structure for representing a set in order to support membership queries. Although Bloom filters allow false positives, for many applications the space savings outweigh this drawback when the probability of an error is sufficiently low. We introduce compressed Bloomfilters, which improve performance when the Bloom filter is passed as a message, and its transmission size is a limiting factor. For example, Bloom filters have been suggested as a means for sharing Web cache information. In this setting, proxies do not share the exact contents of their caches, but instead periodically broadcast Bloom filters representing their caches. By using compressed Bloom filters, proxies can reduce the number of bits broadcast, the false positive probability, and/or the amount of computation per lookup. The cost is the processing time for compression and decompression, which can use simple arithmetic coding, and more memory use at the proxies, which utilize the larger uncompressed form of the Bloom filter.
引用
收藏
页码:604 / 612
页数:9
相关论文
共 19 条
[1]  
Adler M, 1998, RANDOM STRUCT ALGOR, V13, P159, DOI 10.1002/(SICI)1098-2418(199809)13:2<159::AID-RSA3>3.0.CO
[2]  
2-Q
[3]  
ALON N, 1992, PROBABILISTIC METHOD
[4]  
[Anonymous], 2001, ACM
[5]  
Bell T. C., 1999, Managing Gigabytes, V2nd ed
[6]   SPACE/TIME TRADE/OFFS IN HASH CODING WITH ALLOWABLE ERRORS [J].
BLOOM, BH .
COMMUNICATIONS OF THE ACM, 1970, 13 (07) :422-&
[7]  
CARPINELLI J, 1995, SOURCE CODE ARITHMET
[8]  
CARTER L, 1979, J CSS, V18, P143, DOI DOI 10.1016/0022-0000(79)90044-8
[9]  
CZERWINSKI SE, 1999, P 5 ANN ACM IEEE INT, P24
[10]  
FAN L, 1999, 1361 U WISC MAD COMP