MODELS OF BITMAP GENERATION - A SYSTEMATIC-APPROACH TO BITMAP COMPRESSION

被引:4
作者
BOOKSTEIN, A
KLEIN, ST
机构
[1] BAR ILAN UNIV,DEPT ECON & BA,IL-52900 RAMAT GAN,ISRAEL
[2] BAR ILAN UNIV,DEPT MATH & COMP SCI,IL-52900 RAMAT GAN,ISRAEL
关键词
D O I
10.1016/0306-4573(92)90065-8
中图分类号
TP [自动化技术、计算机技术];
学科分类号
0812 ;
摘要
In large IR systems, information about word occurrence may be stored in the form of a bit matrix, with rows corresponding to different words and columns to documents. Such a matrix is generally very large and very sparse. New methods for compressing such matrices are presented, which exploit possible correlations between rows and between columns. The methods are based on partitioning the matrix into small blocks and predicting the 1-bit distribution within a block by means of various bit generation models. Each block is then encoded using Huffman or arithmetic coding. The methods also use a new way of enumerating subsets of fixed size from a given superset. Preliminary experimental results indicate improvements over previous methods.
引用
收藏
页码:735 / 748
页数:14
相关论文
共 20 条
[1]   ROBUST TRANSMISSION OF UNBOUNDED STRINGS USING FIBONACCI REPRESENTATIONS [J].
APOSTOLICO, A ;
FRAENKEL, AS .
IEEE TRANSACTIONS ON INFORMATION THEORY, 1987, 33 (02) :238-245
[2]  
BELL T, 1989, COMPUT SURV, V21, P557, DOI 10.1145/76894.76896
[3]  
BOOKSTEIN A, 1991, P DATA COMPRESSION B, P402
[4]  
BOOKSTEIN A, 1990, 13TH P ACM SIGIR C B, P327
[5]  
BOOKSTEIN A, 1991, 14TH P ACM SIGIR C, P63
[6]  
CHOUEKA Y, 1986, 9TH P ACM SIGIR C PI, P88
[7]  
COVER TM, 1973, IEEE T INFORM THEORY, V19, P73, DOI 10.1109/TIT.1973.1054929
[8]  
ELIAS P, 1975, IEEE T INFORM THEORY, V21, P194, DOI 10.1109/TIT.1975.1055349
[9]  
FELLER W, 1950, INTRO PROBABILITY TH, V1
[10]  
HAMMING R, 1980, CODING INFORMATION T