The redundancy and distribution of the phrase lengths of the fixed-database Lempel-Ziv algorithm

被引:22
作者
Wyner, AJ
机构
[1] Department of Statistics, University of California, Berkeley
基金
美国国家科学基金会;
关键词
data compression; Lempel-Ziv algorithm; Stein's method; string matching;
D O I
10.1109/18.623144
中图分类号
TP [自动化技术、计算机技术];
学科分类号
0812 ;
摘要
The fixed-database version of the Lempel-Ziv algorithm closely resembles many versions that appear in practice, In this paper, we ascertain several key asymptotic properties of the algorithm as applied to sources with finite memory, First, we determine that for a dictionary of size n, the algorithm achieves a redundancy rho(n) = H log log n/log n + o (log log n/log n) where H is the entropy of the process, This is the first, nontrivial, lower bound on any Lempel-Ziv-type compression scheme, We then find the limiting distribution and all moments of the lengths of the phrases by comparing them to a random-walk-like variable with well-known behavior.
引用
收藏
页码:1452 / 1464
页数:13
相关论文
共 17 条
[1]  
BARBOUR, 1992, POISSON APPROX
[2]  
Billingsley P, 1968, CONVERGE PROBAB MEAS
[3]  
Chung K. L., 1967, MARKOV CHAINS STATIO
[4]   THE ENTROPY OF A RANDOMLY STOPPED SEQUENCE [J].
EKROOT, L ;
COVER, TM .
IEEE TRANSACTIONS ON INFORMATION THEORY, 1991, 37 (06) :1641-1644
[5]  
Hoel P.G., 1972, INTRO STOCHASTIC PRO
[6]  
JACQUET P, 1994, J COMB THEORY, V66
[7]   AVERAGE PROFILE AND LIMITING DISTRIBUTION FOR A PHRASE SIZE IN THE LEMPEL-ZIV PARSING ALGORITHM [J].
LOUCHARD, G ;
SZPANKOWSKI, W .
IEEE TRANSACTIONS ON INFORMATION THEORY, 1995, 41 (02) :478-488
[8]  
LOUCHARD G, 1997, IEEE T INFORM THEORY, V43, P1
[9]   UPPER-BOUNDS ON THE PROBABILITY OF SEQUENCES EMITTED BY FINITE-STATE SOURCES AND ON THE REDUNDANCY OF THE LEMPEL-ZIV ALGORITHM [J].
PLOTNIK, E ;
WEINBERGER, MJ ;
ZIV, J .
IEEE TRANSACTIONS ON INFORMATION THEORY, 1992, 38 (01) :66-72
[10]  
SAVARI A, 1997, IEEE T INFORM THEORY, V43