On the performance of data compression algorithms based upon string matching

被引:65
作者
Yang, EH [1 ]
Kieffer, JC
机构
[1] Univ Waterloo, Dept Elect & Comp Engn, Waterloo, ON N2L 3G1, Canada
[2] Univ Minnesota, Dept Elect Engn, Minneapolis, MN 55455 USA
基金
美国国家科学基金会;
关键词
data compression algorithms; phi-mixing sources; rate distortion function; string matching;
D O I
10.1109/18.650987
中图分类号
TP [自动化技术、计算机技术];
学科分类号
0812 ;
摘要
Lossless and lossy data compression algorithms based on string matching are considered. In the lossless case, a result of Wyner and Ziv is extended, In the lossy case, a data compression algorithm based on approximate string matching is analyzed in the following two frameworks: 1) the database and the source together form a Markov chain of finite order; 2) the database and the source are independent with the database coming from a Markov model and the source from a general stationary, ergodic model. In either framework, it is shown that the resulting compression rate converges with probability one to a quantity computable as the infimum of an information-theoretic functional over a set of auxiliary random variables; the quantity is strictly greater than the rate distortion function of the source except in some symmetric cases. In particular, this result implies that the lossy algorithm proposed by Steinberg and Gutman is not optimal, even for memoryless or Markov sources.
引用
收藏
页码:47 / 65
页数:19
相关论文
共 24 条
[1]  
[Anonymous], 1971, RATE DISTORTION THEO
[2]  
Billingsley P., 2013, CONVERGE PROBAB MEAS
[3]  
Cover T. M., 2005, ELEM INF THEORY, DOI 10.1002/047174882X
[4]  
CSISZAR I, 1981, INFORMATION THEORY C
[5]  
ELIAS P, 1975, IEEE T INFORM THEORY, V21, P194, DOI 10.1109/TIT.1975.1055349
[6]   ERGODICITY OF MARKOV CHANNELS [J].
GRAY, RM ;
DUNHAM, MO ;
GOBBI, RL .
IEEE TRANSACTIONS ON INFORMATION THEORY, 1987, 33 (05) :656-664
[7]   DISCRETE-TIME DETECTION IN PHI-MIXING NOISE [J].
HALVERSON, DR ;
WISE, GL .
IEEE TRANSACTIONS ON INFORMATION THEORY, 1980, 26 (02) :189-198
[8]   SAMPLE CONVERSES IN SOURCE-CODING THEORY [J].
KIEFFER, JC .
IEEE TRANSACTIONS ON INFORMATION THEORY, 1991, 37 (02) :263-268
[9]   MARKOV CHANNELS ARE ASYMPTOTICALLY MEAN STATIONARY [J].
KIEFFER, JC ;
RAHE, M .
SIAM JOURNAL ON MATHEMATICAL ANALYSIS, 1981, 12 (03) :293-305
[10]  
KOGA H, P 1994 IEEE INT S IN, P264