A suboptimal lossy data compression based on approximate pattern matching

被引:65
作者
Luczak, T [1 ]
Szpankowski, W [1 ]
机构
[1] PURDUE UNIV, DEPT COMP SCI, W LAFAYETTE, IN 47907 USA
基金
美国国家科学基金会;
关键词
approximate pattern matching; generalized Lempel-Ziv scheme; generalized Renyi entropy; lossy data compression; mixing probabilistic model; rate distortion; ASYMPTOTIC PROPERTIES; FIDELITY-CRITERION; UNIVERSAL; ALGORITHM; CONVERGENCE; MISMATCHES; ENTROPY; TREES; RATES; LAW;
D O I
10.1109/18.623143
中图分类号
TP [自动化技术、计算机技术];
学科分类号
0812 ;
摘要
A practical suboptinal (variable source coding) algorithm for lossy data compression is presented, This scheme is based on approximate string matching, and it naturally extends the lossless Lempel-Ziv data compression scheme, Among others we consider the typical length of an approximately repeated pattern within the first n positions of a stationary mixing sequence where D percent of mismatches is allowed, We prove that there exists a constant r(0)(D) such that the length of such an approximately repeated pattern converges in probability to 1/r(0)(D) log n (pr.) but it almost surely oscillates between 1/r(-infinity)(D) log n and 2/r(1)(D) log n, where r(-infinity)(D) > r(0)(D) > r(1)(D)/2 are some constants, These constants are natural generalizations of Renyi entropies to the lossy environment. More importantly, we show that the compression ratio of a lossy data compression scheme based on such an approximate pattern matching is asymptotically equal to ro(D), We also establish the asymptotic behavior of the so-called approximate waiting time N-l which is defined as the time until a pattern of length l repeats approximately for the first time. We prove that log N-l/l --> r(0)(D) (pr.) as l --> infinity. In general, r(0)(D) > R(D) where R(D) is the rate-distortion function, Thus for stationary mixing sequences we settle in the negative the problem recently investigated by Steinberg and Gutman by showing that a lossy extension of the Wyner-Ziv scheme cannot be optimal.
引用
收藏
页码:1439 / 1451
页数:13
相关论文
共 39 条
[1]  
Alon Noga, 1992, The Probabilistic Method
[2]  
[Anonymous], 1976, LECT NOTES MATH
[3]   THE ERDOS-RENYI LAW IN DISTRIBUTION, FOR COIN TOSSING AND SEQUENCE MATCHING [J].
ARRATIA, R ;
GORDON, L ;
WATERMAN, MS .
ANNALS OF STATISTICS, 1990, 18 (02) :539-570
[4]   THE ERDOS-RENYI STRONG LAW FOR PATTERN-MATCHING WITH A GIVEN PROPORTION OF MISMATCHES [J].
ARRATIA, R ;
WATERMAN, MS .
ANNALS OF PROBABILITY, 1989, 17 (03) :1152-1169
[5]  
ATALLAH M, 1996, P INT C IM PROC LAUS, V2, P349
[6]  
ATALLAH MJ, 1992, LECT NOTES COMPUT SC, V644, P27
[7]  
Berger T., 1971, RATE DISTORTION THEO
[8]   A vector quantization approach to universal noiseless coding and quantization [J].
Chon, PA ;
Effros, M ;
Gray, RM .
IEEE TRANSACTIONS ON INFORMATION THEORY, 1996, 42 (04) :1109-1138
[9]   IMPROVED TECHNIQUES FOR SINGLE-PASS ADAPTIVE VECTOR QUANTIZATION [J].
CONSTANTINESCU, C ;
STORER, JA .
PROCEEDINGS OF THE IEEE, 1994, 82 (06) :933-939
[10]  
EFFROS M, 1995, INT CONF ACOUST SPEE, P2343, DOI 10.1109/ICASSP.1995.479962