Improved Approximate String Matching and Regular Expression Matching on Ziv-Lempel Compressed Texts

被引:9
作者
Bille, Philip [1 ]
Fagerberg, Rolf [2 ]
Gortz, Inge Li [3 ]
机构
[1] IT Univ Copenhagen, DK-2300 Copenhagen S, Denmark
[2] Univ So Denmark, DK-5230 Odense M, Denmark
[3] Tech Univ Denmark, DK-2800 Lyngby, Denmark
关键词
Compressed pattern matching; approximate string matching; regular expression matching; Ziv-Lempel; ALGORITHM;
D O I
10.1145/1644015.1644018
中图分类号
TP301 [理论、方法];
学科分类号
081202 ;
摘要
We study the approximate string matching and regular expression matching problem for the case when the text to be searched is compressed with the Ziv-Lempel adaptive dictionary compression schemes. We present a time-space trade-off that leads to algorithms improving the previously known complexities for both problems. In particular, we significantly improve the space bounds, which in practical applications are likely to be a bottleneck.
引用
收藏
页数:14
相关论文
共 27 条
[1]  
Aho Alfred V., 2006, Compilers: Principles, Techniques, and Tools, V2nd
[2]   Let sleeping files lie: Pattern matching in Z-compressed files [J].
Amir, A ;
Benson, G ;
Farach, M .
JOURNAL OF COMPUTER AND SYSTEM SCIENCES, 1996, 52 (02) :299-307
[3]  
AMIR A, 1992, PROCEEDINGS OF THE THIRD ANNUAL ACM-SIAM SYMPOSIUM ON DISCRETE ALGORITHMS, P440
[4]  
Amir A., 1992, P 2 IEEE DAT COMPR C, P279
[5]  
[Anonymous], IEEE J COMPUTER, DOI 10.1109/MC.1984.1659158
[6]  
BILLE P, 2005, ARXIVORGCS0509069
[7]  
Bille P, 2006, LECT NOTES COMPUT SC, V4051, P643, DOI 10.1007/11786986_56
[8]   Approximate string matching: A simpler faster algorithm [J].
Cole, R ;
Hariharan, R .
SIAM JOURNAL ON COMPUTING, 2002, 31 (06) :1761-1782
[9]   A subquadratic sequence alignment algorithm for unrestricted scoring matrices [J].
Crochemore, M ;
Landau, GM ;
Ziv-Ukelson, M .
SIAM JOURNAL ON COMPUTING, 2003, 32 (06) :1654-1673
[10]   DYNAMIC PERFECT HASHING - UPPER AND LOWER BOUNDS [J].
DIETZFELBINGER, M ;
KARLIN, A ;
MEHLHORN, K ;
AUFDERHEIDE, FM ;
ROHNERT, H ;
TARJAN, RE .
SIAM JOURNAL ON COMPUTING, 1994, 23 (04) :738-761