Sparse RNA folding: Time and space efficient algorithms

被引:36
作者
Backofen, Rolf [1 ]
Tsur, Dekel [2 ]
Zakov, Shay [2 ]
Ziv-Ukelson, Michal [2 ]
机构
[1] Albert Ludwigs Univ, Freiburg, Germany
[2] Ben Gurion Univ Negev, Dept Comp Sci, Beer Sheva, Israel
关键词
RNA folding; Time complexity; Space complexity; Dynamic programming; Sparsification;
D O I
10.1016/j.jda.2010.09.001
中图分类号
O29 [应用数学];
学科分类号
070104 [应用数学];
摘要
The currently fastest algorithm for RNA Single Strand Folding requires O(nZ) time and Theta(n(2)) space, where n denotes the length of the input string and Z is a sparsity parameter satisfying n <= Z < n(2). We show how to reduce the time and space complexities of this algorithm in the sparse case. The space reduction is based on the observation that some solutions for sub-instances are not examined after a certain stage of the algorithm, and may be discarded from memory. The running time speed up is achieved by combining two independent sparsification criteria, which restrict the number of expressions that need to be examined in bottleneck computations of the algorithm. This yields an O(n(2) + P Z) time and Theta(Z) space algorithm, where P is a sparsity parameter satisfying P < n <= Z <= n(P + 1). For the base-pairing maximization variant, the time complexity is further reduced to O(LZ), where L denotes the maximum number of base-pairs in a folding of the input string and satisfies L <= n\2. The presented techniques also extend to the related RNA Simultaneous Alignment and Folding problem. For an input composed of two strings of lengths n and m, the time and space complexities are reduced from O(nm (Z) over tilde) and Theta(n(2)m(2)) down to O(n(2)m(2) + (P) over tilde(Z) over tilde) and Theta(nm(2) + (Z) over tilde) respectively, where (Z) over tilde and (P) over tilde are sparsity parameters satisfying (P) over tilde < nm <= (Z) over tilde <nm(<(P)over tilde> + 3). A preliminary extended abstract of this work previously appeared in Backofen et al. (2009) [5]. Code implementations (in Java) may be downloaded from: http://www.cs.bgu.ac.il/similar to zakovs/RNAfold/SparseFold. zip. (C) 2010 Elsevier B. V. All rights reserved.
引用
收藏
页码:12 / 31
页数:20
相关论文
共 41 条
[2]
RNA-RNA interaction prediction and antisense RNA target search [J].
Alkan, C ;
Karakoç, E ;
Nadeau, JH ;
Sahinalp, SC ;
Zhang, KH .
JOURNAL OF COMPUTATIONAL BIOLOGY, 2006, 13 (02) :267-282
[3]
Efficient parameter estimation for RNA secondary structure prediction [J].
Andronescu, Mirela ;
Condon, Anne ;
Hoos, Holger H. ;
Mathews, David H. ;
Murphy, Kevin P. .
BIOINFORMATICS, 2007, 23 (13) :I19-I28
[4]
NEW CLIQUE AND INDEPENDENT SET ALGORITHMS FOR CIRCLE GRAPHS [J].
APOSTOLICO, A ;
ATALLAH, MJ ;
HAMBRUSCH, SE .
DISCRETE APPLIED MATHEMATICS, 1992, 36 (01) :1-24
[5]
RNAs everywhere:: Genome-wide annotation of structured RNAs [J].
Backofen, Rolf ;
Bernhart, Stephan H. ;
Flamm, Christoph ;
Fried, Claudia ;
Fritzsch, Guido ;
Hackermueller, Joerg ;
Hertel, Jana ;
Hofacker, Ivo L. ;
Missal, Kristin ;
Mosig, Axel ;
Prohaska, Sonja J. ;
Rose, Dominic ;
Stadler, Peter F. ;
Tanzer, Andrea ;
Washietl, Stefan ;
Will, Sebastian .
JOURNAL OF EXPERIMENTAL ZOOLOGY PART B-MOLECULAR AND DEVELOPMENTAL EVOLUTION, 2007, 308B (01) :1-25
[6]
Backofen R, 2009, LECT NOTES COMPUT SC, V5577, P249, DOI 10.1007/978-3-642-02441-2_22
[7]
TRAINABLE GRAMMARS FOR SPEECH RECOGNITION [J].
BAKER, JK .
JOURNAL OF THE ACOUSTICAL SOCIETY OF AMERICA, 1979, 65 :S132-S132
[8]
MORE ALGORITHMS FOR ALL-PAIRS SHORTEST PATHS IN WEIGHTED GRAPHS [J].
Chan, Timothy M. .
SIAM JOURNAL ON COMPUTING, 2010, 39 (05) :2075-2089
[9]
A partition function algorithm for interacting nucleic acid strands [J].
Chitsaz, Hamidreza ;
Salari, Raheleh ;
Sahinalp, S. Cenk ;
Backofen, Rolf .
BIOINFORMATICS, 2009, 25 (12) :I365-I373
[10]
Cocke John, 1970, PROGRAMMING LANGUAGE