A memory-efficient dynamic programming algorithm for optimal alignment of a sequence to an RNA secondary structure

被引:154
作者
Eddy, SR [1 ]
机构
[1] Washington Univ, Sch Med, Howard Hughes Med Inst, St Louis, MO 63110 USA
[2] Washington Univ, Sch Med, Dept Genet, St Louis, MO 63110 USA
关键词
D O I
10.1186/1471-2105-3-18
中图分类号
Q5 [生物化学];
学科分类号
071010 ; 081704 ;
摘要
Background: Covariance models (CMs) are probabilistic models of RNA secondary structure, analogous to profile hidden Markov models of linear sequence. The dynamic programming algorithm for aligning a CM to an RNA sequence of length N is O(N-3) in memory. This is only practical for small RNAs. Results: I describe a divide and conquer variant of the alignment algorithm that is analogous to memory-efficient Myers/Miller dynamic programming algorithms for linear sequence alignment. The new algorithm has an O(N-2 log N) memory complexity, at the expense of a small constant factor in time. Conclusions: Optimal ribosomal RNA structural alignments that previously required up to 150 GB of memory now require less than 270 MB.
引用
收藏
页数:16
相关论文
共 57 条
  • [1] Gapped BLAST and PSI-BLAST: a new generation of protein database search programs
    Altschul, SF
    Madden, TL
    Schaffer, AA
    Zhang, JH
    Zhang, Z
    Miller, W
    Lipman, DJ
    [J]. NUCLEIC ACIDS RESEARCH, 1997, 25 (17) : 3389 - 3402
  • [2] Bateman A, 2004, NUCLEIC ACIDS RES, V32, pD138, DOI [10.1093/nar/gkp985, 10.1093/nar/gkh121, 10.1093/nar/gkr1065]
  • [3] The Ribonuclease P Database
    Brown, JW
    [J]. NUCLEIC ACIDS RESEARCH, 1999, 27 (01) : 314 - 314
  • [4] Brown M. P. S., 2000, Proceedings. Eighth International Conference on Intelligent Systems for Molecular Biology, P57
  • [5] CHIU DKY, 1991, COMPUT APPL BIOSCI, V7, P347
  • [6] CORPET F, 1994, COMPUT APPL BIOSCI, V10, P389
  • [7] FINDING THE HAIRPIN IN THE HAYSTACK - SEARCHING FOR RNA MOTIFS
    DANDEKAR, T
    HENTZE, MW
    [J]. TRENDS IN GENETICS, 1995, 11 (02) : 45 - 50
  • [8] DERIJK P, 1994, NUCLEIC ACIDS RES, V22, P3495
  • [9] Searching for patterns in genomic data
    Dsouza, M
    Larsen, N
    Overbeek, R
    [J]. TRENDS IN GENETICS, 1997, 13 (12) : 497 - 498
  • [10] Durbin R., 1998, BIOL SEQUENCE ANAL P