Accelerated probabilistic inference of RNA structure evolution

Cited by: 78
Author
Holmes I. [1]
Affiliation
[1] Department of Bioengineering, University of California, Berkeley
Keywords
Dynamic Programming Algorithm; Pairwise Alignment; Parse Tree; Intermediate Probability; Nonterminal Symbol
DOI
10.1186/1471-2105-6-73
Abstract
Background: Pairwise stochastic context-free grammars (Pair SCFGs) are powerful tools for evolutionary analysis of RNA, including simultaneous RNA sequence alignment and secondary structure prediction, but the associated algorithms are intensive in both CPU and memory usage. The same problem is faced by other RNA alignment-and-folding algorithms based on Sankoff's 1985 algorithm. It is therefore desirable to constrain such algorithms, by pre-processing the sequences and using this first pass to limit the range of structures and/or alignments that can be considered.

Results: We demonstrate how flexible classes of constraint can be imposed, greatly reducing the computational costs while maintaining a high quality of structural homology prediction. Any score-attributed context-free grammar (e.g. energy-based scoring schemes, or conditionally normalized Pair SCFGs) is amenable to this treatment. It is now possible to combine independent structural and alignment constraints of unprecedented general flexibility in Pair SCFG alignment algorithms. We outline several applications to the bioinformatics of RNA sequence and structure, including Waterman-Eggert N-best alignments and progressive multiple alignment. We evaluate the performance of the algorithm on test examples from the RFAM database.

Conclusion: A program, Stemloc, that implements these algorithms for efficient RNA sequence alignment and structure prediction is available under the GNU General Public License.

© 2005 Holmes; licensee BioMed Central Ltd.
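The idea of using a first pass to limit which alignments are considered can be illustrated on a much simpler problem than Pair SCFG alignment. The sketch below is not Stemloc's algorithm: it is an ordinary global sequence alignment (no structure prediction) in which the dynamic programming recursion only fills cells permitted by an envelope. The function names, the scoring parameters, and the diagonal-band envelope are all assumptions made for illustration.

```python
from typing import Callable, Dict, Tuple

NEG_INF = float("-inf")


def constrained_alignment_score(
    x: str,
    y: str,
    allowed: Callable[[int, int], bool],
    match: float = 1.0,
    mismatch: float = -1.0,
    gap: float = -1.0,
) -> float:
    """Best global alignment score using only cells (i, j) inside the envelope.

    `allowed(i, j)` is a hypothetical envelope predicate; a real first pass
    would derive it from a cheaper pre-alignment of the two sequences.
    """
    n, m = len(x), len(y)
    # Sparse table: only cells inside the envelope are ever stored.
    score: Dict[Tuple[int, int], float] = {(0, 0): 0.0}
    # For brevity the loop still visits every (i, j); a production version
    # would enumerate only the envelope's cells.
    for i in range(n + 1):
        for j in range(m + 1):
            if (i == 0 and j == 0) or not allowed(i, j):
                continue
            best = NEG_INF
            if i > 0 and j > 0:  # align x[i-1] with y[j-1]
                s = match if x[i - 1] == y[j - 1] else mismatch
                best = max(best, score.get((i - 1, j - 1), NEG_INF) + s)
            if i > 0:  # gap in y
                best = max(best, score.get((i - 1, j), NEG_INF) + gap)
            if j > 0:  # gap in x
                best = max(best, score.get((i, j - 1), NEG_INF) + gap)
            score[(i, j)] = best
    # -inf means the envelope excludes every complete alignment.
    return score.get((n, m), NEG_INF)


def band(width: int) -> Callable[[int, int], bool]:
    """Envelope that keeps cells within `width` of the main diagonal."""
    return lambda i, j: abs(i - j) <= width


if __name__ == "__main__":
    print(constrained_alignment_score("ACGU", "AGU", band(1)))
    print(constrained_alignment_score("ACGU", "AGU", band(0)))
```

A narrow envelope cuts the number of stored cells, but if it is too tight it can exclude the best alignment or every alignment, which is the trade-off between cost and prediction quality that the abstract refers to.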
Pages: 22
Related papers
39 entries in total
  • [1] Eddy S.R., Noncoding RNA genes, Current Opinion in Genetics and Development, 9, 6, pp. 695-699, (1999)
  • [2] Mandal M., Boese B., Barrick J.E., Winkler W.C., Breaker R.R., Riboswitches Control Fundamental Biochemical Pathways in Bacillus subtilis and Other Bacteria, Cell, 113, pp. 577-586, (2003)
  • [3] Sijen T., Plasterk R.H., Transposon silencing in the Caenorhabditis elegans germ line by natural RNAi, Nature, 426, 6964, pp. 310-314, (2003)
  • [4] Ambros V., The functions of animal microRNAs, Nature, 431, 7006, pp. 350-355, (2004)
  • [5] Baulcombe D., RNA silencing in plants, Nature, 431, 7006, pp. 356-363, (2004)
  • [6] Rivas E., Eddy S.R., Secondary structure alone is generally not statistically significant for the detection of noncoding RNAs, Bioinformatics, 16, 7, pp. 583-605, (2000)
  • [7] Coventry A., Kleitman D.J., Berger B., MSARI: Multiple sequence alignments for statistical detection of RNA secondary structure, Proceedings of the National Academy of Sciences of the USA, 101, pp. 12102-12107, (2004)
  • [8] Knudsen B., Hein J., RNA secondary structure prediction using stochastic context-free grammars and evolutionary history, Bioinformatics, 15, 6, pp. 446-454, (1999)
  • [9] Rivas E., Eddy S.R., Noncoding RNA gene detection using comparative sequence analysis, BMC Bioinformatics, 2, (2001)
  • [10] Gorodkin J., Heyer L.J., Stormo G.D., Finding the most significant common sequence and structure motifs in a set of RNA sequences, Nucleic Acids Research, 25, 18, pp. 3724-3732, (1997)