SimulFold: Simultaneously inferring RNA structures including pseudoknots, alignments, and trees using a Bayesian MCMC framework

Cited by: 53
Authors
Meyer, Irmtraud M. [1]
Miklos, Istvan
Affiliations
[1] Univ British Columbia, Bioinformat Ctr, Vancouver, BC V5Z 1M9, Canada
[2] Univ British Columbia, Dept Comp Sci, Vancouver, BC V6T 1W5, Canada
[3] Hungarian Acad Sci, Alfred Renyi Inst Math, Budapest, Hungary
[4] Hungarian Acad Sci, Comp & Automat Res Inst, Budapest, Hungary
[5] Eotvos Lorand Univ, Sci Reg Knowledge Ctr, Budapest, Hungary
DOI
10.1371/journal.pcbi.0030149
Chinese Library Classification (CLC) number
Q5 [Biochemistry]
Discipline classification codes
071010; 081704
Abstract
Computational methods for predicting evolutionarily conserved rather than thermodynamic RNA structures have recently attracted increased interest. These methods are indispensable not only for elucidating the regulatory roles of known RNA transcripts, but also for predicting RNA genes. It has been notoriously difficult to devise them to make the best use of the available data and to predict high-quality RNA structures that may also contain pseudoknots. We introduce a novel theoretical framework for co-estimating an RNA secondary structure including pseudoknots, a multiple sequence alignment, and an evolutionary tree, given several RNA input sequences. We also present an implementation of the framework in a new computer program, called SimulFold, which employs a Bayesian Markov chain Monte Carlo method to sample from the joint posterior distribution of RNA structures, alignments, and trees. We use the new framework to predict RNA structures, and comprehensively evaluate the quality of our predictions by comparing our results to those of several other programs. We also present preliminary data that show SimulFold's potential as an alignment and phylogeny prediction method. SimulFold overcomes many conceptual limitations that current RNA structure prediction methods face, introduces several new theoretical techniques, and generates high-quality predictions of conserved RNA structures that may include pseudoknots. It is thus likely to have a strong impact, both on the field of RNA structure prediction and on a wide range of data analyses.
Pages: 1441-1454
Number of pages: 14