Improved RNA secondary structure prediction by maximizing expected pair accuracy

Cited by: 149
Authors
Lu, Zhi John [1]
Gloor, Jason W. [1]
Mathews, David H. [1, 2]
Affiliations
[1] Univ Rochester, Med Ctr, Dept Biochem & Biophys, Rochester, NY 14642 USA
[2] Univ Rochester, Med Ctr, Dept Biostat & Computat Biol, Rochester, NY 14642 USA
Funding
National Institutes of Health (USA)
Keywords
RNA secondary structure; free energy minimization; partition function; nearest-neighbor model; NEAREST-NEIGHBOR PARAMETERS; FREE-ENERGY MINIMIZATION; RIBOSOMAL-RNA; THERMODYNAMIC PARAMETERS; PARTITION-FUNCTION; MULTIBRANCH LOOPS; NONCODING RNAS; BASE-PAIRS; ALGORITHM; DATABASE;
DOI
10.1261/rna.1643609
Chinese Library Classification (CLC) number
Q5 [Biochemistry]; Q7 [Molecular Biology]
Discipline classification codes
071010; 081704
Abstract
Free energy minimization has been the most popular method for RNA secondary structure prediction for decades. It is based on a set of empirical free energy change parameters derived from experiments using a nearest-neighbor model. In this study, a program, MaxExpect, that predicts RNA secondary structure by maximizing the expected base-pair accuracy is reported. This approach was pioneered in the program CONTRAfold, which uses pair probabilities predicted with a statistical learning method. Here, a partition function calculation that utilizes the free energy change nearest-neighbor parameters is used to predict base-pair probabilities as well as the probabilities of nucleotides being single-stranded. MaxExpect predicts both the optimal structure (the one with the highest expected pair accuracy) and suboptimal structures that serve as alternative hypotheses for the structure. Tested on a large database of different types of RNA, the maximum expected accuracy structures are, on average, more accurate than minimum free energy structures. Accuracy is measured by sensitivity, the percentage of known base pairs correctly predicted, and positive predictive value (PPV), the percentage of predicted pairs that are in the known structure. By favoring double-strandedness or single-strandedness, prediction can be tuned toward higher sensitivity or higher PPV, respectively. Using MaxExpect, the average PPV of the optimal structure improves from 66% to 68% at the same sensitivity level (73%), compared with free energy minimization.
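The idea summarized in the abstract can be illustrated with a minimal Python sketch. It is not the MaxExpect implementation. It assumes a Nussinov-style recursion, an objective that sums `2 * gamma * P_pair` over predicted pairs plus the single-stranded probability of every unpaired nucleotide, and exact-match scoring for sensitivity and PPV. The names `mea_structure`, `sensitivity_ppv`, `gamma`, and `min_loop` are hypothetical, and the pair probabilities are supplied by hand here rather than computed from a partition function.

```python
from typing import Dict, List, Set, Tuple

Pair = Tuple[int, int]  # (i, j) with i < j, 0-based nucleotide indices


def mea_structure(n: int, p_pair: Dict[Pair, float],
                  gamma: float = 1.0, min_loop: int = 3) -> Set[Pair]:
    """Return a nested set of pairs maximizing the assumed expected-accuracy score."""
    # Single-stranded probability = 1 - sum of pairing probabilities.
    p_single: List[float] = [1.0] * n
    for (i, j), p in p_pair.items():
        p_single[i] -= p
        p_single[j] -= p

    best = [[0.0] * n for _ in range(n)]   # best[i][j]: score of segment i..j
    choice = [[-1] * n for _ in range(n)]  # -1: i unpaired, else partner of i

    def get(i: int, j: int) -> float:
        return best[i][j] if i <= j else 0.0  # empty segments score 0

    for span in range(n):                   # span = j - i
        for i in range(n - span):
            j = i + span
            value = p_single[i] + get(i + 1, j)        # case 1: i unpaired
            for k in range(i + min_loop + 1, j + 1):   # case 2: i pairs with k
                p = p_pair.get((i, k), 0.0)
                if p > 0.0:
                    cand = 2.0 * gamma * p + get(i + 1, k - 1) + get(k + 1, j)
                    if cand > value:
                        value, choice[i][j] = cand, k
            best[i][j] = value

    # Trace back the chosen pairs.
    pairs: Set[Pair] = set()
    stack = [(0, n - 1)]
    while stack:
        i, j = stack.pop()
        if i > j:
            continue
        k = choice[i][j]
        if k < 0:
            stack.append((i + 1, j))
        else:
            pairs.add((i, k))
            stack.append((i + 1, k - 1))
            stack.append((k + 1, j))
    return pairs


def sensitivity_ppv(predicted: Set[Pair], known: Set[Pair]) -> Tuple[float, float]:
    """Sensitivity = correct / known pairs; PPV = correct / predicted pairs."""
    correct = len(predicted & known)
    sensitivity = correct / len(known) if known else 0.0
    ppv = correct / len(predicted) if predicted else 0.0
    return sensitivity, ppv


# Toy input: made-up probabilities for a 10-nucleotide sequence.
toy_probs = {(0, 9): 0.9, (1, 8): 0.8, (2, 7): 0.1}
predicted = mea_structure(10, toy_probs, gamma=1.0)
```

In this sketch a larger `gamma` rewards pairing and so tends to raise sensitivity, while a smaller `gamma` rewards single-strandedness and tends to raise PPV, mirroring the trade-off described in the abstract.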
Pages: 1805-1813
Number of pages: 9