Efficient parameter estimation for RNA secondary structure prediction

Cited by: 143
Authors
Andronescu, Mirela [1]
Condon, Anne
Hoos, Holger H.
Mathews, David H.
Murphy, Kevin P.
Affiliations
[1] Univ British Columbia, Dept Comp Sci, Vancouver, BC V6T 1Z4, Canada
[2] Univ Rochester, Med Ctr, Dept Biochem & Biophys, Rochester, NY 14642 USA
[3] Univ Rochester, Med Ctr, Dept Biostat & Computat Biol, Rochester, NY 14642 USA
DOI
10.1093/bioinformatics/btm223
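Chinese Library Classification number
Q5 [Biochemistry];
Discipline classification code
071010 [Biochemistry and Molecular Biology]; 081704 [Applied Chemistry];
Abstract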
Motivation: Accurate prediction of RNA secondary structure from the base sequence is an unsolved computational challenge. The accuracy of predictions made by free energy minimization is limited by the quality of the energy parameters in the underlying free energy model. The most widely used model, the Turner99 model, has hundreds of parameters, so a robust parameter estimation scheme should efficiently handle large data sets with thousands of structures. Moreover, the estimation scheme should be trained using available experimental free energy data in addition to structural data.

Results: In this work, we present constraint generation (CG), the first computational approach to RNA free energy parameter estimation that can be efficiently trained on large sets of structural as well as thermodynamic data. Our CG approach employs a novel iterative scheme, whereby the energy values are first computed as the solution to a constrained optimization problem. The newly computed energy parameters are then used to update the constraints on the optimization function, so as to better optimize the energy parameters in the next iteration. Using our method on biologically sound data, we obtain revised parameters for the Turner99 energy model. We show that by using our new parameters, we obtain significant improvements in prediction accuracy over current state-of-the-art methods.
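
The iterative scheme described in the abstract can be illustrated with a toy sketch. This is not the authors' implementation. It assumes a linear energy model over made-up feature counts, a small explicit list of candidate structures per sequence in place of a folding algorithm, and a plain subgradient solver in place of a real constrained optimizer. It uses structural data only, omitting the thermodynamic data the paper also trains on. All names (`energy`, `predict`, `solve`, `constraint_generation`) and all numeric settings (`margin`, `reg`, `lr`, `steps`) are hypothetical.

```python
from typing import List, Sequence, Tuple

Features = List[float]
# (index of the known structure, feature vectors of all candidate structures)
Example = Tuple[int, List[Features]]


def energy(theta: Sequence[float], features: Sequence[float]) -> float:
    """Linear energy model: sum of parameter * feature count."""
    return sum(t * f for t, f in zip(theta, features))


def predict(theta: Sequence[float], candidates: List[Features]) -> int:
    """Index of the minimum-energy candidate (ties go to the lowest index)."""
    return min(range(len(candidates)), key=lambda i: energy(theta, candidates[i]))


def solve(theta0, constraints, margin=1.0, reg=0.1, steps=2000, lr=0.01):
    """Approximately minimize
        reg * ||theta - theta0||^2
        + sum over constraints of max(0, margin + E(known) - E(other))
    by subgradient descent. Assumption: a soft-penalty stand-in for the
    constrained optimization problem, chosen to keep the sketch stdlib-only.
    """
    theta = list(theta0)
    for _ in range(steps):
        grad = [2.0 * reg * (t - t0) for t, t0 in zip(theta, theta0)]
        for known, other in constraints:
            # A constraint is violated when the known structure is not
            # lower in energy than the competitor by at least `margin`.
            if margin + energy(theta, known) - energy(theta, other) > 0:
                for i in range(len(theta)):
                    grad[i] += known[i] - other[i]
        theta = [t - lr * g for t, g in zip(theta, grad)]
    return theta


def constraint_generation(theta0, data: List[Example], max_iters=20):
    """Alternate between predicting with the current parameters and
    re-solving with constraints derived from the mispredictions."""
    theta = list(theta0)
    constraints = []
    for _ in range(max_iters):
        new = []
        for known_idx, candidates in data:
            pred = predict(theta, candidates)
            if pred != known_idx:
                pair = (candidates[known_idx], candidates[pred])
                if pair not in constraints and pair not in new:
                    new.append(pair)
        if not new:  # no new violated constraints: stop
            break
        constraints.extend(new)
        theta = solve(theta0, constraints)
    return theta, constraints


# Toy data: two "sequences", each with hand-made candidate feature vectors.
theta0 = [-1.0, -1.0]
data = [
    (0, [[2.0, 0.0], [0.0, 3.0]]),
    (1, [[1.0, 1.0], [2.0, 0.0], [0.0, 1.0]]),
]
theta, constraints = constraint_generation(theta0, data)
print("correct before:", [predict(theta0, c) == k for k, c in data])
print("correct after: ", [predict(theta, c) == k for k, c in data])
```

In this sketch the regularization term keeps the new parameters close to the starting values, which is one plausible way to revise an existing parameter set rather than fit one from scratch. Whether the paper uses this exact objective is not stated in the abstract.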
Pages: I19-I28
Number of pages: 10