A statistical sampling algorithm for RNA secondary structure prediction

Times cited: 386
Authors
Ding, Y [1]
Lawrence, CE [1]
Affiliations
[1] New York State Dept Hlth, Wadsworth Ctr Labs & Res, Bioinformat Ctr, Albany, NY 12208 USA
DOI
10.1093/nar/gkg938
Chinese Library Classification number
Q5 [Biochemistry]; Q7 [Molecular Biology]
Discipline classification code
071010; 081704
Abstract
An RNA molecule, particularly a long-chain mRNA, may exist as a population of structures. Furthermore, multiple structures have been demonstrated to play important functional roles. Thus, a representation of the ensemble of probable structures is of interest. We present a statistical algorithm to sample rigorously and exactly from the Boltzmann ensemble of secondary structures. The forward step of the algorithm computes the equilibrium partition functions of RNA secondary structures with recent thermodynamic parameters. Using conditional probabilities computed with the partition functions in a recursive sampling process, the backward step of the algorithm quickly generates a statistically representative sample of structures. With cubic run time for the forward step, quadratic run time in the worst case for the sampling step, and quadratic storage, the algorithm is efficient for broad applicability. We demonstrate that, by classifying sampled structures, the algorithm enables a statistical delineation and representation of the Boltzmann ensemble. Applications of the algorithm show that alternative biological structures are revealed through sampling. Statistical sampling provides a means to estimate the probability of any structural motif, with or without constraints. For example, the algorithm enables probability profiling of single-stranded regions in RNA secondary structure. Probability profiling for specific loop types is also illustrated. By overlaying probability profiles, a mutual accessibility plot can be displayed for predicting RNA:RNA interactions. Boltzmann probability-weighted density of states and free energy distributions of sampled structures can be readily computed. We show that a sample of moderate size from the ensemble of an enormous number of possible structures is sufficient to guarantee statistical reproducibility in the estimates of typical sampling statistics. 
Our applications suggest that the sampling algorithm may be well suited to prediction of mRNA structure and target accessibility. The algorithm is applicable to the rational design of small interfering RNAs (siRNAs), antisense oligonucleotides, and trans-cleaving ribozymes in gene knock-down studies.
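The forward (partition function) and backward (sampling) steps described in the abstract can be illustrated with a minimal sketch. It assumes a toy energy model in which every canonical or G-U pair contributes -1 in units of RT and a hairpin loop needs at least 3 unpaired bases; the paper itself uses thermodynamic parameters, which are not reproduced here. All function and constant names below are hypothetical.

```python
import math
import random

# Toy model (an assumption, not the paper's thermodynamic parameters):
# each allowed pair has Boltzmann weight exp(1), i.e. energy -1 RT.
PAIRS = {("A", "U"), ("U", "A"), ("G", "C"), ("C", "G"), ("G", "U"), ("U", "G")}
MIN_LOOP = 3
PAIR_WEIGHT = math.exp(1.0)

def partition_table(seq):
    """Forward step: q[i][j] is the partition function of seq[i:j] (cubic time)."""
    n = len(seq)
    q = [[1.0] * (n + 1) for _ in range(n + 1)]
    for length in range(1, n + 1):
        for i in range(n - length + 1):
            j = i + length
            total = q[i][j - 1]  # last base (index j-1) unpaired
            for k in range(i, j - 1 - MIN_LOOP):  # last base paired with k
                if (seq[k], seq[j - 1]) in PAIRS:
                    total += q[i][k] * PAIR_WEIGHT * q[k + 1][j - 1]
            q[i][j] = total
    return q

def sample_structure(seq, q, rng):
    """Backward step: draw one structure with probability weight / q[0][n]."""
    pairs = []
    stack = [(0, len(seq))]
    while stack:
        i, j = stack.pop()
        if j - i <= MIN_LOOP + 1:
            continue  # interval too short to contain a pair
        r = rng.random() * q[i][j] - q[i][j - 1]
        if r < 0:
            stack.append((i, j - 1))  # last base sampled as unpaired
            continue
        for k in range(i, j - 1 - MIN_LOOP):
            if (seq[k], seq[j - 1]) in PAIRS:
                r -= q[i][k] * PAIR_WEIGHT * q[k + 1][j - 1]
                if r < 0:
                    pairs.append((k, j - 1))
                    stack.append((i, k))
                    stack.append((k + 1, j - 1))
                    break
        else:
            stack.append((i, j - 1))  # rounding fallthrough: treat as unpaired
    return sorted(pairs)

def dot_bracket(n, pairs):
    """Render a list of (i, j) pairs as a dot-bracket string."""
    s = ["."] * n
    for a, b in pairs:
        s[a], s[b] = "(", ")"
    return "".join(s)

def unpaired_profile(seq, n_samples=1000, seed=0):
    """Estimate, per base, the probability of being unpaired from a sample."""
    rng = random.Random(seed)
    q = partition_table(seq)
    counts = [0] * len(seq)
    for _ in range(n_samples):
        paired = {x for p in sample_structure(seq, q, rng) for x in p}
        for pos in range(len(seq)):
            if pos not in paired:
                counts[pos] += 1
    return [c / n_samples for c in counts]

if __name__ == "__main__":
    seq = "GGGAAAUCCC"
    q = partition_table(seq)
    rng = random.Random(0)
    for _ in range(5):
        print(dot_bracket(len(seq), sample_structure(seq, q, rng)))
```

Under these assumptions, `unpaired_profile` is a rough analogue of the probability profiling of single-stranded regions mentioned above: the frequency of a motif in the sample estimates its Boltzmann probability.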
Pages: 7280-7301
Page count: 22