Consensus folding of unaligned RNA sequences revisited

Cited by: 22
Authors
Bafna, V
Tang, HX
Zhang, SJ [1]
Affiliations
[1] Univ Calif San Diego, Dept Comp Sci & Engn, La Jolla, CA 92093 USA
[2] Indiana Univ, Sch Informat, Bloomington, IN 47408 USA
[3] Indiana Univ, Ctr Gen & Bioinformat, Bloomington, IN 47408 USA
Keywords
RNA secondary structure prediction; RNA consensus folding; RNA stack configuration; dynamic programming
DOI
10.1089/cmb.2006.13.283
Chinese Library Classification
Q5 [Biochemistry]
Subject classification codes
071010; 081704
Abstract
As one of the earliest problems in computational biology, the RNA secondary structure prediction problem (sometimes referred to as "RNA folding") has attracted renewed attention, thanks to the recent discovery of many novel non-coding RNA molecules. The two common approaches to this problem are de novo prediction of RNA secondary structure based on energy minimization, and consensus folding, which computes the common secondary structure for a set of unaligned RNA sequences. Consensus folding algorithms work well when the correct seed alignment is part of the input. However, constructing the seed alignment is itself a challenging problem for diverged RNA families. In this paper, we propose a novel framework to predict the common secondary structure for unaligned RNA sequences. By matching putative stacks in RNA sequences, we make use of both primary sequence information and thermodynamic stability at the same time. We show that our method can predict the correct common RNA secondary structures even when given only a limited number of unaligned RNA sequences, and that it outperforms current algorithms in sensitivity and accuracy.
Pages: 283-295
Page count: 13