RSEARCH: Finding homologs of single structured RNA sequences

Cited by: 177
Authors
Klein, RJ
Eddy, SR [1]
Affiliations
[1] Washington Univ, Sch Med, Howard Hughes Med Inst, St Louis, MO 63110 USA
[2] Washington Univ, Sch Med, Dept Genet, St Louis, MO 63110 USA
DOI
10.1186/1471-2105-4-44
Chinese Library Classification (CLC) number
Q5 [Biochemistry]
Subject classification codes
071010; 081704
Abstract
Background: For many RNA molecules, secondary structure rather than primary sequence is the evolutionarily conserved feature. No programs have yet been published that allow searching a sequence database for homologs of a single RNA molecule on the basis of secondary structure.

Results: We have developed a program, RSEARCH, that takes a single RNA sequence with its secondary structure and utilizes a local alignment algorithm to search a database for homologous RNAs. For this purpose, we have developed a series of base pair and single nucleotide substitution matrices for RNA sequences called RIBOSUM matrices. RSEARCH reports the statistical confidence for each hit as well as the structural alignment of the hit. We show several examples in which RSEARCH outperforms the primary sequence search programs BLAST and SSEARCH. The primary drawback of the program is that it is slow. The C code for RSEARCH is freely available from our lab's website.

Conclusion: RSEARCH outperforms primary sequence programs in finding homologs of structured RNA sequences.
Pages: 16