BlastR: fast and accurate database searches for non-coding RNAs

Cited by: 23
Authors
Bussotti, Giovanni [1, 2]
Raineri, Emanuele [1, 2, 3]
Erb, Ionas [1, 2]
Zytnicki, Matthias [1, 2, 4]
Wilm, Andreas [5]
Beaudoing, Emmanuel [1, 2, 6]
Bucher, Philipp [7, 8]
Notredame, Cedric [1, 2]
Affiliations
[1] CRG, Bioinformat & Genom Program, Barcelona 08003, Spain
[2] UPF, Barcelona 08003, Spain
[3] CNAG Ctr Nacl Anal Genom, E-08028 Barcelona, Spain
[4] URGI INRA Versailles, Dept Plant Breeding & Genet, F-78026 Versailles, France
[5] Univ Coll Dublin, Conway Inst Biomol & Biomed Sci, Dublin 4, Ireland
[6] Univ Lausanne, Ctr Integrat Genom, Genom Technol Facil, CH-1015 Lausanne, Switzerland
[7] Ecole Polytech Fed Lausanne, ISREC, Sch Life Sci, CH-1015 Lausanne, Switzerland
[8] SIB, CH-1015 Lausanne, Switzerland
Funding
National Research Foundation, Singapore;
Keywords
SECONDARY STRUCTURE; SEQUENCE ALIGNMENT; NUCLEOTIDE; IDENTIFICATION; SUBSTITUTION; ALGORITHM; EVOLUTION; HOMOLOGS; ELEMENTS; MODELS;
DOI
10.1093/nar/gkr335
Chinese Library Classification number
Q5 [Biochemistry]; Q7 [Molecular Biology];
Discipline classification code
071010; 081704;
Abstract
We present and validate BlastR, a method for efficiently and accurately searching non-coding RNAs. Our approach relies on the comparison of di-nucleotides using BlosumR, a new log-odd substitution matrix. In order to use BlosumR for comparison, we recoded RNA sequences into protein-like sequences. We then showed that BlosumR can be used along with the BlastP algorithm in order to search non-coding RNA sequences. Using Rfam as a gold standard, we benchmarked this approach and show BlastR to be more sensitive than BlastN. We also show that BlastR is both faster and more sensitive than BlastP used with a single nucleotide log-odd substitution matrix. BlastR, when used in combination with WU-BlastP, is about 5% more accurate than WU-BlastN and about 50 times slower. The approach shown here is equally effective when combined with the NCBI-Blast package. The software is an open source freeware available from www.tcoffee.org/blastr.html.
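The recoding step described in the abstract can be sketched as follows. This is only an illustration under stated assumptions, not the BlastR implementation: the abstract says that RNA sequences are recoded into protein-like sequences so that di-nucleotides can be compared, but it does not give the mapping. The 16-letter table `DINUC_TO_AA`, the use of overlapping di-nucleotides, and the function name `recode_rna` are all hypothetical choices made here for the example.

```python
from itertools import product

NUCLEOTIDES = "ACGU"

# Assumption: 16 amino-acid letters picked arbitrarily for illustration.
# This is NOT the mapping used by BlastR, which the abstract does not specify.
AA_LETTERS = "ACDEFGHIKLMNPQRS"

# One protein-like letter per di-nucleotide (AA, AC, AG, AU, CA, ..., UU).
DINUC_TO_AA = {
    a + b: AA_LETTERS[i]
    for i, (a, b) in enumerate(product(NUCLEOTIDES, repeat=2))
}


def recode_rna(seq: str) -> str:
    """Recode an RNA sequence as a protein-like string.

    Assumption: one output letter per overlapping di-nucleotide, so a
    sequence of length n yields n - 1 letters.
    """
    rna = seq.upper().replace("T", "U")  # accept DNA-style input as well
    unknown = set(rna) - set(NUCLEOTIDES)
    if unknown:
        raise ValueError(f"unsupported symbols: {sorted(unknown)}")
    return "".join(DINUC_TO_AA[rna[i:i + 2]] for i in range(len(rna) - 1))


if __name__ == "__main__":
    print(recode_rna("GGAUCC"))
```

A string recoded this way could then be handed to a protein search program together with a 16-letter substitution matrix, which is the role the abstract assigns to BlosumR and BlastP.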
Pages: 6886-6895
Number of pages: 10