Exploring genomic dark matter: A critical assessment of the performance of homology search methods on noncoding RNA

Cited by: 82
Authors
Freyhult, Eva K.
Bollback, Jonathan P.
Gardner, Paul P. [1]
Affiliations
[1] Univ Copenhagen, Inst Mol Biol & Physiol, Mol Evolut Grp, DK-2100 Copenhagen, Denmark
[2] Univ Copenhagen, Inst Biol, Evolut Dept, DK-2100 Copenhagen, Denmark
[3] Uppsala Univ, Linnaeus Ctr Bioinformat, S-75124 Uppsala, Sweden
DOI
10.1101/gr.5890907
Chinese Library Classification (CLC) number
Q5 [Biochemistry]; Q7 [Molecular Biology]
Discipline classification code
071010; 081704
Abstract
Homology search is one of the most ubiquitous bioinformatic tasks, yet it is unknown how effective the currently available tools are for identifying noncoding RNAs (ncRNAs). In this work, we use reliable ncRNA data sets to assess the effectiveness of methods such as BLAST, FASTA, HMMer, and Infernal. Surprisingly, the most popular homology search methods are often the least accurate. As a result, many studies have used inappropriate tools for their analyses. On the basis of our results, we suggest homology search strategies using the currently available tools and some directions for future development.
Pages: 117-125
Page count: 9
Related papers
56 in total
[1]   BASIC LOCAL ALIGNMENT SEARCH TOOL [J].
ALTSCHUL, SF ;
GISH, W ;
MILLER, W ;
MYERS, EW ;
LIPMAN, DJ .
JOURNAL OF MOLECULAR BIOLOGY, 1990, 215 (03) :403-410
[2]  
[Anonymous], 1978, Atlas of protein sequence and structure
[3]  
Bollback J., 2005, STAT METHODS MOL EVO, P189
[4]   Assessing sequence comparison methods with reliable structurally identified distant evolutionary relationships [J].
Brenner, SE ;
Chothia, C ;
Hubbard, TJP .
PROCEEDINGS OF THE NATIONAL ACADEMY OF SCIENCES OF THE UNITED STATES OF AMERICA, 1998, 95 (11) :6073-6078
[5]   Reconstruction of ancestral protein sequences and its applications [J].
Cai, W ;
Pei, JM ;
Grishin, NV .
BMC EVOLUTIONARY BIOLOGY, 2004, 4 (1)
[6]   The Comparative RNA Web (CRW) Site: an online database of comparative sequence and structure information for ribosomal, intron, and other RNAs (art. no. 2) [J].
Cannone, JJ ;
Subramanian, S ;
Schnare, MN ;
Collett, JR ;
D'Souza, LM ;
Du, YS ;
Feng, B ;
Lin, N ;
Madabusi, LV ;
Müller, KM ;
Pande, N ;
Shang, ZD ;
Yu, N ;
Gutell, RR .
BMC BIOINFORMATICS, 2002, 3 (1)
[7]  
CHAO KM, 1992, COMPUT APPL BIOSCI, V8, P481
[8]  
COLLINS LJ, 2003, APPL BIOINFORMATICS, V2, P85
[9]  
Durbin R., 1998, Biological sequence analysis: Probabilistic models of proteins and nucleic acids
[10]   Profile hidden Markov models [J].
Eddy, SR .
BIOINFORMATICS, 1998, 14 (09) :755-763