Evaluation of Alignment Algorithms for Discovery and Identification of Pathogens Using RNA-Seq

Cited by: 29
Authors
Borozan, Ivan [1]
Watt, Stuart N. [1]
Ferretti, Vincent [1]
Affiliations
[1] Ontario Inst Canc Res, Toronto, ON, Canada
Source
PLOS ONE | 2013 / Volume 8 / Issue 10
Keywords
READ ALIGNMENT; SEQUENCES; ULTRAFAST; IDENTIFY; SOFTWARE; VIRUSES; SEARCH;
DOI
10.1371/journal.pone.0076935
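Chinese Library Classification (CLC) number
O [Mathematical Sciences and Chemistry]; P [Astronomy and Earth Sciences]; Q [Biological Sciences]; N [General Natural Sciences];
Discipline classification codes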
07 ; 0710 ; 09 ;
Abstract
Next-generation sequencing technologies provide an unparalleled opportunity for the characterization and discovery of known and novel viruses. Because viruses are known to have the highest mutation rates when compared to eukaryotic and bacterial organisms, we assess the extent to which eleven well-known alignment algorithms (BLAST, BLAT, BWA, BWA-SW, BWA-MEM, BFAST, Bowtie2, Novoalign, GSNAP, SHRiMP2 and STAR) can be used for characterizing mutated and non-mutated viral sequences - including those that exhibit RNA splicing - in transcriptome samples. To evaluate aligners objectively we developed a realistic RNA-Seq simulation and evaluation framework (RiSER) and propose a new combined score to rank aligners for viral characterization in terms of their precision, sensitivity and alignment accuracy. We used RiSER to simulate both human and viral read sequences and suggest the best set of aligners for viral sequence characterization in human transcriptome samples. Our results show that significant and substantial differences exist between aligners and that a digital-subtraction-based viral identification framework can and should use different aligners for different parts of the process. We determine the extent to which mutated viral sequences can be effectively characterized and show that more sensitive aligners such as BLAST, BFAST, SHRiMP2, BWA-SW and GSNAP can accurately characterize substantially divergent viral sequences with up to 15% overall sequence mutation rate. We believe that the results presented here will be useful to researchers choosing aligners for viral sequence characterization using next-generation sequencing data.
Pages: 17
Related papers
36 entries in total
[1]   BASIC LOCAL ALIGNMENT SEARCH TOOL [J].
ALTSCHUL, SF ;
GISH, W ;
MILLER, W ;
MYERS, EW ;
LIPMAN, DJ .
JOURNAL OF MOLECULAR BIOLOGY, 1990, 215 (03) :403-410
[2]  
Benson DA, 2013, NUCLEIC ACIDS RES, V41, pD36, DOI [10.1093/nar/gkn723, 10.1093/nar/gkp1024, 10.1093/nar/gkw1070, 10.1093/nar/gkr1202, 10.1093/nar/gkx1094, 10.1093/nar/gkl986, 10.1093/nar/gkq1079, 10.1093/nar/gks1195, 10.1093/nar/gkg057]
[3]   Rapid identification of non-human sequences in high-throughput sequencing datasets [J].
Bhaduri, Aparna ;
Qu, Kun ;
Lee, Carolyn S. ;
Ungewickell, Alexander ;
Khavari, Paul A. .
BIOINFORMATICS, 2012, 28 (08) :1174-1175
[4]   CaPSID: A bioinformatics platform for computational pathogen sequence identification in human genomes and transcriptomes [J].
Borozan, Ivan ;
Wilson, Shane ;
Blanchette, Paola ;
Laflamme, Philippe ;
Watt, Stuart N. ;
Krzyzanowski, Paul M. ;
Sircoulomb, Fabrice ;
Rottapel, Robert ;
Branton, Philip E. ;
Ferretti, Vincent .
BMC BIOINFORMATICS, 2012, 13
[5]   VirusSeq: software to identify viruses and their integration sites using next-generation sequencing of human cancer tissue [J].
Chen, Yunxin ;
Yao, Hui ;
Thompson, Erika J. ;
Tannir, Nizar M. ;
Weinstein, John N. ;
Su, Xiaoping .
BIOINFORMATICS, 2013, 29 (02) :266-267
[6]   Metagenomic analysis of coastal RNA virus communities [J].
Culley, Alexander I. ;
Lang, Andrew S. ;
Suttle, Curtis A. .
SCIENCE, 2006, 312 (5781) :1795-1798
[7]   SHRiMP2: Sensitive yet Practical Short Read Mapping [J].
David, Matei ;
Dzamba, Misko ;
Lister, Dan ;
Ilie, Lucian ;
Brudno, Michael .
BIOINFORMATICS, 2011, 27 (07) :1011-1012
[8]   STAR: ultrafast universal RNA-seq aligner [J].
Dobin, Alexander ;
Davis, Carrie A. ;
Schlesinger, Felix ;
Drenkow, Jorg ;
Zaleski, Chris ;
Jha, Sonali ;
Batut, Philippe ;
Chaisson, Mark ;
Gingeras, Thomas R. .
BIOINFORMATICS, 2013, 29 (01) :15-21
[9]  
Drake JW, 1998, GENETICS, V148, P1667
[10]   Human transcriptome subtraction by using short sequence tags to search for tumor viruses in conjunctival carcinoma [J].
Feng, Huichen ;
Taylor, Jennifer L. ;
Benos, Panayiotis V. ;
Newton, Robert ;
Waddell, Keith ;
Lucas, Sebastien B. ;
Chang, Yuan ;
Moore, Patrick S. .
JOURNAL OF VIROLOGY, 2007, 81 (20) :11332-11340