BatMis: a fast algorithm for k-mismatch mapping

Cited by: 25
Authors
Tennakoon, Chandana [1, 2]
Purbojati, Rikky W. [2]
Sung, Wing-Kin [1, 2]
Affiliations
[1] CeLS, NUS Grad Sch Integrat Sci & Engn, Singapore 117456, Singapore
[2] Natl Univ Singapore, Sch Comp, Computat Biol Lab, Singapore 119077, Singapore
Keywords
LOCAL ALIGNMENT;
DOI
10.1093/bioinformatics/bts339
Chinese Library Classification (CLC) number
Q5 [Biochemistry];
Discipline classification codes
071010; 081704;
Abstract
Motivation: Second-generation sequencing (SGS) generates millions of reads that need to be aligned to a reference genome allowing errors. Although current aligners can efficiently map reads allowing a small number of mismatches, they are not well suited for handling a large number of mismatches. The efficiency of aligners can be improved using various heuristics, but the sensitivity and accuracy of the alignments are sacrificed. In this article, we introduce Basic Alignment tool for Mismatches (BatMis), an efficient method to align short reads to a reference allowing k mismatches. BatMis is a Burrows-Wheeler transformation based aligner that uses a seed and extend approach, and it is an exact method.

Results: Benchmark tests show that BatMis performs better than competing aligners in solving the k-mismatch problem. Furthermore, it can compete favorably even when compared with the heuristic modes of the other aligners. BatMis is a useful alternative for applications where fast k-mismatch mappings, unique mappings or multiple mappings of SGS data are required.
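
To make the k-mismatch problem named in the abstract concrete, the following is a minimal brute-force sketch in Python. It is not BatMis's algorithm: it uses no Burrows-Wheeler index and no seed-and-extend step, and the function name `k_mismatch_positions` is hypothetical. It only illustrates what an exact method must report, namely every reference offset where the read aligns with at most k substitutions.

```python
from typing import List


def k_mismatch_positions(reference: str, read: str, k: int) -> List[int]:
    """Return every 0-based offset where `read` aligns to `reference`
    with at most `k` mismatches (substitutions only, no indels).

    Illustrative naive scan, O(len(reference) * len(read)); an indexed
    aligner avoids scanning every offset like this.
    """
    hits = []
    n, m = len(reference), len(read)
    for start in range(n - m + 1):
        mismatches = 0
        for i in range(m):
            if reference[start + i] != read[i]:
                mismatches += 1
                if mismatches > k:
                    break  # too many mismatches at this offset
        else:
            # Inner loop finished without exceeding k mismatches.
            hits.append(start)
    return hits


if __name__ == "__main__":
    print(k_mismatch_positions("ACGTACGTTT", "ACGA", 1))
```

Because the scan checks every offset, it is exact in the same sense as the abstract uses the word: no qualifying mapping is skipped, at the cost of running time that an index-based aligner is designed to avoid.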
Pages: 2122-2128
Number of pages: 7