HiTEC: accurate error correction in high-throughput sequencing data

Cited by: 89
Authors
Ilie, Lucian [1]
Fazayeli, Farideh [1]
Ilie, Silvana [2]
Affiliations
[1] Univ Western Ontario, Dept Comp Sci, London, ON N6A 5B7, Canada
[2] Ryerson Univ, Dept Math, Toronto, ON M5B 2K3, Canada
Funding
Natural Sciences and Engineering Research Council of Canada (NSERC)
Keywords
SHORT DNA-SEQUENCES; READS; ALIGNMENT; GENOME; OLIGONUCLEOTIDES; TECHNOLOGY; ALGORITHM; MILLIONS; PROGRAM
DOI
10.1093/bioinformatics/btq653
Chinese Library Classification (CLC) number
Q5 [Biochemistry]
Subject classification codes
071010; 081704
Abstract
Motivation: High-throughput sequencing technologies produce very large amounts of data and sequencing errors constitute one of the major problems in analyzing such data. Current algorithms for correcting these errors are not very accurate and do not automatically adapt to the given data.
Results: We present HiTEC, an algorithm that provides a highly accurate, robust and fully automated method to correct reads produced by high-throughput sequencing methods. Our approach provides significantly higher accuracy than previous methods. It is time and space efficient and works very well for all read lengths, genome sizes and coverage levels.
Pages: 295-302
Number of pages: 8