Correcting errors in short reads by multiple alignments

Cited by: 119

Authors:

Salmela, Leena [1]

Schroeder, Jan [2]

Affiliations:

[1] Univ Helsinki, Dept Comp Sci, Helsinki Inst Informat Technol HIT, SF-00510 Helsinki, Finland

[2] Univ Melbourne, Dept Comp Sci & Software Engn, NICTA Victorian Res Lab, Melbourne, Vic, Australia

Source:

BIOINFORMATICS | 2011 / Vol. 27 / Issue 11

Funding:

Australian Research Council; Academy of Finland;

Keywords:

SEQUENCE;

DOI:

10.1093/bioinformatics/btr170

Chinese Library Classification (CLC) number:

Q5 [Biochemistry];

Discipline classification codes:

071010 ; 081704 ;

Abstract:

Motivation: Current sequencing technologies produce a large number of erroneous reads. These sequencing errors are a major obstacle to using the data in de novo sequencing projects, because assemblers have difficulty dealing with errors. Results: We present Coral, which corrects sequencing errors by forming multiple alignments. Unlike previous error correction tools, Coral can also use bases distant from the error in the correction process, because the whole read is present in the alignment. Coral is easily adapted to reads produced by different sequencing technologies, such as the Illumina Genome Analyzer and Roche/454 Life Sciences platforms, because the user can define the sequencing error model. We show that our method reduces the error rate of reads more than previous methods.
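
As a rough illustration of the idea in the abstract (correcting a read from the other reads aligned with it), the following toy Python sketch takes a per-column majority vote over a gapless alignment. It is not Coral's implementation: the function name `consensus_correct`, the `(offset, sequence)` input format, and the `min_support` threshold are all hypothetical, and the sketch assumes the reads are already aligned without indels, whereas the abstract describes a user-definable error model.

```python
from collections import Counter


def consensus_correct(aligned_reads, min_support=0.75):
    """Toy consensus correction over a gapless alignment (illustrative only).

    aligned_reads: list of (offset, sequence) pairs, where offset is the
    hypothetical alignment column at which the read starts.
    Returns the corrected sequences in the same order.
    """
    # Count the bases observed in every alignment column.
    columns = {}
    for offset, seq in aligned_reads:
        for i, base in enumerate(seq):
            columns.setdefault(offset + i, Counter())[base] += 1

    corrected = []
    for offset, seq in aligned_reads:
        fixed = []
        for i, base in enumerate(seq):
            counts = columns[offset + i]
            top_base, top_count = counts.most_common(1)[0]
            support = top_count / sum(counts.values())
            # Replace a base only when the column consensus is strong enough.
            if top_base != base and support >= min_support:
                fixed.append(top_base)
            else:
                fixed.append(base)
        corrected.append("".join(fixed))
    return corrected


# The third read has a substitution (A instead of T) in column 3.
reads = [(0, "ACGTAC"), (1, "CGTACG"), (0, "ACGAAC"), (2, "GTACGT"), (1, "CGTACG")]
result = consensus_correct(reads)
print(result)
```

In this example four of the five reads covering column 3 agree on T, so the mismatching base is replaced. Because every base of each read sits in the alignment, columns far from the error still contribute to deciding which reads belong together, which is the property the abstract highlights.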


Pages: 1455-1461

Page count: 7
