TagDust-a program to eliminate artifacts from next generation sequencing data

Cited by: 183
Authors
Lassmann, Timo [1]
Hayashizaki, Yoshihide [1]
Daub, Carsten O. [1]
Affiliations
[1] RIKEN Yokohama Institute, Omics Science Center, Tsurumi-ku, Yokohama, Kanagawa 230-0045, Japan
DOI
10.1093/bioinformatics/btp527
Chinese Library Classification
Q5 [Biochemistry]
Discipline classification code
070307 [Chemical Biology]
Abstract
Motivation: Next-generation parallel sequencing technologies produce large quantities of short sequence reads. Because of the experimental procedures involved, various types of artifacts are commonly sequenced alongside the targeted RNA or DNA sequences. Identifying such artifacts is important both during the development of novel sequencing assays and for the downstream analysis of the sequenced libraries. Results: Here we present TagDust, a program that identifies artifactual sequences in large sequencing runs. Given a user-defined cutoff for the false discovery rate, TagDust identifies all reads that can be explained by combinations of, and partial matches to, known sequences used during library preparation. We demonstrate the quality of our method on sequencing runs performed on Illumina's Genome Analyzer platform.
Pages: 2839-2840
Page count: 2
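
Note on the false discovery rate cutoff mentioned in the abstract: the abstract does not say how the cutoff is applied, so the following is only a minimal sketch of one standard way to do it, the Benjamini-Hochberg step-up procedure. It assumes that each read has already been assigned a p-value for being explainable by library sequences. The function name, the example p-values, and the cutoff are hypothetical and are not taken from TagDust itself.

```python
def benjamini_hochberg(p_values, fdr=0.01):
    """Return the indices of p-values rejected at the given FDR cutoff.

    Illustrative Benjamini-Hochberg step-up procedure; not TagDust's code.
    """
    m = len(p_values)
    # Indices of the p-values, ordered from smallest to largest p-value.
    order = sorted(range(m), key=lambda i: p_values[i])
    k = 0
    for rank, idx in enumerate(order, start=1):
        # Step-up rule: remember the largest rank whose p-value is
        # at or below (rank / m) * fdr.
        if p_values[idx] <= rank / m * fdr:
            k = rank
    # Everything ranked at or below k is rejected.
    return sorted(order[:k])


# Hypothetical per-read p-values (one per read, in read order).
read_p_values = [0.0001, 0.045, 0.0004, 0.6, 0.011]
flagged = benjamini_hochberg(read_p_values, fdr=0.05)
print(flagged)  # indices of reads that would be flagged as artifacts
```

Under this scheme, a stricter cutoff flags fewer reads, and a read can be flagged even if its own p-value misses its rank threshold, provided a higher-ranked read passes.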