UCHIME improves sensitivity and speed of chimera detection

Cited by: 13501
Authors
Edgar, Robert C.
Haas, Brian J. [1 ]
Clemente, Jose C. [2 ]
Quince, Christopher [3 ]
Knight, Rob [2 ]
Affiliations
[1] Broad Inst, Genome Sequencing & Anal Program, Cambridge, MA 02142 USA
[2] Univ Colorado, Dept Chem & Biochem, Boulder, CO 80309 USA
[3] Univ Glasgow, Sch Engn, Glasgow G12 8LT, Lanark, Scotland
Funding
Engineering and Physical Sciences Research Council (UK); National Institutes of Health (USA)
Keywords
RIBOSOMAL-RNA GENES; PCR COAMPLIFICATION; SEQUENCE ALIGNMENT; NEW-GENERATION; CONSEQUENCE; FREQUENCY; MOLECULES; PROGRAM; SEARCH; BLAST;
DOI
10.1093/bioinformatics/btr381
Chinese Library Classification
Q5 [Biochemistry]
Discipline classification code
070307 [Chemical Biology]
Abstract
Motivation: Chimeric DNA sequences often form during polymerase chain reaction amplification, especially when sequencing single regions (e.g. 16S rRNA or fungal Internal Transcribed Spacer) to assess diversity or compare populations. Undetected chimeras may be misinterpreted as novel species, causing inflated estimates of diversity and spurious inferences of differences between populations. Detection and removal of chimeras is therefore of critical importance in such experiments.
Results: We describe UCHIME, a new program that detects chimeric sequences with two or more segments. UCHIME either uses a database of chimera-free sequences or detects chimeras de novo by exploiting abundance data. UCHIME has better sensitivity than ChimeraSlayer (previously the most sensitive database method), especially with short, noisy sequences. In testing on artificial bacterial communities with known composition, UCHIME de novo sensitivity is shown to be comparable to Perseus. UCHIME is >100x faster than Perseus and >1000x faster than ChimeraSlayer.
Pages: 2194-2200
Page count: 7