Improved software detection and extraction of ITS1 and ITS2 from ribosomal ITS sequences of fungi and other eukaryotes for analysis of environmental sequencing data

Cited by: 1053
Authors
Bengtsson-Palme, Johan [1]
Ryberg, Martin [2]
Hartmann, Martin [3,4]
Branco, Sara [5]
Wang, Zheng [6]
Godhe, Anna [7]
De Wit, Pierre [7]
Sanchez-Garcia, Marisol [8]
Ebersberger, Ingo [9]
de Sousa, Filipe [7]
Amend, Anthony S. [10]
Jumpponen, Ari [11]
Unterseher, Martin [12]
Kristiansson, Erik [13]
Abarenkov, Kessy [14]
Bertrand, Yann J. K. [7]
Sanli, Kemal [7]
Eriksson, K. Martin [15]
Vik, Unni [16]
Veldre, Vilmar
Nilsson, R. Henrik [7]
Affiliations
[1] Univ Gothenburg, Sahlgrenska Acad, Dept Neurosci & Physiol, S-40530 Gothenburg, Sweden
[2] Uppsala Univ, Dept Organismal Biol, S-75236 Uppsala, Sweden
[3] Swiss Fed Res Inst WSL, CH-8903 Birmensdorf, Switzerland
[4] Agroscope Reckenholz Tanikon Res Stn ART, CH-8046 Zurich, Switzerland
[5] Univ Calif Berkeley, Dept Plant & Microbial Biol, Berkeley, CA 94720 USA
[6] Yale Univ, Dept Ecol & Evolutionary Biol, New Haven, CT 06520 USA
[7] Univ Gothenburg, Dept Biol & Environm Sci, S-40530 Gothenburg, Sweden
[8] Univ Tennessee, Dept Ecol & Evolutionary Biol, Knoxville, TN 37996 USA
[9] Goethe Univ Frankfurt, Inst Cell Biol & Neurosci, Dept Appl Bioinformat, D-60438 Frankfurt, Germany
[10] Univ Hawaii Manoa, Dept Bot, Honolulu, HI 96822 USA
[11] Kansas State Univ, Div Biol, Manhattan, KS 66506 USA
[12] Ernst Moritz Arndt Univ Greifswald, Inst Bot & Landscape Ecol, D-17487 Greifswald, Germany
[13] Chalmers Univ Technol, Dept Math Stat, S-41296 Gothenburg, Sweden
[14] Univ Tartu, Nat Hist Museum, EE-51014 Tartu, Estonia
[15] Chalmers Univ Technol, Dept Shipping & Marine Technol, S-41296 Gothenburg, Sweden
[16] Univ Oslo, Dept Biosci, N-0316 Oslo, Norway
Source
METHODS IN ECOLOGY AND EVOLUTION | 2013 / Vol. 4 / Issue 10
Keywords
fungi; molecular ecology; next-generation sequencing; Perl; ribosomal DNA; INTERNAL TRANSCRIBED SPACER; DNA BARCODE; DIVERSITY; ECOLOGY; REGION; EVOLUTION
DOI
10.1111/2041-210X.12073
Chinese Library Classification (CLC) number
Q14 [Ecology (bioecology)]
Subject classification code
071012; 0713
Abstract
The nuclear ribosomal internal transcribed spacer (ITS) region is the primary choice for molecular identification of fungi. Its two highly variable spacers (ITS1 and ITS2) are usually species-specific, whereas the intercalary 5.8S gene is highly conserved. For sequence clustering and BLAST searches, it is often advantageous to rely on one of the variable spacers rather than on the conserved 5.8S gene. Identifying and extracting ITS1 and ITS2 from large taxonomic and environmental data sets is, however, often difficult, and many ITS sequences are incorrectly delimited in the public sequence databases. We introduce ITSx, a Perl-based software tool to extract ITS1, 5.8S and ITS2, as well as full-length ITS sequences, from both Sanger and high-throughput sequencing data sets. ITSx uses hidden Markov models computed from large alignments of a total of 20 groups of eukaryotes, including fungi, metazoans and plants, and the sequence extraction is based on the predicted positions of the ribosomal genes in the sequences. ITSx has a very high proportion of true-positive extractions and a low proportion of false-positive extractions. Additionally, process parallelization permits expedient analyses of very large data sets, such as a one-million-sequence amplicon pyrosequencing data set. ITSx is rich in features and written to be easily incorporated into automated sequence analysis pipelines. ITSx paves the way for more sensitive BLAST searches and sequence clustering operations for the ITS region in eukaryotes. The software also permits elimination of non-ITS sequences from any data set. This is particularly useful for amplicon-based next-generation sequencing data sets, where insidious non-target sequences are often found among the target sequences. Such non-target sequences are difficult to find by other means and would contribute noise to diversity estimates if left in the data set.
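The abstract notes that extraction is based on the predicted positions of the ribosomal genes in each sequence. The Python sketch below illustrates only that final slicing step; it is not ITSx's Perl implementation. It assumes the gene boundaries have already been predicted by some upstream step (for example an HMM search) and are supplied as 0-based, half-open coordinates. The names `GeneHits` and `extract_its_regions` are hypothetical and chosen for illustration.

```python
from typing import Dict, NamedTuple, Optional


class GeneHits(NamedTuple):
    """Hypothetical predicted gene boundaries (0-based, half-open).

    A value of None means the boundary was not detected in the read.
    """
    ssu_end: Optional[int]    # index just past the last SSU (18S) base
    s58_start: Optional[int]  # index of the first 5.8S base
    s58_end: Optional[int]    # index just past the last 5.8S base
    lsu_start: Optional[int]  # index of the first LSU (28S) base


def _slice(seq: str, start: Optional[int], end: Optional[int]) -> Optional[str]:
    """Return seq[start:end], or None if either boundary is missing or invalid."""
    if start is None or end is None:
        return None
    if not (0 <= start < end <= len(seq)):
        return None
    return seq[start:end]


def extract_its_regions(seq: str, hits: GeneHits) -> Dict[str, str]:
    """Cut out every region whose two flanking boundaries were both found."""
    candidates = {
        "ITS1": _slice(seq, hits.ssu_end, hits.s58_start),
        "5.8S": _slice(seq, hits.s58_start, hits.s58_end),
        "ITS2": _slice(seq, hits.s58_end, hits.lsu_start),
        "full_ITS": _slice(seq, hits.ssu_end, hits.lsu_start),
    }
    # Keep only the regions that could actually be delimited.
    return {name: sub for name, sub in candidates.items() if sub is not None}


# Toy sequence: SSU tail + ITS1 + 5.8S + ITS2 + LSU head (made-up bases).
toy = "AAAA" + "CCC" + "GGGGG" + "TT" + "AAAAAA"
full_hits = GeneHits(ssu_end=4, s58_start=7, s58_end=12, lsu_start=14)
partial_hits = GeneHits(ssu_end=None, s58_start=7, s58_end=12, lsu_start=14)

full_regions = extract_its_regions(toy, full_hits)
partial_regions = extract_its_regions(toy, partial_hits)
```

In this sketch a read that lacks the SSU boundary still yields 5.8S and ITS2, while a read with no detected boundaries yields an empty result and could be set aside as a possible non-ITS sequence.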
Pages: 914-919
Number of pages: 6