A poor man's BLASTX: high-throughput metagenomic protein database search using PAUDA

Cited by: 40
Authors
Huson, Daniel H. [1 ,2 ]
Xie, Chao [1 ,3 ]
Affiliations
[1] Nanyang Technol Univ, Sch Biol Sci, Singapore Ctr Environm Life Sci Engn, Singapore 637551, Singapore
[2] Univ Tubingen, Ctr Bioinformat, D-72076 Tubingen, Germany
[3] Natl Univ Singapore, Inst Life Sci, Singapore 117456, Singapore
Funding
National Research Foundation, Singapore;
Keywords
ALIGNMENT;
DOI
10.1093/bioinformatics/btt254
Chinese Library Classification number
Q5 [Biochemistry];
Subject classification codes
071010 ; 081704 ;
Abstract
In the context of metagenomics, we introduce a new approach to protein database search called PAUDA, which runs approximately 10 000 times faster than BLASTX, while achieving about one-third of the assignment rate of reads to KEGG orthology groups, and producing gene and taxon abundance profiles that are highly correlated with those obtained with BLASTX. PAUDA requires <80 CPU hours to analyze a dataset of 246 million Illumina DNA reads from permafrost soil, for which a previous BLASTX analysis (on a subset of 176 million reads) reportedly required 800 000 CPU hours, and it leads to the same clustering of samples by functional profiles.
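The two runtimes quoted in the abstract refer to different read counts, so a per-read comparison may be a fairer way to read them. The following is a minimal sketch of that arithmetic, using only the figures stated above; the function name is hypothetical, and the PAUDA figure is treated as exactly 80 CPU hours even though the abstract gives it as an upper bound (<80), so the result is a lower bound on the speedup.

```python
def cpu_hours_per_read(cpu_hours: float, reads: float) -> float:
    """Average CPU hours spent per read (hypothetical helper for illustration)."""
    return cpu_hours / reads


# Figures as quoted in the abstract.
blastx_per_read = cpu_hours_per_read(800_000, 176e6)  # BLASTX on the 176M-read subset
pauda_per_read = cpu_hours_per_read(80, 246e6)        # PAUDA on all 246M reads (upper bound)

# Per-read speedup implied by the quoted numbers (a lower bound, since PAUDA needed <80 h).
speedup = blastx_per_read / pauda_per_read
print(f"Implied per-read speedup: at least {speedup:,.0f}x")
```

Under these assumptions the implied per-read speedup is roughly 14 000-fold, which is the same order of magnitude as the approximately 10 000-fold figure stated in the abstract.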
Pages: 38-39
Page count: 2