PhyloSift: phylogenetic analysis of genomes and metagenomes

Cited by: 430
Authors
Darling, Aaron E. [1 ,2 ]
Jospin, Guillaume [2 ]
Lowe, Eric [2 ]
Matsen, Frederick A., IV [5 ]
Bik, Holly M. [2 ]
Eisen, Jonathan A. [3 ,4 ]
Affiliations
[1] Univ Technol Sydney, Inst I3, Sydney, NSW 2007, Australia
[2] Univ Calif Davis, Genome Ctr, Davis, CA 95616 USA
[3] Univ Calif Davis, Dept Ecol & Evolut, Davis, CA 95616 USA
[4] Univ Calif Davis, Dept Med Microbiol & Immunol, Davis, CA 95616 USA
[5] Fred Hutchinson Canc Res Ctr, Seattle, WA 98104 USA
Source
PeerJ | 2014 / Volume 2
Keywords
Metagenomics; Phylogenetics; Forensics; Bayes factor; Microbial diversity; Community structure; Microbial ecology; Edge PCA; Phylogenetic diversity; Microbial evolution; MICROBIAL COMMUNITIES; MARKER GENES; SEQUENCES; BACTERIAL; DIVERSITY; CLASSIFICATION; ALIGNMENTS; ASSIGNMENT; ALGORITHM; ARCHAEA;
DOI
10.7717/peerj.243
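Chinese Library Classification (CLC)
O [Mathematical sciences and chemistry]; P [Astronomy and earth sciences]; Q [Biological sciences]; N [General natural sciences]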
Discipline classification codes
07; 0710; 09
Abstract
Like all organisms on the planet, environmental microbes are subject to the forces of molecular evolution. Metagenomic sequencing provides a means to access the DNA sequence of uncultured microbes. By combining DNA sequencing of microbial communities with evolutionary modeling and phylogenetic analysis we might obtain new insights into microbiology and also provide a basis for practical tools such as forensic pathogen detection. In this work we present an approach to leverage phylogenetic analysis of metagenomic sequence data to conduct several types of analysis. First, we present a method to conduct phylogeny-driven Bayesian hypothesis tests for the presence of an organism in a sample. Second, we present a means to compare community structure across a collection of many samples and develop direct associations between the abundance of certain organisms and sample metadata. Third, we apply new tools to analyze the phylogenetic diversity of microbial communities and again demonstrate how this can be associated to sample metadata. These analyses are implemented in an open source software pipeline called PhyloSift. As a pipeline, PhyloSift incorporates several other programs including LAST, HMMER, and pplacer to automate phylogenetic analysis of protein coding and RNA sequences in metagenomic datasets generated by modern sequencing platforms (e.g., Illumina, 454).
Pages: 28