Strainer: software for analysis of population variation in community genomic datasets

Cited by: 26
Authors
Eppley, John M.
Tyson, Gene W.
Getz, Wayne M.
Banfield, Jillian F. [1]
Affiliations
[1] Univ Calif Berkeley, Dept Environm Sci Policy & Management, Berkeley, CA 94720 USA
[2] Univ Calif Berkeley, Dept Bioengn, Berkeley, CA 94720 USA
[3] MIT, Dept Civil & Environm Engn, Cambridge, MA 02139 USA
Keywords
DOI
10.1186/1471-2105-8-398
Chinese Library Classification (CLC) number
Q5 [Biochemistry]
Discipline classification codes
071010; 081704
Abstract
Background: Metagenomic analyses of microbial communities that are comprehensive enough to provide multiple samples of most loci in the genomes of the dominant organism types will also reveal patterns of genetic variation within natural populations. New bioinformatic tools will enable visualization and comprehensive analysis of this sequence variation and inference of recent evolutionary and ecological processes.
Results: We have developed a software package for analysis and visualization of genetic variation in populations and reconstruction of strain variants from otherwise co-assembled sequences. Sequencing reads can be clustered by matching patterns of single nucleotide polymorphisms to generate predicted gene and protein variant sequences, identify conserved intergenic regulatory sequences, and determine the quantity and distribution of recombination events.
Conclusion: The Strainer software, a first generation metagenomic bioinformatics tool, facilitates comprehension and analysis of heterogeneity intrinsic in natural communities. The program reveals the degree of clustering among closely related sequence variants and provides a rapid means to generate gene and protein sequences for functional, ecological, and evolutionary analyses.
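The abstract's idea of clustering sequencing reads by matching patterns of single nucleotide polymorphisms can be illustrated with a minimal sketch. This is not Strainer's actual algorithm or interface, which the abstract does not describe. The read representation (a mapping from SNP position to base), the function names `compatible` and `cluster_reads`, the `min_shared` parameter, and the greedy single-pass strategy are all assumptions made for illustration.

```python
from typing import Dict, List

# Hypothetical representation: each read maps an alignment column
# (a SNP position) to the base the read carries at that position.
Read = Dict[int, str]


def compatible(pattern: Read, read: Read, min_shared: int = 1) -> bool:
    """True if the read shares at least min_shared SNP sites with the
    pattern and carries the same base at every shared site."""
    shared = pattern.keys() & read.keys()
    return len(shared) >= min_shared and all(pattern[p] == read[p] for p in shared)


def cluster_reads(reads: Dict[str, Read], min_shared: int = 1) -> List[dict]:
    """Greedy single-pass clustering (an assumed strategy): each read joins
    the first cluster whose accumulated SNP pattern it matches."""
    clusters: List[dict] = []
    for name, snps in reads.items():
        for cluster in clusters:
            if compatible(cluster["pattern"], snps, min_shared):
                cluster["members"].append(name)
                cluster["pattern"].update(snps)  # extend the variant pattern
                break
        else:
            # No existing cluster matched, so this read starts a new variant.
            clusters.append({"members": [name], "pattern": dict(snps)})
    return clusters


# Toy input: four overlapping reads covering three SNP positions.
reads = {
    "read1": {10: "A", 25: "G"},
    "read2": {25: "G", 40: "T"},
    "read3": {10: "C", 25: "T"},
    "read4": {25: "T", 40: "C"},
}
clusters = cluster_reads(reads)
for c in clusters:
    print(c["members"], c["pattern"])
```

In this toy input the reads fall into two groups, each of which could be read as one candidate strain variant. A greedy pass depends on read order and ignores sequencing error, so a real tool would need something more robust.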
Pages: 11