A supervised learning approach for taxonomic classification of core-photosystem-II genes and transcripts in the marine environment

Cited by: 17
Authors
Tzahor, Shani [1, 2]
Man-Aharonovich, Dikla [1 ]
Kirkup, Benjamin C. [3 ]
Yogev, Tali [4 ]
Berman-Frank, Ilana [4 ]
Polz, Martin F. [3 ]
Beja, Oded [1 ]
Mandel-Gutfreund, Yael [1 ]
Affiliations
[1] Technion Israel Inst Technol, Fac Biol, IL-32000 Haifa, Israel
[2] Technion Israel Inst Technol, Inter Departmental Program Biotechnol, IL-32000 Haifa, Israel
[3] MIT, Dept Civil & Environm Engn, Cambridge, MA 02139 USA
[4] Bar Ilan Univ, Fac Life Sci, IL-52900 Ramat Gan, Israel
Source
BMC GENOMICS | 2009 / Vol. 10
Funding
US National Science Foundation;
Keywords
PHYLOGENETIC CLASSIFICATION; PHOTOSYNTHESIS GENES; BACTERIAL GENOMES; DNA; PROCHLOROCOCCUS; SYNECHOCOCCUS; ULTRAPHYTOPLANKTON; METAGENOMICS; EVOLUTIONARY; ORGANIZATION;
DOI
10.1186/1471-2164-10-229
Chinese Library Classification (CLC) number
Q81 [Bioengineering (Biotechnology)]; Q93 [Microbiology];
Discipline classification codes
071005; 0836; 090102; 100705;
Abstract
Background: Cyanobacteria of the genera Synechococcus and Prochlorococcus play a key role in marine photosynthesis, which contributes to the global carbon cycle and to the world oxygen supply. Recently, genes encoding the photosystem II reaction center (psbA and psbD) were found in cyanophage genomes. This phenomenon suggested that the horizontal transfer of these genes may be involved in increasing phage fitness. To date, a very small percentage of marine bacteria and phages has been cultured. Thus, mapping genomic data extracted directly from the environment to its taxonomic origin is necessary for a better understanding of phage-host relationships and dynamics.
Results: To achieve an accurate and rapid taxonomic classification, we employed a computational approach combining a multi-class Support Vector Machine (SVM) with a codon usage position specific scoring matrix (cuPSSM). Our method has been applied successfully to classify core-photosystem-II gene fragments, including partial sequences coming directly from the ocean, to seven different taxonomic classes. Applying the method on a large set of DNA and RNA psbA clones from the Mediterranean Sea, we studied the distribution of cyanobacterial psbA genes and transcripts in their natural environment. Using our approach, we were able to simultaneously examine taxonomic and ecological distributions in the marine environment.
Conclusion: The ability to accurately classify the origin of individual genes and transcripts coming directly from the environment is of great importance in studying marine ecology. The classification method presented in this paper could be applied further to classify other genes amplified from the environment, for which training data is available.
Pages: 14