Whole genome assembly of a natto production strain Bacillus subtilis natto from very short read data

Cited by: 80
Authors
Nishito, Yukari [1]
Osana, Yasunori [2]
Hachiya, Tsuyoshi [1]
Popendorf, Kris [1]
Toyoda, Atsushi [3]
Fujiyama, Asao [4]
Itaya, Mitsuhiro [5]
Sakakibara, Yasubumi [1]
Affiliations
[1] Keio Univ, Dept Biosci & Informat, Kohoku Ku, Yokohama, Kanagawa 223, Japan
[2] Seikei Univ, Dept Comp & Informat Sci, Tokyo, Japan
[3] Natl Inst Genet, Ctr Genet Resource Informat, Shizuoka, Japan
[4] Natl Inst Informat, Principles Informat Res Div, Tokyo, Japan
[5] Keio Univ, Inst Adv Biosci, Tokyo, Japan
Source
BMC GENOMICS | 2010 / Vol. 11
Funding
Japan Science and Technology Agency;
Keywords
SEQUENCE; GENES; DNA; IDENTIFICATION; MECHANISM; PLASMIDS; ACID;
DOI
10.1186/1471-2164-11-243
Chinese Library Classification number
Q81 [Bioengineering (Biotechnology)]; Q93 [Microbiology];
Discipline classification codes
071005 ; 0836 ; 090102 ; 100705 ;
Abstract
Background: Bacillus subtilis natto is closely related to the laboratory standard strain B. subtilis Marburg 168, and functions as a starter for the production of the traditional Japanese food "natto" made from soybeans. Although resequencing whole genomes of several laboratory domesticated B. subtilis 168 derivatives has already been attempted using short read sequencing data, the assembly of the whole genome sequence of a closely related strain, B. subtilis natto, from very short read data is more challenging, particularly with our aim to assemble one fully connected scaffold from short reads around 35 bp in length.
Results: We applied a comparative genome assembly method, which combines de novo assembly and reference guided assembly, to one of the B. subtilis natto strains. We successfully assembled 28 scaffolds and managed to avoid substantial fragmentation. Completion of the assembly through long PCR experiments resulted in one connected scaffold for B. subtilis natto. Based on the assembled genome sequence, our orthologous gene analysis between natto BEST195 and Marburg 168 revealed that 82.4% of 4375 predicted genes in BEST195 are one-to-one orthologous to genes in 168, with two genes in-paralog, 3.2% are deleted in 168, 14.3% are inserted in BEST195, and 5.9% of genes present in 168 are deleted in BEST195. The natto genome contains the same alleles in the promoter region of degQ and the coding region of swrAA as the wild strain, RO-FF-1. These are specific for gamma-PGA production ability, which is related to natto production. Further, the B. subtilis natto strain completely lacks a polyketide synthesis operon, has a disrupted plipastatin production operon, and possesses previously unidentified transposases.
Conclusions: The determination of the whole genome sequence of Bacillus subtilis natto provided detailed analyses of a set of genes related to natto production, demonstrating the number and locations of insertion sequences that B. subtilis natto harbors but B. subtilis 168 lacks. Multiple genome-level comparisons among five closely related Bacillus species were also carried out. The determined genome sequence of B. subtilis natto and gene annotations are available from the Natto genome browser http://natto-genome.org/.
Pages: 12