More than 9,000,000 Unique Genes in Human Gut Bacterial Community: Estimating Gene Numbers Inside a Human Body

Cited by: 106
Authors
Yang, Xing
Xie, Lu
Li, Yixue
Wei, Chaochun
Affiliations
[1] Shanghai Center for Bioinformation Technology, Shanghai
[2] School of Life Science and Technology, Tongji University, Shanghai
[3] Department of Bioinformatics and Biostatistics, School of Life Sciences and Biotechnology, Shanghai Jiao Tong University, Shanghai
[4] Bioinformation Center, Shanghai Institutes for Biological Sciences, Chinese Academy of Sciences, Shanghai
[5] Lab of Molecular Microbial Ecology and Ecogenomics, School of Life Sciences and Biotechnology, Shanghai Jiao Tong University, Shanghai
Source
PLOS ONE | 2009 / Volume 4 / Issue 06
Keywords
MICROBIOME; OBESITY; DIVERSITY; DIET
DOI
10.1371/journal.pone.0006074
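Chinese Library Classification (CLC) number
O [Mathematical Sciences and Chemistry]; P [Astronomy and Earth Sciences]; Q [Biological Sciences]; N [General Natural Sciences]
Discipline classification code
07; 0710; 09
Abstract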
Background: Estimating the number of genes in the human genome has long been an important problem in computational biology. With the newer view of the human as a super-organism, it is also of interest to estimate the number of genes in this human super-organism. Principal Findings: We present an estimate of the number of genes in the human gut bacterial community, the largest microbial community inside the human super-organism. We obtained 552,700 unique genes from 202 complete human gut bacterial genomes. We then built a novel gene counting model that estimates the total number of genes by combining culture-independent sequence data with those complete genomes. 16S rRNA sequences were used to construct a three-level tree, and a different counting method was introduced for each level: strain-to-species, species-to-genus, and genus-and-up. The model estimates the total number of genes to be about 9,000,000 after genes with an identity of 97% or higher are merged. Conclusion: By combining the complete genomes currently available with culture-independent sequencing data, we built a model to estimate the number of genes in the human gut bacterial community. The total is estimated at about 9 million. Although this number is large, we believe it is an underestimate. This is an initial step toward solving the gene counting problem for the human super-organism, which is likely to remain an open problem in the near future. The list of genomes used in this paper can be found in the supplementary table.
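The merging step mentioned in the abstract (genes with an identity of 97% or higher counted as one) can be illustrated with a minimal sketch. This is not the authors' pipeline, which the abstract does not describe in detail. The function names `identity` and `count_unique_genes` are hypothetical, and the identity measure is a toy, ungapped position-by-position comparison rather than a real sequence alignment.

```python
from typing import Iterable, List


def identity(a: str, b: str) -> float:
    """Toy identity: matching positions divided by the longer length.

    Assumption for illustration only: no gaps or alignment are considered.
    """
    if not a or not b:
        return 0.0
    matches = sum(x == y for x, y in zip(a, b))
    return matches / max(len(a), len(b))


def count_unique_genes(genes: Iterable[str], threshold: float = 0.97) -> int:
    """Greedy merge: a gene is new only if no kept representative
    reaches the identity threshold against it."""
    representatives: List[str] = []
    for gene in genes:
        if not any(identity(gene, rep) >= threshold for rep in representatives):
            representatives.append(gene)
    return len(representatives)


# Toy data: a 100-base sequence, a near copy (2 mismatches),
# and a more distant copy (5 mismatches).
base = "ACGT" * 25
near = "NN" + base[2:]
far = "NNNNN" + base[5:]

print(count_unique_genes([base, near, far]))
```

In this toy input the near copy falls at or above the 97% threshold and is merged into the first sequence, while the more distant copy falls below it and is counted separately. A greedy scheme like this depends on input order; it is shown only to make the thresholding idea concrete.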
Pages: 8