Estimation of prokaryote genomic DNA G+C content by sequencing universally conserved genes

Cited by: 37
Authors
Fournier, Pierre-Edouard
Suhre, Karsten
Fournous, Ghislain
Raoult, Didier
Affiliations
[1] CNRS, UPR2589, F-13288 Marseille 09, France
[2] Univ Mediterranee, Fac Med, CNRS, Unite Rickettsies, IFR 48, UMR 6020, F-13385 Marseille 05, France
DOI
10.1099/ijs.0.63903-0
Chinese Library Classification number
Q93 [Microbiology]
Subject classification codes
071005; 100705
Abstract
Determination of the DNA G+C content of prokaryotic genomes using traditional methods is time-consuming, and results may vary from laboratory to laboratory, depending on the technique used. We explored the possibility of extrapolating the genomic DNA G+C content of prokaryotes from gene sequences. For this, 127 universally conserved genes were studied from 50 prokaryotic genomes in the Clusters of Orthologous Groups database. Of these, 57 genes were present as a single copy in the genomes of 157 different prokaryote species available in GenBank. There was a strong correlation [coefficient of determination (r²) > 95%] between the DNA G+C contents of 20 genes and their corresponding genomes. For each of the 157 prokaryotic genomes studied, the DNA G+C content of the 20 genes was used to determine a 'calculated' genome DNA G+C content (CGC) and this value was compared with the 'real' genome DNA G+C content (RGC). In order to select the most suitable gene for the determination of CGC values, we compared the r² and median mol% difference between CGC and RGC as well as the sensitivity of each gene to provide CGC values for prokaryotic genomes that differ by less than 5 mol% from their RGC. The highly conserved ftsY gene (median size 1144 nucleotides), a vertically inherited member of the GTPase superfamily, showed the highest r² value of 0.98, the smallest median mol% difference between CGC and RGC of 1.06 and a sensitivity of 100%. Using ftsY DNA G+C content values, the CGC values of 100 genomes not included in the calculation of r² differed by less than 5 mol% from their RGC values. These data suggest that the genomic DNA G+C content of prokaryotes may be estimated easily and reliably from the ftsY gene sequence.
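The workflow described in the abstract (gene G+C content, a fitted relationship to genome G+C content, and a 5 mol% tolerance check) might be sketched as follows. This is only an illustration: the abstract does not state the model form, so an ordinary least-squares line is assumed here, and the function names `gc_content`, `fit_line`, `calculated_genome_gc` and `sensitivity` are hypothetical. The numbers in the usage lines are made-up toy values, not data from the study.

```python
def gc_content(seq: str) -> float:
    """Return the G+C content of a nucleotide sequence in mol%.

    Only unambiguous bases (A, C, G, T) are counted.
    """
    bases = [b for b in seq.upper() if b in "ACGT"]
    if not bases:
        raise ValueError("sequence contains no A, C, G or T")
    gc = sum(1 for b in bases if b in "GC")
    return 100.0 * gc / len(bases)


def fit_line(x, y):
    """Least-squares fit y = slope * x + intercept.

    Returns (slope, intercept, r2), where r2 is the coefficient
    of determination. Assumed model, not confirmed by the abstract.
    """
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    syy = sum((yi - mean_y) ** 2 for yi in y)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    r2 = (sxy * sxy) / (sxx * syy)
    return slope, intercept, r2


def calculated_genome_gc(gene_gc: float, slope: float, intercept: float) -> float:
    """Estimate the 'calculated' genome G+C content (CGC) from a gene's G+C content."""
    return slope * gene_gc + intercept


def sensitivity(cgc, rgc, tolerance: float = 5.0) -> float:
    """Percentage of genomes whose CGC differs from RGC by less than `tolerance` mol%."""
    hits = sum(1 for c, r in zip(cgc, rgc) if abs(c - r) < tolerance)
    return 100.0 * hits / len(rgc)


# Toy, made-up values for illustration only (not from the paper).
gene_gc = [30.0, 50.0, 70.0]      # hypothetical gene G+C contents (mol%)
genome_gc = [32.0, 50.0, 68.0]    # hypothetical genome G+C contents (mol%)
slope, intercept, r2 = fit_line(gene_gc, genome_gc)
cgc = [calculated_genome_gc(g, slope, intercept) for g in gene_gc]
```

In practice the fit would be made on one set of genomes and the tolerance check applied to genomes held out of the fit, as the abstract describes for the 100 additional genomes.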
Pages: 1025-1029
Number of pages: 5