Characterization and modeling of the Haemophilus influenzae core and supragenomes based on the complete genomic sequences of Rd and 12 clinical nontypeable strains

Cited by: 171
Authors
Hogg, Justin S.
Hu, Fen Z. [1]
Janto, Benjamin
Boissy, Robert
Hayes, Jay
Keefe, Randy
Post, J. Christopher
Ehrlich, Garth D.
Affiliations
[1] Allegheny Gen Hosp, Allegheny Singer Res Inst, Ctr Genom Sci, Pittsburgh, PA 15212 USA
[2] Univ Pittsburgh, Joint Carnegie Mellon Univ, PhD Program Computat Biol, Pittsburgh, PA 15260 USA
DOI
10.1186/gb-2007-8-6-r103
Chinese Library Classification (CLC) number
Q81 [Bioengineering (Biotechnology)]; Q93 [Microbiology]
Discipline classification codes
071005; 0836; 090102; 100705
Abstract
Background: The distributed genome hypothesis (DGH) posits that chronic bacterial pathogens utilize polyclonal infection and reassortment of genic characters to ensure persistence in the face of adaptive host defenses. Studies based on random sequencing of multiple strain libraries suggested that free-living bacterial species possess a supragenome that is much larger than the genome of any single bacterium.

Results: We derived high depth genomic coverage of nine nontypeable Haemophilus influenzae (NTHi) clinical isolates, bringing to 13 the number of sequenced NTHi genomes. Clustering identified 2,786 genes, of which 1,461 were common to all strains, with each of the remaining 1,328 found in a subset of strains; the number of clusters ranged from 1,686 to 1,878 per strain. Genic differences of between 96 and 585 were identified per strain pair. Comparisons of each of the NTHi strains with the Rd strain revealed between 107 and 158 insertions and 100 and 213 deletions per genome. The mean insertion and deletion sizes were 1,356 and 1,020 base-pairs, respectively, with mean maximum insertions and deletions of 26,977 and 37,299 base-pairs. This relatively large number of small rearrangements among strains is in keeping with what is known about the transformation mechanisms in this naturally competent pathogen.

Conclusion: A finite supragenome model was developed to explain the distribution of genes among strains. The model predicts that the NTHi supragenome contains between 4,425 and 6,052 genes with most uncertainty regarding the number of rare genes, those that have a frequency of < 0.1 among strains; collectively, these results support the DGH.
Pages: 18