The frequency distribution of gene family sizes in complete genomes

Cited by: 152
Authors
Huynen, MA
van Nimwegen, E
Affiliations
[1] European Mol Biol Lab, D-69117 Heidelberg, Germany
[2] Santa Fe Inst, Santa Fe, NM 87501 USA
[3] Max Delbruck Ctr Mol Med, Berlin, Germany
Keywords
gene family; comparative genome analysis; power law distribution
DOI
10.1093/oxfordjournals.molbev.a025959
Chinese Library Classification (CLC) number
Q5 [Biochemistry]; Q7 [Molecular Biology]
Subject classification code
071010; 081704
Abstract
We compare the frequency distribution of gene family sizes in the complete genomes of six bacteria (Escherichia coli, Haemophilus influenzae, Helicobacter pylori, Mycoplasma genitalium, Mycoplasma pneumoniae, and Synechocystis sp. PCC6803), two Archaea (Methanococcus jannaschii and Methanobacterium thermoautotrophicum), one eukaryote (Saccharomyces cerevisiae), the vaccinia virus, and the bacteriophage T4. The sizes of the gene families versus their frequencies show power-law distributions that tend to become flatter (have a larger exponent) as the number of genes in the genome increases. Power-law distributions generally occur as the limit distribution of a multiplicative stochastic process with a boundary constraint. We discuss various models that can account for a multiplicative process determining the sizes of gene families in the genome. In particular, we argue that, in order to explain the observed distributions, gene families have to behave in a coherent fashion within the genome; i.e., the probabilities of duplications of genes within a gene family are not independent of each other. Likewise, the probabilities of deletions of genes within a gene family are not independent of each other.
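The claim that a multiplicative stochastic process with a boundary constraint yields a power law can be illustrated with a small simulation. The sketch below is a generic toy model, not the authors' model: sizes are continuous rather than integer gene counts, and the function names, the Gaussian log-steps, and every parameter value (`mu`, `sigma`, `s_min`, the cutoff) are assumptions chosen for illustration. Under these assumptions the tail exponent of the stationary distribution is expected to be close to -2*mu/sigma**2.

```python
import math
import random


def simulate_sizes(n_families=3000, n_steps=400, mu=-0.2,
                   sigma=math.sqrt(0.4), s_min=1.0, seed=0):
    """Evolve each family size by random multiplicative steps.

    Each step multiplies the size by exp(g), g ~ Normal(mu, sigma), and then
    clamps the result at s_min (the boundary constraint). With mu < 0 the
    sizes drift toward the boundary and settle into a stationary distribution.
    All parameter values are illustrative assumptions.
    """
    rng = random.Random(seed)
    sizes = []
    for _ in range(n_families):
        s = s_min
        for _ in range(n_steps):
            s = max(s_min, s * math.exp(rng.gauss(mu, sigma)))
        sizes.append(s)
    return sizes


def tail_exponent(sizes, cutoff):
    """Maximum-likelihood (Hill) estimate of alpha in P(size > s) ~ s**(-alpha),
    using only the sizes strictly above `cutoff`."""
    logs = [math.log(s / cutoff) for s in sizes if s > cutoff]
    return len(logs) / sum(logs)


if __name__ == "__main__":
    sizes = simulate_sizes()
    alpha = tail_exponent(sizes, cutoff=math.e)
    # For Gaussian log-steps the predicted tail exponent is -2*mu/sigma**2.
    predicted = -2 * (-0.2) / 0.4
    print(f"estimated alpha = {alpha:.2f}, predicted = {predicted:.2f}")
```

Here `alpha` describes the cumulative tail; the corresponding frequency (density) distribution falls off as s**(-(alpha + 1)). Removing the `max(s_min, ...)` clamp turns the process into an unconstrained multiplicative walk, whose sizes follow a log-normal distribution rather than a power law.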
Pages: 583-589
Page count: 7