Automatic detection of conserved gene clusters in multiple genomes by graph comparison and P-quasi grouping

Cited by: 37
Authors
Fujibuchi, W
Ogata, H
Matsuda, H
Kanehisa, M [1]
Affiliations
[1] Kyoto Univ, Inst Chem Res, Kyoto 6110011, Japan
[2] Osaka Univ, Grad Sch Engn Sci, Osaka 5608531, Japan
DOI
10.1093/nar/28.20.4029
Chinese Library Classification (CLC) number
Q5 [Biochemistry]; Q7 [Molecular Biology]
Subject classification code
071010; 081704
Abstract
We previously reported two graph algorithms for analysis of genomic information: a graph comparison algorithm to detect locally similar regions called correlated clusters and an algorithm to find a graph feature called P-quasi complete linkage. Based on these algorithms we have developed an automatic procedure to detect conserved gene clusters and align orthologous gene orders in multiple genomes. In the first step, the graph comparison is applied to pairwise genome comparisons, where the genome is considered as a one-dimensionally connected graph with genes as its nodes, and correlated clusters of genes that share sequence similarities are identified. In the next step, the P-quasi complete linkage analysis is applied to grouping of related clusters and conserved gene clusters in multiple genomes are identified. In the last step, orthologous relations of genes are established among each conserved cluster. We analyzed 17 completely sequenced microbial genomes and obtained 2313 clusters when the completeness parameter P was 40%. About one quarter contained at least two genes that appeared in the metabolic and regulatory pathways in the KEGG database. This collection of conserved gene clusters is used to refine and augment ortholog group tables in KEGG and also to define ortholog identifiers as an extension of EC numbers.
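The grouping step can be illustrated with a small sketch. The abstract does not define P-quasi complete linkage, so the code below assumes one common reading: a group qualifies when every member is linked to at least a fraction P of the other members. The function names, the greedy growth strategy, and the toy link set are all hypothetical and are not taken from the paper; the authors' actual procedure may differ.

```python
def is_p_quasi_complete(members, links, p):
    """Assumed definition: every member is linked to at least
    a fraction p of the other members of the group."""
    members = set(members)
    if len(members) < 2:
        return True
    needed = p * (len(members) - 1)
    for m in members:
        degree = sum(
            1 for other in members
            if other != m and frozenset((m, other)) in links
        )
        if degree < needed:
            return False
    return True


def greedy_groups(nodes, links, p):
    """Hypothetical greedy grouping: grow each group by adding any
    linked candidate that keeps the group p-quasi complete."""
    unassigned = list(nodes)
    groups = []
    while unassigned:
        group = [unassigned.pop(0)]
        changed = True
        while changed:
            changed = False
            for cand in list(unassigned):
                # Require at least one link into the group so that
                # unrelated nodes are never merged, even at low p.
                touches = any(frozenset((cand, g)) in links for g in group)
                if touches and is_p_quasi_complete(group + [cand], links, p):
                    group.append(cand)
                    unassigned.remove(cand)
                    changed = True
        groups.append(group)
    return groups


if __name__ == "__main__":
    # Toy similarity links between five hypothetical clusters.
    links = {frozenset(pair) for pair in
             [("a", "b"), ("b", "c"), ("a", "c"), ("c", "d")]}
    print(greedy_groups(list("abcde"), links, 0.4))
    print(greedy_groups(list("abcde"), links, 0.3))
```

In this toy example, lowering P from 0.4 to 0.3 lets the loosely attached node `d` join the `a`, `b`, `c` group, which shows how the completeness parameter trades group size against internal connectivity.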
Pages: 4029-4036
Number of pages: 8