Comparative genomics using data mining tools

Cited by: 8
Authors
Nandi, T [1]
B-Rao, C [1]
Ramachandran, S [1]
Affiliations
[1] Ctr Biochem Technol, Funct Genom Unit, Delhi 110007, India
Keywords
comparative genomics; compositional analysis; data mining; sequence complexity
DOI
10.1007/BF02703680
Chinese Library Classification
Q [Biological Sciences]
Subject classification codes
07; 0710; 09
Abstract
We have analysed the genomes of representatives of three kingdoms of life, namely archaea, eubacteria and eukaryota, using data mining tools based on compositional analyses of the protein sequences. The representatives chosen in this analysis were Methanococcus jannaschii, Haemophilus influenzae and Saccharomyces cerevisiae. We have identified the common and different features between the three genomes in the protein evolution patterns. M. jannaschii has been seen to have a greater number of proteins with more charged amino acids, whereas S. cerevisiae has been observed to have a greater number of hydrophilic proteins. Despite the differences in intrinsic compositional characteristics between the proteins from the different genomes, we have also identified certain common characteristics. We have carried out exploratory Principal Component Analysis of the multivariate data on the proteins of each organism in an effort to classify the proteins into clusters. Interestingly, we found that most of the proteins in each organism cluster closely together, but there are a few 'outliers'. We focus on the outliers for the functional investigations, which may aid in revealing any unique features of the biology of the respective organisms.
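The workflow summarised in the abstract (per-protein compositional features, exploratory PCA, then attention to the outliers) can be illustrated with a minimal sketch. This is not the authors' pipeline: the choice of the 20 amino-acid fractions as features, the use of two principal components, the mean-plus-two-standard-deviations cutoff, and the function names `composition`, `pca_scores` and `find_outliers` are all assumptions made here for illustration.

```python
import numpy as np

# The 20 standard amino acids, one-letter codes (feature order is arbitrary).
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def composition(seq):
    """Return the fraction of each standard amino acid in a protein sequence."""
    seq = seq.upper()
    counts = np.array([seq.count(a) for a in AMINO_ACIDS], dtype=float)
    total = counts.sum()
    return counts / total if total > 0 else counts


def pca_scores(X, n_components=2):
    """Project the rows of X onto the leading principal components (via SVD)."""
    Xc = X - X.mean(axis=0)                      # centre each feature
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
    return Xc @ Vt[:n_components].T              # scores in PC space


def find_outliers(scores, z=2.0):
    """Indices of points far from the centroid in PC space.

    The cutoff (mean distance + z standard deviations) is a hypothetical
    rule of thumb, not a threshold taken from the paper.
    """
    dist = np.linalg.norm(scores - scores.mean(axis=0), axis=1)
    cutoff = dist.mean() + z * dist.std()
    return np.where(dist > cutoff)[0]
```

For a real proteome, each row of the feature matrix would come from `composition` applied to one protein sequence; proteins flagged by `find_outliers` would then be candidates for the kind of functional follow-up the abstract describes.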
Pages: 15-25
Page count: 11