Species-specific protein sequence and fold optimizations

Cited by: 9
Authors
Dumontier, M
Michalickova, K
Hogue, CWV [1]
Affiliations
[1] Univ Toronto, Dept Biochem, Toronto, ON M5S 1A8, Canada
[2] Mt Sinai Hosp, Samuel Lunenfeld Res Inst, Toronto, ON M5G 1X5, Canada
DOI
10.1186/1471-2105-3-39
Chinese Library Classification number
Q5 [Biochemistry];
Discipline classification codes
071010; 081704;
Abstract
Background: An organism's ability to adapt to its particular environmental niche is of fundamental importance to its survival and proliferation. In the largest study of its kind, we sought to identify and exploit the amino acid signatures that make species-specific protein adaptation possible across 100 complete genomes.
Results: Correspondence analysis of the amino acid composition of over 360,000 predicted open reading frames (ORFs) from 17 archaeal, 76 bacterial and 7 eukaryotic complete genomes identified environmental niche as a significant factor in compositional variability. Additionally, clustering by amino acid composition grouped phylogenetically unrelated archaea and bacteria that share similar environments. Composition analyses of conservative, domain-based homology models suggested an enrichment of the small hydrophobic residues Ala, Gly and Val and the charged residues Asp, Glu, His and Arg across all genomes. However, the larger aromatic residues Phe, Trp and Tyr are reduced in folds, and these results were not affected by low-complexity biases. We derived two simple log-odds scoring functions for each of the complete genomes, one from ORFs (C-G) and one from folds (C-F). C-F achieved an average cross-validation success rate of 85 ± 8%, whereas C-G detected 73 ± 9% of species-specific sequences when competing against all other non-redundant C-G. Continuously updated results are available at http://genome.mshri.on.ca.
Conclusion: Our analysis of amino acid compositions from the complete genomes provides stronger evidence for species-specific and environmental residue preferences in genomic sequences as well as in folds. Scoring functions derived from this work will be useful in future protein engineering experiments and possibly in identifying horizontal transfer events.
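The abstract does not give the form of the C-G and C-F scoring functions, so the following is only a minimal sketch of one common way to build a composition-based log-odds score, not the authors' method. It assumes the score of a sequence is the mean of ln(f_genome(a) / f_background(a)) over its residues, with the background taken as the pooled composition of all genomes and a pseudocount of 1. The genome names, toy sequences, and the function names `composition`, `log_odds_table`, `score` and `assign` are all hypothetical.

```python
import math
from collections import Counter

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"

def composition(sequences, pseudocount=1.0):
    """Amino acid frequencies over a set of sequences (pseudocount is an assumption)."""
    counts = Counter()
    for seq in sequences:
        counts.update(ch for ch in seq.upper() if ch in AMINO_ACIDS)
    total = sum(counts[a] for a in AMINO_ACIDS) + pseudocount * len(AMINO_ACIDS)
    return {a: (counts[a] + pseudocount) / total for a in AMINO_ACIDS}

def log_odds_table(genome_freqs, background_freqs):
    """Per-residue log-odds of one genome's composition against a background."""
    return {a: math.log(genome_freqs[a] / background_freqs[a]) for a in AMINO_ACIDS}

def score(seq, table):
    """Mean log-odds over the residues of seq; non-standard symbols are skipped."""
    residues = [ch for ch in seq.upper() if ch in table]
    if not residues:
        return 0.0
    return sum(table[ch] for ch in residues) / len(residues)

def assign(seq, tables):
    """Name of the genome whose scoring table gives seq the highest score."""
    return max(tables, key=lambda name: score(seq, tables[name]))

# Toy data, invented for illustration only.
genomes = {
    "toy_A": ["AAAGGVVAEE", "AVGAEEAAGV"],
    "toy_B": ["KKKLLIINNK", "KLINKKLLIN"],
}
background = composition([s for seqs in genomes.values() for s in seqs])
tables = {
    name: log_odds_table(composition(seqs), background)
    for name, seqs in genomes.items()
}
best = assign("AAGVEA", tables)
```

Letting each genome's table compete for a query sequence mirrors the abstract's description of a scoring function "competing against all other" scoring functions. A real evaluation would hold the query sequences out of the composition estimate, as cross-validation requires.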
Pages: 15