In silico structural and functional analysis of the human cytomegalovirus (HHV5) genome

Cited by: 38
Authors
Novotny, J
Rigoutsos, I
Coleman, D
Shenk, T
Affiliations
[1] Victor Chang Cardiac Res Inst, Darlinghurst, NSW 2010, Australia
[2] Princeton Univ, Dept Mol Biol, Princeton, NJ 08544 USA
[3] IBM Corp, Div Res, TJ Watson Res Ctr, Bioinformat & Pattern Discovery Res Grp, Yorktown Hts, NY 10598 USA
Keywords
threading; cytomegalovirus; structural genomics; protein folds; stereochemical code
DOI
10.1006/jmbi.2001.4798
Chinese Library Classification (CLC) number
Q5 [Biochemistry]; Q7 [Molecular Biology]
Discipline classification code
071010; 081704
Abstract
The open reading frames of human cytomegalovirus (human herpesvirus-5, HHV5) encode some 213 unique proteins with mostly unknown functions. Using the threading program ProCeryon, we calculated possible matches between the amino acid sequences of these proteins and the Protein Data Bank library of three-dimensional structures. Thirty-six proteins were fully identified in terms of their structure and, often, function; 65 proteins were recognized as members of narrow structural/functional families (e.g. DNA-binding factors, cytokines, enzymes, signaling particles, cell surface receptors etc.); and 87 proteins were assigned to broad structural classes (e.g. all-β, 3-layer αβα, multidomain, etc.). Genes encoding proteins with similar folds, or containing identical structural traits (extreme sequence length, runs of unstructured (Pro and/or Gly-rich) residues, transmembrane segments, etc.) often formed tandem clusters throughout the genome. In the course of this work, benchmarks on about 20 known folds were used to optimize adjustable parameters of threading calculations, i.e. gap penalty weights used in sequence/structure alignments; new scores obtained as simple combinations of existing scoring functions; and number of threading runs conducive to meaningful results. An introduction of summed, per-residue-normalized scores has been essential for discovery of subdomains (EGF-like, SH2, SH3) in longer protein sequences, such as the eight "open sandwich" cytokine domains, 60-70 amino acids long and having the 3β1α fold with one or two disulfide bridges, present in otherwise unrelated proteins. © 2001 Academic Press.
Pages: 1151-1166
Page count: 16