Sequence variation in ligand binding sites in proteins

Cited by: 66
Authors
Magliery, TJ
Regan, L
Affiliations
[1] Yale Univ, Dept Mol Biophys & Biochem, New Haven, CT 06520 USA
[2] Yale Univ, Dept Chem, New Haven, CT USA
Keywords
DOI
10.1186/1471-2105-6-240
Chinese Library Classification (CLC) number
Q5 [Biochemistry];
Discipline classification code
071010 ; 081704 ;
Abstract
Background: The recent explosion in the availability of complete genome sequences has led to the cataloging of tens of thousands of new proteins and putative proteins. Many of these proteins can be structurally or functionally categorized from sequence conservation alone. In contrast, little attention has been given to the meaning of poorly conserved sites in families of proteins, which are typically assumed to be of little structural or functional importance.
Results: Recently, using statistical free energy analysis of tetratricopeptide repeat (TPR) domains, we observed that positions in contact with peptide ligands are more variable than surface positions in general. Here we show that statistical analysis of TPRs, ankyrin repeats, Cys2His2 zinc fingers and PDZ domains accurately identifies specificity-determining positions by their sequence variation. Sequence variation is measured as deviation from a neutral reference state, and we present probabilistic and information theory formalisms that improve upon recently suggested methods such as statistical free energies and sequence entropies.
Conclusion: Sequence variation has been used to identify functionally important residues in four selected protein families. With TPRs and ankyrin repeats, protein families that bind highly diverse ligands, the effect is so pronounced that sequence "hypervariation" alone can be used to predict ligand binding sites.
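The abstract describes measuring variation at each position as deviation from a neutral reference state using information theory, but it does not give the formula. One common way to express such a deviation is the relative entropy (Kullback-Leibler divergence) of an alignment column against a background distribution. The sketch below is illustrative only and is not taken from the paper: the uniform background, the function names `relative_entropy` and `rank_positions`, and the toy alignment are all assumptions.

```python
import math
from collections import Counter

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"

# Assumption: a uniform reference state, chosen only for illustration.
# A real analysis would likely use empirically derived background frequencies.
UNIFORM_BACKGROUND = {aa: 1.0 / len(AMINO_ACIDS) for aa in AMINO_ACIDS}


def relative_entropy(column, background=UNIFORM_BACKGROUND):
    """Kullback-Leibler divergence (in nats) of one alignment column from the background."""
    # Gaps and unknown symbols are skipped rather than scored.
    residues = [r for r in column.upper() if r in background]
    if not residues:
        raise ValueError("column contains no scorable residues")
    counts = Counter(residues)
    total = len(residues)
    return sum(
        (n / total) * math.log((n / total) / background[aa])
        for aa, n in counts.items()
    )


def rank_positions(alignment, background=UNIFORM_BACKGROUND):
    """Return (position, score) pairs sorted from lowest to highest divergence."""
    length = len(alignment[0])
    if any(len(seq) != length for seq in alignment):
        raise ValueError("aligned sequences must have equal length")
    scores = [
        (i, relative_entropy("".join(seq[i] for seq in alignment), background))
        for i in range(length)
    ]
    return sorted(scores, key=lambda pair: pair[1])


# Hypothetical toy alignment: position 0 is invariant, positions 1 and 2 vary.
toy_alignment = ["ACD", "AEF", "AGH", "AKL"]
for position, score in rank_positions(toy_alignment):
    print(position, round(score, 3))
```

Under this assumed scoring, a column whose residue distribution stays close to the reference scores near zero, while a strongly conserved column scores high. Positions at the low end of the ranking would be the candidates for the kind of variation the abstract discusses.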
Pages: 11