Structural clusters of evolutionary trace residues are statistically significant and common in proteins

Cited by: 159
Authors
Madabushi, S
Yao, H
Marsh, M
Kristensen, DM
Philippi, A
Sowa, ME
Lichtarge, O
Affiliations
[1] Baylor Coll Med, Struct & Computat Biol & Mol Biophys Program, Houston, TX 77030 USA
[2] Baylor Coll Med, Dept Mol & Human Genet, Houston, TX 77030 USA
[3] Baylor Coll Med, Dept Biochem & Mol Biol, Houston, TX 77030 USA
Funding
US National Science Foundation;
Keywords
bioinformatics; structural genomics; protein interaction; active site evolution; ligand binding;
DOI
10.1006/jmbi.2001.5327
Chinese Library Classification (CLC) number
Q5 [Biochemistry]; Q7 [Molecular Biology];
Discipline classification codes
071010 ; 081704 ;
Abstract
Given the massive increase in the number of new sequences and structures, a critical problem is how to integrate these raw data into meaningful biological information. One approach, the Evolutionary Trace, or ET, uses phylogenetic information to rank the residues in a protein sequence by evolutionary importance and then maps those ranked at the top onto a representative structure. If these residues form structural clusters, they can identify functional surfaces such as those involved in molecular recognition. Now that a number of examples have shown that ET can identify binding sites and focus mutational studies on their relevant functional determinants, we ask whether the method can be improved so as to be applicable on a large scale. To address this question, we introduce a new treatment of gaps resulting from insertions and deletions, which streamlines the selection of sequences used as input. We also introduce objective statistics to assess the significance of the total number of clusters and of the size of the largest one. As a result of the novel treatment of gaps, ET performance improves measurably. We find evolutionarily privileged clusters that are significant at the 5% level in 45 out of 46 (98%) proteins drawn from a variety of structural classes and biological functions. In 37 of the 38 proteins for which a protein-ligand complex is available, the dominant cluster contacts the ligand. We conclude that spatial clustering of evolutionarily important residues is a general phenomenon, consistent with the cooperative nature of residues that determine structure and function. In practice, these results suggest that ET can be applied on a large scale to identify functional sites in a significant fraction of the structures in the protein databank (PDB). This approach to combining raw sequences and structure to obtain detailed insights into the molecular basis of function should prove valuable in the context of the Structural Genomics Initiative. 
(C) 2002 Elsevier Science Ltd.
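
The abstract's clustering test can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not specify their statistics, so the sketch substitutes a simple random-selection test. It assumes one coordinate per residue (for example a C-alpha position), a hypothetical contact cutoff of 4.0 Å, and that fewer clusters and a larger dominant cluster both indicate stronger clustering. All function names are illustrative.

```python
import math
import random


def clusters(coords, selected, cutoff=4.0):
    """Group selected residue indices into connected components.

    Two residues are linked when their coordinates lie within `cutoff`
    (an assumed threshold, not a value taken from the paper).
    """
    selected = list(selected)
    parent = {i: i for i in selected}

    def find(i):
        # Union-find lookup with path halving.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for pos, a in enumerate(selected):
        for b in selected[pos + 1:]:
            if math.dist(coords[a], coords[b]) <= cutoff:
                parent[find(a)] = find(b)

    groups = {}
    for i in selected:
        groups.setdefault(find(i), []).append(i)
    return list(groups.values())


def cluster_stats(coords, selected, cutoff=4.0):
    """Return (number of clusters, size of the largest cluster)."""
    groups = clusters(coords, selected, cutoff)
    return len(groups), max((len(g) for g in groups), default=0)


def empirical_p_values(coords, selected, cutoff=4.0, trials=1000, seed=0):
    """Compare the observed clustering with random residue selections.

    Returns two fractions of random draws (same number of residues) that
    are at least as clustered as the observed set: one by cluster count
    (as few or fewer clusters), one by largest-cluster size (as large or
    larger). This random-draw null model is an assumption of the sketch.
    """
    rng = random.Random(seed)
    n_obs, largest_obs = cluster_stats(coords, selected, cutoff)
    k = len(selected)
    as_few = as_large = 0
    for _ in range(trials):
        sample = rng.sample(range(len(coords)), k)
        n, largest = cluster_stats(coords, sample, cutoff)
        as_few += n <= n_obs
        as_large += largest >= largest_obs
    return as_few / trials, as_large / trials
```

In this sketch, a selection would be called significant at the 5% level when the corresponding fraction is at most 0.05. In practice the selected indices would be the top-ranked trace residues and the coordinates would come from the representative structure.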
Pages: 139-154
Page count: 16