Automatic methods for predicting functionally important residues

Cited by: 173
Authors
Mesa, AD [1]
Pazos, F [1]
Valencia, A [1]
Affiliations
[1] Natl Biotechnol Ctr, Prot Design Grp, Madrid 28049, Spain
Keywords
functional residue; tree-determinant position; bioinformatics; protein structure; protein function
DOI
10.1016/S0022-2836(02)01451-1
Chinese Library Classification number
Q5 [Biochemistry]; Q7 [Molecular Biology]
Discipline classification code
071010; 081704
Abstract
Sequence analysis is often the first guide for predicting which residues in a protein family may be functionally significant. A few methods have been proposed that use the division of protein families into subfamilies to search for positions that could be functionally significant for the whole family while also reflecting the specificity of each subfamily ("tree-determinant residues"). However, many questions remain unsolved, such as how best to divide a protein family into subfamilies and how to accurately detect the sequence variation patterns characteristic of different subfamilies. Here we present a systematic study on a significant number of protein families, testing the statistical significance of the tree-determinant residues predicted by three different methods that represent the range of available approaches. The first method takes a phylogenetic representation of a protein family as its starting point and, following the principle of relative entropy from information theory, automatically searches for the optimal division of the family into subfamilies. The second method looks for positions whose mutational behavior resembles that of the full-length proteins by directly comparing the corresponding distance matrices. The third method automates the analysis of the distribution of sequences and amino acid positions in the corresponding multidimensional spaces using a vector-based principal component analysis. The three methods were tested on two non-redundant lists of protein families: one composed of proteins that bind a variety of ligand groups, and the other composed of proteins with annotated functionally relevant sites. In most cases, the residues predicted by the three methods show a clear tendency to lie close to bound ligands of biological relevance and to amino acids described as participating in key aspects of protein function.
These three automatic methods give biologists a wide range of options for analyzing their families of interest, in a way similar to the analysis presented here for the family of proteins related to ras-p21. © 2003 Elsevier Science Ltd. All rights reserved.
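The idea behind the second method (comparing a per-position distance matrix with the distance matrix of the full-length sequences) can be illustrated with a minimal sketch. Everything below is an assumption made for illustration and is not taken from the paper: the toy alignment, the function names, the identity-based distances, and the use of a Pearson correlation as the comparison measure. The published method may use a different distance definition and a different correlation statistic.

```python
from itertools import combinations
from math import sqrt


def pairwise_distances(seqs, position=None):
    """Flattened upper triangle of a distance matrix over aligned sequences.

    position=None: fraction of mismatched columns (whole-sequence distance).
    position=i:    0.0 if the two residues at column i match, else 1.0.
    Both definitions are simplifying assumptions of this sketch.
    """
    dists = []
    for a, b in combinations(seqs, 2):
        if position is None:
            dists.append(sum(x != y for x, y in zip(a, b)) / len(a))
        else:
            dists.append(0.0 if a[position] == b[position] else 1.0)
    return dists


def pearson(xs, ys):
    """Pearson correlation; returns 0.0 when either vector is constant."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    if vx == 0 or vy == 0:
        return 0.0
    return cov / sqrt(vx * vy)


def score_positions(seqs):
    """Correlate each column's distance matrix with the full-length one."""
    full = pairwise_distances(seqs)
    return [
        pearson(pairwise_distances(seqs, i), full)
        for i in range(len(seqs[0]))
    ]


# Hypothetical toy alignment: sequences 0-1 and 2-3 form two subfamilies.
alignment = [
    "ACDKG",
    "ACDKA",
    "ACERG",
    "ACERA",
]
scores = score_positions(alignment)
for i, s in enumerate(scores):
    print(f"column {i}: {s:.3f}")
```

In this toy case, columns 2 and 3 (conserved within each subfamily, different between them) score highest, fully conserved columns score zero, and column 4 (variable within subfamilies) is uncorrelated with the full-length distances.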
Pages: 1289-1302
Page count: 14