Coevolving protein residues: Maximum likelihood identification and relationship to structure

Cited by: 182
Authors
Pollock, DD
Taylor, WR
Goldman, N
Affiliations
[1] Univ Cambridge, Dept Genet, Cambridge CB2 3EH, England
[2] Natl Inst Med Res, Div Math Biol, London NW7 1AA, England
Funding
Wellcome Trust (UK);
Keywords
coevolution; protein residues; protein structure; maximum likelihood; molecular evolution;
DOI
10.1006/jmbi.1998.2601
Chinese Library Classification (CLC) codes
Q5 [Biochemistry]; Q7 [Molecular Biology];
Discipline classification codes
071010 ; 081704 ;
Abstract
The identification of protein sites undergoing correlated evolution (coevolution) is of great interest due to the possibility that these pairs will tend to be adjacent in the three-dimensional structure. Identification of such pairs should provide useful information for understanding the evolutionary process, predicting the effects of site-directed substitution, and potentially for predicting protein structure. Here, we develop and apply a maximum likelihood method with the aim of improving detection of coevolution. Unlike previous methods, which have had limited success, this method allows for correlations induced by phylogenetic relationships and for variation in rate of evolution along branches, and does not rely on accurate reconstruction of ancestral nodes. In order to reduce the complexity of coevolutionary relationships and identify the primary component of pairwise coevolution between two sites, we reduce the data to a two-state system at each site, regardless of the actual number of residues observed at that site. Simulations show that this strategy is good at identifying simple correlations and at recognizing cases in which the data are insufficient to distinguish between coevolution and spurious correlations. The new method was tested by using size and charge characteristics to group the residues at each site, and then evaluating coevolution in myoglobin sequences. Grouping based on physicochemical characteristics allows categorization of coevolving sites into positive and negative coevolution, depending on the correlation between equilibrium state frequencies. We detected a striking excess of negative coevolution (corresponding to charge) at sites brought into proximity by the periodicity of the alpha-helix, and there was also a tendency for sites with significant likelihood ratios to be close in the three-dimensional structure. Sites on the surface of the protein appear to coevolve both when they are close in the structure, and when they are distant, implying a role for folding and/or avoidance of quaternary structure in the coevolution process. © 1999 Academic Press.
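The two-state reduction described in the abstract can be illustrated with a minimal Python sketch. This is not the paper's maximum likelihood method: it covers only the recoding step and then computes a naive phi correlation across sequences, which ignores the phylogenetic relationships the paper explicitly accounts for. The charged/uncharged grouping (`CHARGED`), the function names, and the toy columns are all assumptions made for illustration.

```python
# Hypothetical grouping: charged versus uncharged residues.
# This is an illustrative assumption, not the paper's exact scheme.
CHARGED = set("DEKR")


def to_two_state(column):
    """Recode one alignment column as 1 (charged) or 0 (uncharged)."""
    return [1 if aa in CHARGED else 0 for aa in column]


def joint_counts(site_a, site_b):
    """Count the four joint states (a, b) over paired sequences."""
    counts = {(0, 0): 0, (0, 1): 0, (1, 0): 0, (1, 1): 0}
    for a, b in zip(site_a, site_b):
        counts[(a, b)] += 1
    return counts


def phi_coefficient(counts):
    """Naive phi correlation for a 2x2 table; ignores phylogeny entirely."""
    n11, n10 = counts[(1, 1)], counts[(1, 0)]
    n01, n00 = counts[(0, 1)], counts[(0, 0)]
    row1, row0 = n11 + n10, n01 + n00
    col1, col0 = n11 + n01, n10 + n00
    denom = (row1 * row0 * col1 * col0) ** 0.5
    if denom == 0:
        return 0.0  # one site is invariant, so no correlation is defined
    return (n11 * n00 - n10 * n01) / denom


# Toy columns (one character per sequence), invented for this example.
site_a = to_two_state("DKAG")
site_b = to_two_state("AGDK")
phi = phi_coefficient(joint_counts(site_a, site_b))
```

A negative `phi` here means the two sites tend to occupy opposite states. Because sequences related by a tree are not independent samples, a raw correlation of this kind can be spurious, which is the problem the likelihood approach in the abstract is designed to address.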
Pages: 187-198
Number of pages: 12