The selection of acceptable protein mutations

Cited by: 26
Authors
Sasidharan, Rajkumar
Chothia, Cyrus
Affiliations
[1] MRC, Mol Biol Lab, Cambridge CB2 2QH, England
[2] Yale Univ, Dept Biochem & Mol Biophys, New Haven, CT 06520 USA
Funding
UK Medical Research Council
Keywords
codon frequencies; distribution of mutations in protein structure; sequence-structure divergence
DOI
10.1073/pnas.0703737104
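Chinese Library Classification
O [Mathematical sciences and chemistry]; P [Astronomy and earth sciences]; Q [Biological sciences]; N [General natural sciences]
Discipline classification codes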
07 ; 0710 ; 09 ;
Abstract
We have determined the general constraints that govern sequence divergence in proteins that retain entirely, or very largely, the same structure and function. To do this we collected data from three different groups of orthologous sequences: those found in humans and mice, in humans and chickens, and in Escherichia coli and Salmonella enterica. In total, these organisms have 21,738 suitable pairs of orthologs, and these contain nearly 2 million mutations. The three groups differ greatly in the taxa from which they come and/or in the time that separates them from their last common ancestor. Nevertheless, the results we obtain from the three different groups are strikingly similar. For each group, the orthologous sequence pairs were assigned to six different divergence categories on the basis of their sequence identities. For categories with the same divergence, common accepted mutations have similar frequencies and rank orders in the three groups. With divergence, the width of the range of common mutations grows in the same manner in each group. We examined the distribution of mutations in protein structures. With increasing divergence, mutations increase at different rates in the buried, intermediate, and exposed regions of protein structures in a manner that explains the exponential relationship between the divergence of structure and sequence. This work implies that commonly allowed mutations are selected by a set of general constraints that are well defined and whose nature varies with divergence.
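The binning step described in the abstract (assigning each ortholog pair to one of six divergence categories by sequence identity, then tallying accepted mutations) can be illustrated with a minimal Python sketch. It is not the authors' pipeline: the category cut-offs in `CATEGORY_LOWER_BOUNDS`, the function names, and the convention of ignoring gapped alignment columns are all assumptions made for illustration, since the abstract does not state them.

```python
from bisect import bisect_right
from collections import Counter

# Hypothetical lower bounds (percent identity) separating six categories.
# The abstract does not give the cut-offs actually used in the paper.
CATEGORY_LOWER_BOUNDS = [50.0, 60.0, 70.0, 80.0, 90.0]


def percent_identity(seq_a: str, seq_b: str, gap: str = "-") -> float:
    """Percent identity over aligned columns where neither sequence has a gap."""
    if len(seq_a) != len(seq_b):
        raise ValueError("aligned sequences must have equal length")
    compared = [(a, b) for a, b in zip(seq_a, seq_b) if a != gap and b != gap]
    if not compared:
        raise ValueError("no ungapped aligned positions to compare")
    matches = sum(1 for a, b in compared if a == b)
    return 100.0 * matches / len(compared)


def divergence_category(identity: float) -> int:
    """Return a category index 0..5; 5 is the least diverged (assumed) bin."""
    return bisect_right(CATEGORY_LOWER_BOUNDS, identity)


def count_substitutions(seq_a: str, seq_b: str, gap: str = "-") -> Counter:
    """Tally unordered residue pairs at mismatched, ungapped aligned columns."""
    counts = Counter()
    for a, b in zip(seq_a, seq_b):
        if a != gap and b != gap and a != b:
            counts[tuple(sorted((a, b)))] += 1
    return counts
```

Calling `count_substitutions(...).most_common()` on the pooled pairs of one category would give a rank order of substitutions, which is the kind of quantity the abstract compares across the three groups of organisms.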
Pages: 10080-10085
Page count: 6