Insights from modeling protein evolution with context-dependent mutation and asymmetric amino acid selection

Cited by: 4
Authors
Saunders, Christopher T. [1]
Green, Phil [1, 2]
Affiliations
[1] Univ Washington, Dept Genome Sci, Seattle, WA 98195 USA
[2] Howard Hughes Med Inst, Seattle, WA USA
Keywords
protein evolution; amino acid selection; context-dependent mutation; phylogenetic analysis; protein expression; protein structure
DOI
10.1093/molbev/msm190
Chinese Library Classification (CLC) codes
Q5 [Biochemistry]; Q7 [Molecular Biology]
Discipline classification codes
071010; 081704
Abstract
We develop an approximate maximum likelihood method to estimate flanking nucleotide context-dependent mutation rates and amino acid exchange-dependent selection in orthologous protein-coding sequences and use it to analyze genome-wide coding sequence alignments from mammals and yeast. Allowing context-dependent mutation provides a better fit to coding sequence data than simpler (context-independent or CpG "hotspot") models and significantly affects selection parameter estimates. Allowing asymmetric (nonreciprocal) selection on amino acid exchanges gives a better fit than simple dN/dS or symmetric selection models. Relative selection strength estimates from our models show good agreement with independent estimates derived from human disease-causing and engineered mutations. Selection strengths depend on local protein structure, showing expected biophysical trends in helical versus nonhelical regions and increased asymmetry on polar-hydrophobic exchanges with increased burial. The more stringent selection that has previously been observed for highly expressed proteins is primarily concentrated in buried regions, supporting the notion that such proteins are under stronger than average selection for stability. Our analyses indicate that a highly parameterized model of mutation and selection is computationally tractable and is a useful tool for exploring a variety of biological questions concerning protein and coding sequence evolution.
Pages: 2632-2647
Page count: 16