Site interdependence attributed to tertiary structure in amino acid sequence evolution

Cited by: 71
Authors
Rodrigue, N
Lartillot, N
Bryant, D
Philippe, H
Affiliations
[1] Univ Montreal, Canadian Inst Adv Res, Dept Biochim, Montreal, PQ H3C 3J7, Canada
[2] Lab Informat Robot & Microelect Montpellier, Montpellier, France
[3] McGill Univ, McGill Ctr Bioinformat, Montreal, PQ, Canada
Keywords
protein evolution; phylogenetics; Bayesian Markov chain Monte Carlo; statistical potentials
DOI
10.1016/j.gene.2004.12.011
Chinese Library Classification number
Q3 [Genetics]
Subject classification code
071007; 090102
Abstract
Standard likelihood-based frameworks in phylogenetics consider the process of evolution of a sequence site by site. Assuming that sites evolve independently greatly simplifies the required calculations. However, this simplification is known to be incorrect in many cases. Here, a computational method that allows for general dependence between sites of a sequence is investigated. Using this method, measures acting as sequence fitness proxies can be considered over a phylogenetic tree. In this work, a set of statistically derived amino acid pairwise potentials, developed in the context of protein threading, is used to account for what we call the structural fitness of a sequence. We describe a model combining statistical potentials with an empirical amino acid substitution matrix. We propose such a combination as a useful way of capturing the complexity of protein evolution. Finally, we outline features of the model using three datasets and show the approach's sensitivity to different tree topologies. (c) 2004 Elsevier B.V. All rights reserved.
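
As a rough illustration of how a pairwise potential can act as a sequence fitness proxy, the sketch below scores a sequence by summing pair terms over residues that are in contact in a fixed structure. Everything in it is an assumption made for illustration: the three-letter alphabet, the values in `TOY_POTENTIAL`, the contact list, and the function names are hypothetical and are not the potentials or the model used in the paper.

```python
# Hypothetical toy pairwise potential over a 3-letter alphabet.
# The numbers are invented for illustration only; lower means more favorable.
TOY_POTENTIAL = {
    ("A", "A"): -0.1,
    ("A", "L"): -0.3,
    ("A", "K"): 0.2,
    ("L", "L"): -0.8,
    ("K", "L"): 0.4,
    ("K", "K"): 0.6,
}


def pair_energy(a, b, potential=TOY_POTENTIAL):
    """Look up the symmetric pair term for residues a and b."""
    key = tuple(sorted((a, b)))  # keys are stored in alphabetical order
    return potential[key]


def structural_energy(sequence, contacts, potential=TOY_POTENTIAL):
    """Sum pair terms over all (i, j) site pairs in contact in the structure."""
    return sum(pair_energy(sequence[i], sequence[j], potential) for i, j in contacts)


# Assumed contact map: sites 0-2 and 1-3 are close in the folded structure.
contacts = [(0, 2), (1, 3)]

before = structural_energy("ALLK", contacts)
after = structural_energy("LLLK", contacts)  # substitution A -> L at site 0
print(before, after)
```

In this toy setting the effect of a substitution at site 0 depends on the residue at site 2, which is the kind of dependence between sites that a site-by-site likelihood cannot represent.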
Pages: 207-217
Page count: 11