Towards realistic codon models: among site variability and dependency of synonymous and non-synonymous rates

Cited by: 51
Authors
Mayrose, Itay [1]
Doron-Faigenboim, Adi [1]
Bacharach, Eran [1]
Pupko, Tal [1]
Affiliations
[1] Tel Aviv Univ, George S Wise Fac Life Sci, Dept Cell Res & Immunol, IL-69978 Tel Aviv, Israel
Keywords
DOI
10.1093/bioinformatics/btm176
Chinese Library Classification
Q5 [Biochemistry]
Discipline classification codes
071010; 081704
Abstract
Codon evolutionary models are widely used to infer the selection forces acting on a protein. The non-synonymous to synonymous rate ratio (denoted by Ka/Ks) is used to identify specific positions that are under purifying or positive selection. Current evolutionary models usually assume that only the non-synonymous rates vary among sites, while the synonymous substitution rates are constant. This assumption ignores the possibility of selection forces acting at the DNA or mRNA levels. Towards a more realistic description of sequence evolution, we present a model that accounts for among-site variation of both synonymous and non-synonymous substitution rates. Furthermore, we relax the widespread assumption that positions evolve independently of each other. Thus, possible sources of bias caused by random fluctuations in either the synonymous or non-synonymous rate estimates at a single site are removed. Our model is based on two hidden Markov models that operate on the spatial dimension: one describes the dependency between adjacent non-synonymous rates, while the other describes the dependency between adjacent synonymous rates. The presented model is applied to study the selection pressure across the HIV-1 genome. The new model describes the evolution of all HIV-1 genes better than current codon models do. Using both simulations and real data analyses, we illustrate that accounting for synonymous rate variability and dependency greatly increases the accuracy of Ka/Ks estimation, in particular for positively selected sites. Finally, we discuss the applicability of the developed model to infer the selection forces in regulatory and overlapping regions of the HIV-1 genome.
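The spatial dependency described in the abstract can be illustrated with a minimal sketch of a single hidden Markov chain over rate categories along the sequence. This is not the authors' implementation: it covers one chain only (the paper combines two, one for synonymous and one for non-synonymous rates), and it assumes a simple autocorrelation parameterization in which a site keeps its neighbour's rate category with extra weight `lam`. The function names, `lam`, and the per-site conditional likelihoods `site_lik` are hypothetical inputs chosen for illustration.

```python
import math

def autocorrelated_transitions(pi, lam):
    """Transition matrix between rate categories of adjacent sites.

    Assumed form (illustrative, not taken from the paper):
    P(i -> j) = lam * [i == j] + (1 - lam) * pi[j].
    """
    k = len(pi)
    return [[lam * (1.0 if i == j else 0.0) + (1.0 - lam) * pi[j]
             for j in range(k)] for i in range(k)]

def hmm_log_likelihood(site_lik, pi, lam):
    """Scaled forward algorithm over sites.

    site_lik[t][j] is the (hypothetical, precomputed) likelihood of the
    alignment column t given rate category j; pi is the stationary
    distribution over categories.
    """
    trans = autocorrelated_transitions(pi, lam)
    k = len(pi)
    log_lik = 0.0
    alpha = None  # normalized forward probabilities of the previous site
    for lik in site_lik:
        if alpha is None:
            prior = list(pi)  # the first site uses the stationary distribution
        else:
            prior = [sum(alpha[i] * trans[i][j] for i in range(k))
                     for j in range(k)]
        joint = [prior[j] * lik[j] for j in range(k)]
        scale = sum(joint)
        log_lik += math.log(scale)  # accumulate in log space to avoid underflow
        alpha = [x / scale for x in joint]
    return log_lik
```

With `lam = 0` the sketch reduces to the usual independent-sites mixture over rate categories, and with `lam = 1` all sites share one category, which gives two easy sanity checks on any implementation of this kind.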
Pages: I319-I327
Number of pages: 9
Related papers
35 in total
[1]   Accuracy and power of Bayes prediction of amino acid sites under positive selection [J].
Anisimova, M ;
Bielawski, JP ;
Yang, ZH .
MOLECULAR BIOLOGY AND EVOLUTION, 2002, 19 (06) :950-958
[2]   Accuracy and power of the likelihood ratio test in detecting adaptive molecular evolution [J].
Anisimova, M ;
Bielawski, JP ;
Yang, ZH .
MOLECULAR BIOLOGY AND EVOLUTION, 2001, 18 (08) :1585-1592
[3]  
Burnham K.P., 2002, Model selection and multimodel inference: a practical information-theoretic approach, DOI 10.1007/978-1-4757-2917-7_3
[4]   Hearing silence: non-neutral evolution at synonymous sites in mammals [J].
Chamary, JV ;
Parmley, JL ;
Hurst, LD .
NATURE REVIEWS GENETICS, 2006, 7 (02) :98-108
[5]   A SINGLE-STRANDED GAP IN HUMAN-IMMUNODEFICIENCY-VIRUS UNINTEGRATED LINEAR DNA DEFINED BY A CENTRAL COPY OF THE POLYPURINE TRACT [J].
CHARNEAU, P ;
CLAVEL, F .
JOURNAL OF VIROLOGY, 1991, 65 (05) :2415-2421
[6]   HIV-1 REVERSE TRANSCRIPTION - A TERMINATION STEP AT THE CENTER OF THE GENOME [J].
CHARNEAU, P ;
MIRAMBEAU, G ;
ROUX, P ;
PAULOUS, S ;
BUC, H ;
CLAVEL, F .
JOURNAL OF MOLECULAR BIOLOGY, 1994, 241 (05) :651-662
[7]   Mapping sites of positive selection and amino acid diversification in the HIV genome: An alternative approach to vaccine design? [J].
de Oliveira, T ;
Salemi, M ;
Gordon, M ;
Vandamme, AM ;
van Rensburg, E ;
Engelbrecht, S ;
Coovadia, HM ;
Cassol, S .
GENETICS, 2004, 167 (03) :1047-1058
[8]  
Durbin R., 1998, BIOL SEQUENCE ANAL
[9]   A hidden Markov Model approach to variation among sites in rate of evolution [J].
Felsenstein, J ;
Churchill, GA .
MOLECULAR BIOLOGY AND EVOLUTION, 1996, 13 (01) :93-104
[10]   EVOLUTIONARY TREES FROM DNA-SEQUENCES - A MAXIMUM-LIKELIHOOD APPROACH [J].
FELSENSTEIN, J .
JOURNAL OF MOLECULAR EVOLUTION, 1981, 17 (06) :368-376