A nucleotide substitution model with nearest-neighbour interactions

Cited by: 67
Authors
Lunter, Gerton [1 ]
Hein, Jotun [1 ]
Affiliations
[1] Univ Oxford, Dept Stat, Bioinformat Grp, Oxford OX1 3TG, England
DOI
10.1093/bioinformatics/bth901
Chinese Library Classification number
Q5 [Biochemistry]
Subject classification codes
071010; 081704
Abstract
Motivation: It is well known that neighbouring nucleotides in DNA sequences do not mutate independently of each other. In this paper, we introduce a context-dependent substitution model and derive an algorithm to calculate the likelihood of sequences evolving under this model. We use this algorithm to estimate neighbour-dependent substitution rates, as well as rates for dinucleotide substitutions, using a Bayesian sampling procedure. The model is irreversible, giving an arrow to time and allowing the position of the root between a pair of sequences to be inferred without using out-groups.
Results: We applied the model to aligned human-mouse non-coding data. Clear neighbour dependencies were observed, including 17-18-fold increased CpG to TpG/CpA rates compared with other substitutions. Root inference positioned the root halfway between the mouse and human tips, suggesting an approximately clock-like behaviour of the irreversible part of the substitution process.
Pages: 216-223
Number of pages: 8