The estimation of relative site variability among aligned homologous protein sequences

Cited by: 12
Authors
Horner, DS [1]
Pesole, G [1]
Affiliations
[1] Univ Milan, Dipartimento Fisiol & Biochim Gen, I-20113 Milan, Italy
DOI
10.1093/bioinformatics/btg063
Chinese Library Classification (CLC) number
Q5 [Biochemistry];
Discipline classification code
071010 ; 081704 ;
Abstract
Motivation: Maximum likelihood-based methods to estimate site-by-site substitution rate variability in aligned homologous protein sequences rely on the formulation of a phylogenetic tree and generally assume that the patterns of relative variability follow a pre-determined distribution. We present a phylogenetic tree-independent method to estimate the relative variability of individual sites within large datasets of homologous protein sequences. It is based upon two simple assumptions: first, that substitutions observed between two closely related sequences are likely, in general, to occur at the most variable sites; second, that non-conservative amino acid substitutions tend to occur at more variable sites. Our methodology makes no assumptions regarding the underlying pattern of relative variability between sites.
Results: Using data simulated under a non-gamma distributed model, we compared the performance of this approach to that of a maximum likelihood method that assumes gamma distributed rates. At low mean rates of evolution, our method inferred site-by-site relative substitution rates more accurately than the maximum likelihood approach in the absence of prior assumptions about the relationships between sequences. Our method does not directly account for the effects of mutational saturation. However, we have incorporated an 'ad hoc' modification that allows the accurate estimation of relative site variability in fast-evolving and saturated datasets.
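The two assumptions stated in the abstract can be illustrated with a small sketch. This is not the published algorithm, whose scoring details the abstract does not give. The amino acid grouping, the `max_distance` cutoff, the `nonconservative_weight` value and the `(1 - d)` weighting are all hypothetical choices made only to show how a tree-independent, pairwise score might be built.

```python
from itertools import combinations

# Hypothetical coarse physicochemical groups (an assumption, not from the paper).
GROUPS = ["AVLIMC", "FWY", "STNQ", "KRH", "DE", "G", "P"]
GROUP_OF = {aa: i for i, group in enumerate(GROUPS) for aa in group}


def fraction_different(a, b):
    """Fraction of ungapped aligned positions at which two sequences differ."""
    pairs = [(x, y) for x, y in zip(a, b) if x != "-" and y != "-"]
    if not pairs:
        return 1.0
    return sum(x != y for x, y in pairs) / len(pairs)


def relative_site_variability(seqs, max_distance=0.3, nonconservative_weight=2.0):
    """Score each alignment column from differences between close sequence pairs.

    Assumption 1: only closely related pairs (distance <= max_distance) are
    used, and closer pairs contribute more via the factor (1 - d).
    Assumption 2: substitutions between different groups (non-conservative)
    count more than substitutions within a group.
    Scores are scaled so that their mean over all sites is 1.
    """
    n_sites = len(seqs[0])
    scores = [0.0] * n_sites
    for a, b in combinations(seqs, 2):
        d = fraction_different(a, b)
        if d == 0 or d > max_distance:
            continue  # identical pairs carry no signal; distant pairs are skipped
        for i, (x, y) in enumerate(zip(a, b)):
            if x == y or x == "-" or y == "-":
                continue
            same_group = x in GROUP_OF and GROUP_OF[x] == GROUP_OF.get(y)
            weight = 1.0 if same_group else nonconservative_weight
            scores[i] += weight * (1.0 - d)
    total = sum(scores)
    if total == 0:
        return scores
    mean = total / n_sites
    return [s / mean for s in scores]


# Toy alignment: column 4 has a non-conservative change (A/G),
# column 5 a conservative one (D/E).
toy = ["MKVLAD", "MKVLAE", "MKVLGD"]
rates = relative_site_variability(toy)
```

In the toy alignment, the pair that differs at two of six positions exceeds the cutoff and is ignored, so only single-difference pairs contribute. The column with the non-conservative change receives a higher relative score than the column with the conservative change, and invariant columns score zero.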
Pages: 600-606
Page count: 7