Prediction of function divergence in protein families using the substitution rate variation parameter alpha

Cited by: 8
Authors
Abhiman, Saraswathi [1]
Daub, Carsten O. [1]
Sonnhammer, Erik L. L. [1]
Affiliations
[1] Karolinska Inst, Ctr Genom & Bioinformat, Stockholm, Sweden
Keywords
protein evolution; adaptive evolution; enzyme; protein function; protein subfamily; substitution rates; gamma distribution; alpha parameter; function shift
DOI
10.1093/molbev/msl002
Chinese Library Classification (CLC) number
Q5 [Biochemistry]; Q7 [Molecular Biology]
Discipline classification code
071010; 081704
Abstract
Protein families typically embody a range of related functions and may thus be decomposed into subfamilies with, for example, distinct substrate specificities. Detection of functionally divergent subfamilies is possible by methods for recognizing branches of adaptive evolution in a gene tree. As the number of genome sequences is growing rapidly, it is highly desirable to automatically detect subfamily function divergence. To this end, we here introduce a method for large-scale prediction of function divergence within protein families. It is called the alpha shift measure (ASM) as it is based on detecting a shift in the shape parameter (alpha [α]) of the substitution rate gamma distribution. Four different methods for estimating α were investigated. We benchmarked the accuracy of ASM using function annotation from Enzyme Commission numbers within Pfam protein families divided into subfamilies by the automatic tree-based method BETE. In a test using 563 subfamily pairs in 162 families, ASM outperformed functional site-based methods using rate or conservation shifting (rate shift measure [RSM] and conservation shift measure [CSM]). The best results were obtained using the "GZ-Gamma" method for estimating α. By combining ASM with RSM and CSM using linear discriminant analysis, the prediction accuracy was further improved.
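To make the role of the gamma shape parameter concrete, the following is a minimal sketch, not the ASM method itself. It assumes per-site substitution rates are already available for two subfamilies (here they are simulated), and it estimates the shape parameter by a simple method of moments (mean squared divided by variance). This estimator is a stand-in chosen for brevity; it is not one of the four estimators examined in the paper (such as "GZ-Gamma"), and the abstract does not give the exact ASM score. The function names `moment_alpha` and `simulate_site_rates` are hypothetical.

```python
import random
import statistics


def moment_alpha(rates):
    """Method-of-moments estimate of the gamma shape parameter: mean^2 / variance."""
    mean = statistics.fmean(rates)
    var = statistics.pvariance(rates, mu=mean)
    if var == 0:
        raise ValueError("rates have zero variance; the shape parameter is undefined")
    return mean * mean / var


def simulate_site_rates(alpha, n_sites, seed):
    """Draw per-site rates from a gamma distribution with shape alpha and mean 1."""
    rng = random.Random(seed)
    # random.gammavariate takes (shape, scale); scale = 1/alpha gives mean 1.
    return [rng.gammavariate(alpha, 1.0 / alpha) for _ in range(n_sites)]


# Two hypothetical subfamilies (assumed values, for illustration only):
# a small alpha means strong rate variation among sites,
# a larger alpha means more uniform rates.
rates_a = simulate_site_rates(alpha=0.5, n_sites=5000, seed=1)
rates_b = simulate_site_rates(alpha=2.0, n_sites=5000, seed=2)

alpha_a = moment_alpha(rates_a)
alpha_b = moment_alpha(rates_b)

# A large difference between the two estimates is the kind of signal
# an alpha-shift approach looks for between subfamilies.
alpha_difference = abs(alpha_a - alpha_b)
print(f"alpha_a={alpha_a:.2f} alpha_b={alpha_b:.2f} difference={alpha_difference:.2f}")
```

In practice the shape parameter would be estimated from an alignment and a tree for each subfamily rather than from known rates, so this sketch only illustrates what a shift in α means.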
Pages: 1406-1413
Number of pages: 8