Detecting coevolution without phylogenetic trees? Tree-ignorant metrics of coevolution perform as well as tree-aware metrics

Cited by: 25
Authors
Caporaso, J. Gregory [2]
Smit, Sandra [3]
Easton, Brett C. [4]
Hunter, Lawrence [5]
Huttley, Gavin A. [4]
Knight, Rob [1]
Affiliations
[1] Univ Colorado, Dept Chem & Biochem, Boulder, CO 80309 USA
[2] Univ Colorado Denver, Dept Biochem & Mol Genet, Aurora, CO USA
[3] Vrije Univ Amsterdam, Ctr Integrat Bioinformat VU IBIVU, NL-1081 HV Amsterdam, Netherlands
[4] Australian Natl Univ, John Curtin Sch Med Res, Computat Genom Lab, Canberra, ACT 2601, Australia
[5] Univ Colorado Denver, Ctr Computat Pharmacol, Aurora, CO USA
Funding
National Health and Medical Research Council (Australia); Australian Research Council
DOI
10.1186/1471-2148-8-327
Chinese Library Classification number
Q [Biological Sciences]
Discipline classification codes
07; 0710; 09
Abstract
Background: Identifying coevolving positions in protein sequences has myriad applications, ranging from understanding and predicting the structure of single molecules to generating proteome-wide predictions of interactions. Algorithms for detecting coevolving positions can be classified into two categories: tree-aware, which incorporate knowledge of phylogeny, and tree-ignorant, which do not. Tree-ignorant methods are frequently orders of magnitude faster, but are widely held to be insufficiently accurate because they confound shared ancestry with coevolution. We conjectured that by using a null distribution that appropriately controls for the shared-ancestry signal, tree-ignorant methods would exhibit statistical power equivalent to that of tree-aware methods. Using a novel t-test transformation of coevolution metrics, we systematically compared four tree-aware and five tree-ignorant coevolution algorithms, applying them to myoglobin and myosin. We further considered the influence of sequence recoding using reduced-state amino acid alphabets, a common tactic employed in coevolutionary analyses to improve both statistical and computational performance.

Results: Consistent with our conjecture, the transformed tree-ignorant metrics (particularly Mutual Information) often outperformed the tree-aware metrics. Our examination of the effect of recoding suggested that charge-based alphabets were generally superior for identifying the stabilizing interactions in alpha helices. Performance was not always improved by recoding, however, indicating that the choice of alphabet is critical.

Conclusion: The results suggest that t-test transformation of tree-ignorant metrics can be sufficient to control for patterns arising from shared ancestry.
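
As a rough illustration of the tree-ignorant approach named in the abstract, the sketch below computes Mutual Information between two alignment columns, with and without recoding to a reduced charge-based alphabet. It is a minimal sketch under stated assumptions, not the authors' implementation: the three-state alphabet (`CHARGE_ALPHABET`), the function names, and the toy columns are all hypothetical, and the paper's t-test transformation is not reproduced here because the abstract does not specify it.

```python
from collections import Counter
from math import log2

# Hypothetical three-state charge alphabet (an assumption for illustration;
# not necessarily the alphabet used in the paper). Unlisted residues map to "0".
CHARGE_ALPHABET = {
    **{aa: "+" for aa in "KRH"},
    **{aa: "-" for aa in "DE"},
}


def recode(column, mapping=CHARGE_ALPHABET, default="0"):
    """Map each residue in an alignment column to its reduced-alphabet state."""
    return [mapping.get(residue, default) for residue in column]


def entropy(symbols):
    """Shannon entropy, in bits, of a sequence of hashable symbols."""
    n = len(symbols)
    return -sum((c / n) * log2(c / n) for c in Counter(symbols).values())


def mutual_information(col_a, col_b):
    """MI(A;B) = H(A) + H(B) - H(A,B) for two alignment columns."""
    if len(col_a) != len(col_b):
        raise ValueError("columns must contain the same number of sequences")
    joint = list(zip(col_a, col_b))
    return entropy(col_a) + entropy(col_b) - entropy(joint)


# Toy columns (made-up data): one residue per aligned sequence.
col_i = list("KKDDKKDD")
col_j = list("EERREERR")  # varies in lockstep with col_i
col_k = list("KDKDKDKD")  # varies independently of col_i

mi_raw = mutual_information(col_i, col_j)
mi_recoded = mutual_information(recode(col_i), recode(col_j))
mi_independent = mutual_information(col_i, col_k)
```

Raw Mutual Information of this kind ignores phylogeny entirely, which is why the abstract stresses the need for a null distribution that controls for the shared-ancestry signal before scores are compared.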
Pages: 25