Detecting coevolution without phylogenetic trees? Tree-ignorant metrics of coevolution perform as well as tree-aware metrics

Cited by: 25
Authors
Caporaso, J. Gregory [2]
Smit, Sandra [3]
Easton, Brett C. [4]
Hunter, Lawrence [5]
Huttley, Gavin A. [4]
Knight, Rob [1]
Affiliations
[1] Univ Colorado, Dept Chem & Biochem, Boulder, CO 80309 USA
[2] Univ Colorado Denver, Dept Biochem & Mol Genet, Aurora, CO USA
[3] Vrije Univ Amsterdam, Ctr Integrat Bioinformat VU IBIVU, NL-1081 HV Amsterdam, Netherlands
[4] Australian Natl Univ, John Curtin Sch Med Res, Computat Genom Lab, Canberra, ACT 2601, Australia
[5] Univ Colorado Denver, Ctr Computat Pharmacol, Aurora, CO USA
Funding
National Health and Medical Research Council (Australia); Australian Research Council
DOI
10.1186/1471-2148-8-327
Chinese Library Classification
Q [Biological Sciences]
Discipline classification codes
07; 0710; 09
Abstract
Background: Identifying coevolving positions in protein sequences has myriad applications, ranging from understanding and predicting the structure of single molecules to generating proteome-wide predictions of interactions. Algorithms for detecting coevolving positions can be classified into two categories: tree-aware, which incorporate knowledge of phylogeny, and tree-ignorant, which do not. Tree-ignorant methods are frequently orders of magnitude faster, but are widely held to be insufficiently accurate because they confound shared ancestry with coevolution. We conjectured that by using a null distribution that appropriately controls for the shared-ancestry signal, tree-ignorant methods would exhibit statistical power equivalent to that of tree-aware methods. Using a novel t-test transformation of coevolution metrics, we systematically compared four tree-aware and five tree-ignorant coevolution algorithms, applying them to myoglobin and myosin. We further considered the influence of sequence recoding using reduced-state amino acid alphabets, a common tactic employed in coevolutionary analyses to improve both statistical and computational performance.
Results: Consistent with our conjecture, the transformed tree-ignorant metrics (particularly Mutual Information) often outperformed the tree-aware metrics. Our examination of the effect of recoding suggested that charge-based alphabets were generally superior for identifying the stabilizing interactions in alpha helices. Recoding did not always improve performance, however, indicating that the choice of alphabet is critical.
Conclusion: The results suggest that t-test transformation of tree-ignorant metrics can be sufficient to control for patterns arising from shared ancestry.
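As a rough illustration of the kind of tree-ignorant metric the abstract mentions, the following minimal Python sketch computes the standard mutual information between two alignment columns, optionally after recoding residues into a reduced alphabet. The function names, the `CHARGE_ALPHABET` mapping, and the default symbol are assumptions made for this example; they are not taken from the paper, and the paper's t-test transformation is not reproduced here because the abstract does not specify it.

```python
from collections import Counter
from math import log2

# Hypothetical charge-based reduced alphabet (an assumption for illustration;
# the alphabets evaluated in the paper are not listed in the abstract).
CHARGE_ALPHABET = {"K": "+", "R": "+", "D": "-", "E": "-"}


def recode(column, mapping=CHARGE_ALPHABET, default="0"):
    """Map each residue to its reduced-alphabet symbol (default if unmapped)."""
    return [mapping.get(residue, default) for residue in column]


def mutual_information(col_a, col_b):
    """Mutual information (in bits) between two equal-length alignment columns."""
    if len(col_a) != len(col_b):
        raise ValueError("columns must have the same number of sequences")
    n = len(col_a)
    counts_a = Counter(col_a)
    counts_b = Counter(col_b)
    counts_ab = Counter(zip(col_a, col_b))
    mi = 0.0
    for (a, b), joint in counts_ab.items():
        # p(a,b) / (p(a) * p(b)) simplifies to joint * n / (count_a * count_b)
        mi += (joint / n) * log2(joint * n / (counts_a[a] * counts_b[b]))
    return mi


# Two toy columns in which K always pairs with E and vice versa
col_i = list("KKEE")
col_j = list("EEKK")
print(mutual_information(col_i, col_j))
print(mutual_information(recode(col_i), recode(col_j)))
```

Because this score ignores the tree, sequences that share recent ancestry can inflate it; that confounding is what the abstract's null-distribution approach is intended to control for.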
Pages: 25