Using multiple interdependency to separate functional from phylogenetic correlations in protein alignments

Cited by: 98
Authors
Tillier, ERM [1]
Lui, TWH [1]
Affiliations
[1] Univ Hlth Network, Ontario Canc Inst, Toronto, ON M5G 2M9, Canada
DOI
10.1093/bioinformatics/btg072
Chinese Library Classification (CLC) number
Q5 [Biochemistry];
Discipline classification codes
071010; 081704
Abstract
Motivation: Multiple sequence alignments of homologous proteins are useful for inferring their phylogenetic history and for revealing functionally important regions in the proteins. Functional constraints may lead to co-variation of two or more amino acids in the sequence, such that a substitution at one site is accompanied by compensatory substitutions at another site. Finding the statistical correlations between sites in the alignment is not sufficient, because these may arise from several undetermined causes. In particular, phylogenetic clustering will lead to many strong correlations.
Results: A procedure is developed to detect statistical correlations stemming from functional interaction by removing the strong phylogenetic signal that leads to the correlations of each site with many others in the sequence. The method relies upon the accuracy of the alignment, but it does not require any assumptions about the phylogeny or the substitution process. The effectiveness of the method was verified using computer simulations, and the method was then applied to predict functional interactions between amino acids in the Pfam database of alignments.
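To make "statistical correlations between sites" concrete, here is a minimal sketch of one common raw measure, the mutual information between two alignment columns. It is not the procedure developed in the paper, which goes further and removes the phylogenetic signal. The function name `column_mutual_information` and the toy alignment are hypothetical, chosen only for illustration.

```python
from collections import Counter
from math import log2


def column_mutual_information(col_a, col_b):
    """Mutual information (in bits) between two alignment columns.

    Illustrative raw measure only; it makes no correction for
    phylogenetic clustering, which the abstract identifies as a
    major source of strong correlations.
    """
    if len(col_a) != len(col_b):
        raise ValueError("columns must cover the same sequences")
    n = len(col_a)
    freq_a = Counter(col_a)              # residue counts at site a
    freq_b = Counter(col_b)              # residue counts at site b
    freq_ab = Counter(zip(col_a, col_b)) # joint residue-pair counts
    mi = 0.0
    for (a, b), count in freq_ab.items():
        p_ab = count / n
        # p_ab / (p_a * p_b), written with raw counts
        mi += p_ab * log2(p_ab * n * n / (freq_a[a] * freq_b[b]))
    return mi


# Hypothetical toy alignment: four sequences, four sites.
alignment = ["AKLD", "AKLE", "GRLD", "GRLE"]
columns = list(zip(*alignment))

mi_01 = column_mutual_information(columns[0], columns[1])  # sites co-vary
mi_03 = column_mutual_information(columns[0], columns[3])  # sites independent
```

In this toy case sites 0 and 1 co-vary perfectly, but the score alone cannot tell whether that reflects a functional constraint or simply two clades sharing ancestry, which is the ambiguity the abstract describes.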
Pages: 750-755
Number of pages: 6
Related papers
19 in total
  • [1] AIMON AL, 2002, PNAS, V99, P2912
  • [2] ASH RB, 1965, INFORMATION THEORY
  • [3] Correlations among amino acid sites in bHLH protein domains: An information theoretic analysis
    Atchley, WR
    Wollenberg, KR
    Fitch, WM
    Terhalle, W
    Dress, AW
    [J]. MOLECULAR BIOLOGY AND EVOLUTION, 2000, 17 (01) : 164 - 178
  • [4] Barnett V., 1994, Outliers in Statistical Data, 3rd ed.
  • [5] Bateman A, 2004, NUCLEIC ACIDS RES, V32, pD138, DOI 10.1093/nar/gkh121
  • [6] The Protein Data Bank
    Berman, HM
    Westbrook, J
    Feng, Z
    Gilliland, G
    Bhat, TN
    Weissig, H
    Shindyalov, IN
    Bourne, PE
    [J]. NUCLEIC ACIDS RESEARCH, 2000, 28 (01) : 235 - 242
  • [7] Chiu DKY, 2000, PROCEEDINGS OF THE FIFTH JOINT CONFERENCE ON INFORMATION SCIENCES, VOLS 1 AND 2, pA815
  • [8] CHIU DKY, 1991, COMPUT APPL BIOSCI, V7, P347
  • [9] Felsenstein J., 2001, PHYLIP PHYLOGENY INF
  • [10] IDENTIFYING CONSTRAINTS ON THE HIGHER-ORDER STRUCTURE OF RNA - CONTINUED DEVELOPMENT AND APPLICATION OF COMPARATIVE SEQUENCE-ANALYSIS METHODS
    GUTELL, RR
    POWER, A
    HERTZ, GZ
    PUTZ, EJ
    STORMO, GD
    [J]. NUCLEIC ACIDS RESEARCH, 1992, 20 (21) : 5785 - 5795