COMPASS: A tool for comparison of multiple protein alignments with assessment of statistical significance

Cited by: 207
Authors
Sadreyev, R
Grishin, N
Affiliations
[1] Univ Texas, SW Med Ctr, Howard Hughes Med Inst, Dallas, TX 75390 USA
[2] Univ Texas, SW Med Ctr, Dept Biochem, Dallas, TX 75390 USA
Keywords
sequence similarity searches; profile-profile comparison; sequence profiles; protein structure prediction; CTF/NFI;
DOI
10.1016/S0022-2836(02)01371-2
Chinese Library Classification (CLC) number
Q5 [Biochemistry]; Q7 [Molecular Biology];
Discipline classification code
071010 ; 081704 ;
Abstract
We present a novel method for the comparison of multiple protein alignments with assessment of statistical significance (COMPASS). The method derives numerical profiles from alignments, constructs optimal local profile-profile alignments and analytically estimates E-values for the detected similarities. The scoring system and E-value calculation are based on a generalization of the PSI-BLAST approach to profile-sequence comparison, which is adapted for the profile-profile case. Tested alongside existing methods for profile-sequence (PSI-BLAST) and profile-profile (prof_sim) comparison, COMPASS detects remote sequence similarities with greater sensitivity and selectivity and produces higher-quality local alignments. The method allows prediction of relationships between protein families in the PFAM database beyond the range of conventional methods. Two predicted relations with high significance are similarities between various Rossmann-type folds and between various helix-turn-helix-containing families. The potential value of COMPASS for structure/function predictions is illustrated by the detection of an intricate homology between the DNA-binding domain of the CTF/NFI family and the MH1 domain of the Smad family. © 2003 Elsevier Science Ltd. All rights reserved.
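The two computational steps named in the abstract, optimal local alignment and analytic E-value estimation, can be illustrated with a minimal sketch. This is not the COMPASS implementation: the column-pair scores are taken as a precomputed matrix, the linear gap penalty is a simplification, and the function names and the values of `lam` and `k` are illustrative assumptions. The E-value uses the standard Karlin-Altschul form E = K·m·n·exp(−λS) that underlies BLAST-style statistics.

```python
import math


def best_local_score(col_scores, gap=-4):
    """Smith-Waterman best local score over precomputed column-pair scores.

    col_scores[i][j] is the score for aligning column i of one profile with
    column j of the other (how COMPASS derives these is not shown here).
    A linear gap penalty is an assumption made to keep the sketch short.
    """
    cols = len(col_scores[0])
    prev = [0] * (cols + 1)
    best = 0
    for row in col_scores:
        cur = [0] * (cols + 1)
        for j in range(cols):
            cur[j + 1] = max(
                0,                  # start a new local alignment
                prev[j] + row[j],   # align the two columns
                prev[j + 1] + gap,  # gap in the second profile
                cur[j] + gap,       # gap in the first profile
            )
            best = max(best, cur[j + 1])
        prev = cur
    return best


def karlin_altschul_evalue(score, m, n, lam, k):
    """Expected number of chance local alignments scoring at least `score`
    between profiles of lengths m and n (E = K * m * n * exp(-lambda * S))."""
    return k * m * n * math.exp(-lam * score)


def bit_score(score, lam, k):
    """Normalized score S' such that E = m * n * 2 ** (-S')."""
    return (lam * score - math.log(k)) / math.log(2)


# Hypothetical inputs: a toy 2x2 score matrix and illustrative lam/k values.
toy_scores = [[5, -1], [-1, 5]]
s = best_local_score(toy_scores)
e = karlin_altschul_evalue(s, m=100, n=100, lam=0.267, k=0.041)
print(s, e)
```

In practice λ and K depend on the scoring system and must be estimated for it; the constants above only make the example executable.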
Pages: 317-336
Page count: 20