Within the twilight zone: A sensitive profile-profile comparison tool based on information theory

Cited by: 203
Authors
Yona, G [1]
Levitt, M [1]
Affiliation
[1] Stanford Univ, Dept Biol Struct, Stanford, CA 94305 USA
Keywords
profile-profile comparison; PSI-BLAST; structural alignment; remote homologies
DOI
10.1006/jmbi.2001.5293
Chinese Library Classification (CLC) codes
Q5 [Biochemistry]; Q7 [Molecular Biology]
Subject classification codes
071010; 081704
Abstract
This paper presents a novel approach to profile-profile comparison. The method compares two input profiles (like those that are generated by PSI-BLAST) and assigns a similarity score to assess their statistical similarity. Our profile-profile comparison tool, which allows for gaps, can be used to detect weak similarities between protein families. It has also been optimized to produce alignments that are in very good agreement with structural alignments. Tests show that the profile-profile alignments are indeed highly correlated with similarities between secondary structure elements and tertiary structure. Exhaustive evaluations show that our method is significantly more sensitive in detecting distant homologies than the popular profile-based search programs PSI-BLAST and IMPALA. The relative improvement is the same order of magnitude as the improvement of PSI-BLAST relative to BLAST. Our new tool often detects similarities that fall within the twilight zone of sequence similarity. (C) 2002 Elsevier Science Ltd.
Pages: 1257-1275
Page count: 19
Related papers
57 in total
  • [31] Lyngso R B, 1999, Proc Int Conf Intell Syst Mol Biol, P178
  • [32] Protein folds and functions
    Martin, AC
    Orengo, CA
    Hutchinson, EG
    Jones, S
    Karmirantzou, M
    Laskowski, RA
    Mitchell, JB
    Taroni, C
    Thornton, JM
    [J]. STRUCTURE, 1998, 6 (07) : 875 - 884
  • [33] OB(OLIGONUCLEOTIDE OLIGOSACCHARIDE BINDING)-FOLD - COMMON STRUCTURAL AND FUNCTIONAL SOLUTION FOR NONHOMOLOGOUS SEQUENCES
    MURZIN, AG
    [J]. EMBO JOURNAL, 1993, 12 (03) : 861 - 867
  • [34] MURZIN AG, 1995, J MOL BIOL, V247, P536, DOI 10.1016/S0022-2836(05)80134-2
  • [35] A GENERAL METHOD APPLICABLE TO SEARCH FOR SIMILARITIES IN AMINO ACID SEQUENCE OF 2 PROTEINS
    NEEDLEMAN, SB
    WUNSCH, CD
    [J]. JOURNAL OF MOLECULAR BIOLOGY, 1970, 48 (03) : 443 - +
  • [36] CATH - a hierarchic classification of protein domain structures
    Orengo, CA
    Michie, AD
    Jones, S
    Jones, DT
    Swindells, MB
    Thornton, JM
    [J]. STRUCTURE, 1997, 5 (08) : 1093 - 1108
  • [37] Sequence comparisons using multiple sequences detect three times as many remote homologues as pairwise methods
    Park, J
    Karplus, K
    Barrett, C
    Hughey, R
    Haussler, D
    Hubbard, T
    Chothia, C
    [J]. JOURNAL OF MOLECULAR BIOLOGY, 1998, 284 (04) : 1201 - 1210
  • [38] Pearson WR, 1997, COMPUT APPL BIOSCI, V13, P325
  • [39] Empirical statistical estimates for sequence similarity searches
    Pearson, WR
    [J]. JOURNAL OF MOLECULAR BIOLOGY, 1998, 276 (01) : 71 - 84
  • [40] Searching databases of conserved sequence regions by aligning protein multiple-alignments
    Pietrokovski, S
    [J]. NUCLEIC ACIDS RESEARCH, 1996, 24 (19) : 3836 - 3845