Multiple sequence information for threading algorithms

Cited by: 29
Authors
Defay, TR
Cohen, FE
Affiliations
[1] UNIV CALIF SAN FRANCISCO,DEPT MOL & CELLULAR PHARMACOL,SAN FRANCISCO,CA 94143
[2] UNIV CALIF SAN FRANCISCO,DEPT MED,SAN FRANCISCO,CA 94143
[3] UNIV CALIF SAN FRANCISCO,DEPT BIOCHEM & BIOPHYS,SAN FRANCISCO,CA 94143
[4] UNIV CALIF SAN FRANCISCO,DEPT PHARMACEUT CHEM,SAN FRANCISCO,CA 94143
[5] UNIV CALIF SAN FRANCISCO,GRAD GRP BIOPHYS,SAN FRANCISCO,CA 94143
Keywords
protein structure prediction; threading; fold recognition; relative solvent accessibility; multiple sequence alignments;
DOI
10.1006/jmbi.1996.0515
Chinese Library Classification number
Q5 [Biochemistry]; Q7 [Molecular Biology];
Discipline classification code
071010 ; 081704 ;
Abstract
Threading algorithms attempt to solve the inverse protein folding problem: given a group of structures and a sequence, identify the structure that is most compatible with this sequence. A recent study of this class of algorithms by S. J. Wodak and colleagues suggests that while threading algorithms are capable of recognizing many folding motifs, their performance in truly blind predictions is disappointing, and the underlying alignments upon which the selections are based are frequently errant. To help overcome this problem we have developed a Test of Optimal Mutagenesis algorithm (TOM) that exploits information inherent in the variation between several homologues in a multiple sequence alignment. This information is used to help select the correct structural motif for the sequence from a database of known structures. A total of 305 high-resolution structures were selected to represent the set of known folds; 56 proteins were chosen that had at least one close structural match in this set. To test TOM, we attempted to determine which of the 305 folds was a match to each of the 56 protein sequences. TOM correctly predicts a close structural match for 45% of these proteins. THREADER, an algorithm chosen as a literature standard, correctly matched 20% of the test set. By comparing the performance of TOM, THREADER, and TOM NOVAR (a version of TOM without variability information), we conclude that the tendency of an amino acid to be buried or exposed is the dominant determinant of the success of threading algorithms. In addition, the structural alignments produced by TOM suggest that the exact alignment of just 30 to 50% of the residues in a sequence with the correct fold is necessary to select it as the highest scoring match in a set of folds. (C) 1996 Academic Press Limited
Pages: 314-323
Number of pages: 10
Related papers
21 in total
  • [1] BONA-FIDE PREDICTION OF ASPECTS OF PROTEIN CONFORMATION - ASSIGNING INTERIOR AND SURFACE RESIDUES FROM PATTERNS OF VARIATION AND CONSERVATION IN HOMOLOGOUS PROTEIN SEQUENCES
    BENNER, SA
    BADCOE, I
    COHEN, MA
    GERLOFF, DL
    [J]. JOURNAL OF MOLECULAR BIOLOGY, 1994, 235 (03) : 926 - 958
  • [2] A METHOD TO IDENTIFY PROTEIN SEQUENCES THAT FOLD INTO A KNOWN 3-DIMENSIONAL STRUCTURE
    BOWIE, JU
    LUTHY, R
    EISENBERG, D
    [J]. SCIENCE, 1991, 253 (5016) : 164 - 170
  • [3] IDENTIFICATION OF PROTEIN FOLDS - MATCHING HYDROPHOBICITY PATTERNS OF SEQUENCE SETS WITH SOLVENT ACCESSIBILITY PATTERNS OF KNOWN STRUCTURES
    BOWIE, JU
    CLARKE, ND
    PABO, CO
    SAUER, RT
    [J]. PROTEINS-STRUCTURE FUNCTION AND GENETICS, 1990, 7 (03) : 257 - 264
  • [4] DAYHOFF MO, 1972, GENE DUPLICATIONS EV
  • [5] A surface of minimum area metric for the structural comparison of proteins
    Falicov, A
    Cohen, FE
    [J]. JOURNAL OF MOLECULAR BIOLOGY, 1996, 258 (05) : 871 - 892
  • [6] Fischer D, 1996, PROTEIN SCI, V5, P947
  • [7] SEQUENCE STRUCTURE MATCHING IN GLOBULAR-PROTEINS - APPLICATION TO SUPERSECONDARY AND TERTIARY STRUCTURE DETERMINATION
    GODZIK, A
    SKOLNICK, J
    [J]. PROCEEDINGS OF THE NATIONAL ACADEMY OF SCIENCES OF THE UNITED STATES OF AMERICA, 1992, 89 (24) : 12098 - 12102
  • [8] IDENTIFICATION OF NATIVE PROTEIN FOLDS AMONGST A LARGE NUMBER OF INCORRECT MODELS - THE CALCULATION OF LOW-ENERGY CONFORMATIONS FROM POTENTIALS OF MEAN FORCE
    HENDLICH, M
    LACKNER, P
    WEITCKUS, S
    FLOECKNER, H
    FROSCHAUER, R
    GOTTSBACHER, K
    CASARI, G
    SIPPL, MJ
    [J]. JOURNAL OF MOLECULAR BIOLOGY, 1990, 216 (01) : 167 - 180
  • [9] HOBOHM U, 1992, PROTEIN SCI, V1, P409
  • [10] A NEW APPROACH TO PROTEIN FOLD RECOGNITION
    JONES, DT
    TAYLOR, WR
    THORNTON, JM
    [J]. NATURE, 1992, 358 (6381) : 86 - 89