Combination of threading potentials and sequence profiles improves fold recognition

Cited by: 93
Authors
Panchenko, AR [1]
Marchler-Bauer, A [1]
Bryant, SH [1]
Affiliations
[1] Natl Ctr Biotechnol Informat, Computat Biol Branch, NIH, Bethesda, MD 20894 USA
Funding
National Institutes of Health (USA)
Keywords
protein evolution; fold recognition; threading; sequence profiles
DOI
10.1006/jmbi.2000.3541
Chinese Library Classification (CLC) numbers
Q5 [Biochemistry]; Q7 [Molecular Biology]
Subject classification codes
071010; 081704
Abstract
Using a benchmark set of structurally similar proteins, we conduct a series of threading experiments intended to identify a scoring function with an optimal combination of contact-potential and sequence-profile terms. The benchmark set is selected to include many medium-difficulty fold recognition targets, where sequence similarity is undetectable by BLAST but structural similarity is extensive. The contact potential is based on the log-odds of non-local contacts involving different amino acid pairs, in native as opposed to randomly compacted structures. The sequence profile term is that used in PSI-BLAST. We find that combination of these terms significantly improves the success rate of fold recognition over use of either term alone, with respect to both recognition sensitivity and the accuracy of threading models. Improvement is greatest for targets between 10% and 20% sequence identity and 60% to 80% superimposable residues, where the number of models crossing critical accuracy and significance thresholds more than doubles. We suggest that these improvements account for the successful performance of the combined scoring function at CASP3. We discuss possible explanations as to why sequence-profile and contact-potential terms appear complementary.
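The abstract describes two ingredients: a contact potential derived from log-odds of residue-pair contacts (native versus a reference set), and a sequence-profile term, combined into one score. The sketch below illustrates that general idea only. The function names, the pseudocount, the linear weighting with `weight`, and the ungapped one-to-one alignment are all assumptions made for illustration; the abstract does not specify the authors' actual functional form or parameters.

```python
import math


def log_odds_potential(native_counts, reference_counts, pseudocount=1.0):
    """Log-odds score per residue-pair type: ln(P_native / P_reference).

    Both arguments map a sorted residue pair, e.g. ("A", "L"), to a contact
    count. The pseudocount (an assumption) avoids division by zero.
    """
    pairs = set(native_counts) | set(reference_counts)
    native_total = sum(native_counts.values()) + pseudocount * len(pairs)
    reference_total = sum(reference_counts.values()) + pseudocount * len(pairs)
    potential = {}
    for pair in pairs:
        p_native = (native_counts.get(pair, 0) + pseudocount) / native_total
        p_reference = (reference_counts.get(pair, 0) + pseudocount) / reference_total
        potential[pair] = math.log(p_native / p_reference)
    return potential


def combined_score(sequence, contacts, potential, profile, weight=0.5):
    """Hypothetical weighted sum of a contact term and a profile term.

    sequence: query residues placed one-to-one on template positions
              (ungapped alignment, an assumption of this sketch).
    contacts: list of (i, j) template position pairs that are in contact.
    profile:  one dict per template position mapping residue -> profile score.
    """
    contact_term = sum(
        potential.get(tuple(sorted((sequence[i], sequence[j]))), 0.0)
        for i, j in contacts
    )
    profile_term = sum(
        profile[k].get(residue, 0.0) for k, residue in enumerate(sequence)
    )
    return weight * contact_term + (1.0 - weight) * profile_term


# Toy usage with made-up counts and profile scores.
toy_potential = log_odds_potential(
    {("A", "L"): 3, ("A", "A"): 1},
    {("A", "L"): 1, ("A", "A"): 3},
)
toy_score = combined_score(
    "AL", [(0, 1)], toy_potential, [{"A": 2.0}, {"L": 1.0}], weight=0.5
)
```

Setting `weight` to 1.0 or 0.0 recovers the contact-only or profile-only score, which is the kind of comparison the abstract reports against the combined function.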
Pages: 1319-1331
Page count: 13