A novel approach to fold recognition using sequence-derived properties from sets of structurally similar local fragments of proteins

Cited by: 28
Authors
Hvidsten, Torgeir R. [1, 2]
Kryshtafovych, Andriy [2 ]
Komorowski, Jan [1 ]
Fidelis, Krzysztof [2 ]
Affiliations
[1] Uppsala Univ, Linnaeus Ctr Bioinformat, Uppsala, Sweden
[2] Lawrence Livermore Natl Lab, Livermore, CA USA
DOI
10.1093/bioinformatics/btg1064
Chinese Library Classification number
Q5 [Biochemistry]
Discipline classification codes
071010; 081704
Abstract
Comparative modeling methods can consistently produce reliable structural models for protein sequences with more than 25% sequence identity to proteins of known structure. However, sequences with lower sequence identity are also likely to have their structural components represented in structural databases. To exploit this, we present a novel fragment-based method that uses sets of structurally similar local fragments of proteins. The approach differs from other fragment-based methods, which use only single backbone fragments: we use a library of groups, each containing a set of sequence fragments with geometrically similar local structures, and extract sequence-related properties to assign these specific geometrical conformations to target sequences. We test the ability of the approach to recognize the correct SCOP fold for 273 sequences from the 49 most popular folds. Of these sequences, 49% have the correct fold as their top prediction, and 82% have the correct fold among their top five predictions. Moreover, the approach shows no performance reduction on a subset of target sequences with less than 10% sequence identity to any protein used to build the library.
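The top-prediction and top-five figures quoted in the abstract are top-k accuracies over ranked fold predictions. The following is a minimal sketch of how such a score could be computed, not the authors' evaluation code. The function name `top_k_accuracy`, the toy rankings, and the fold labels are illustrative assumptions and do not come from the paper.

```python
from typing import Sequence


def top_k_accuracy(
    ranked_predictions: Sequence[Sequence[str]],
    true_folds: Sequence[str],
    k: int,
) -> float:
    """Return the fraction of targets whose true fold is among the first k predictions.

    ranked_predictions[i] lists fold labels for target i, best guess first.
    """
    if len(ranked_predictions) != len(true_folds):
        raise ValueError("need exactly one ranked list per target")
    if not true_folds:
        raise ValueError("need at least one target")
    if k < 1:
        raise ValueError("k must be at least 1")

    hits = sum(
        1
        for ranked, truth in zip(ranked_predictions, true_folds)
        if truth in ranked[:k]
    )
    return hits / len(true_folds)


# Hypothetical toy data: four targets, three candidate folds each.
# The labels only imitate SCOP-style fold identifiers.
ranked = [
    ["a.1", "b.1", "c.1"],
    ["b.1", "a.1", "c.1"],
    ["c.1", "b.1", "a.1"],
    ["a.1", "c.1", "b.1"],
]
truth = ["a.1", "a.1", "a.1", "b.1"]

for k in (1, 2, 3):
    print(f"top-{k} accuracy: {top_k_accuracy(ranked, truth, k):.2f}")
```

With k = 1 this corresponds to the "correct fold as top prediction" measure, and with k = 5 to the "correct fold among the top five predictions" measure.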
Pages: ii81-ii91
Number of pages: 11