MANIFOLD: protein fold recognition based on secondary structure, sequence similarity and enzyme classification

Cited by: 31
Authors
Bindewald, E
Cestaro, A
Hesser, J
Heiler, M
Tosatto, SCE
Affiliations
[1] Univ Mannheim, D-68131 Mannheim, Germany
[2] SUNY Buffalo, Ctr Excellence Bioinformat, Buffalo, NY 14203 USA
[3] Univ Padua, CRIBI, Ctr Biotechnol, I-35121 Padua, Italy
[4] Univ Mannheim, Chair Comp Vis Graph & Pattern Recognit, D-68131 Mannheim, Germany
Source
PROTEIN ENGINEERING | 2003 / Vol. 16 / Issue 11
Keywords
classification; enzyme code; protein fold recognition; secondary structure
DOI
10.1093/protein/gzg106
Chinese Library Classification (CLC) number
Q5 [Biochemistry]; Q7 [Molecular Biology]
Discipline classification code
071010; 081704
Abstract
We present a protein fold recognition method, MANIFOLD, which uses the similarity between target and template proteins in predicted secondary structure, sequence and enzyme code to predict the fold of the target protein. We developed a non-linear ranking scheme to combine the scores of the three similarity measures. For a difficult test set of proteins with very little sequence similarity, the program predicts the fold class correctly in 34% of cases. This is a more than twofold increase in accuracy compared with sequence-based methods such as PSI-BLAST or GenTHREADER, which score 13-14% correct first hits for the same test set. The functional similarity term increases the prediction accuracy by up to 3% compared with using the combination of secondary structure similarity and PSI-BLAST alone. We argue that using functional and secondary structure information can increase fold recognition accuracy beyond what sequence similarity alone achieves.
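The abstract does not describe the ranking scheme itself, so the following is only an illustrative sketch of one way three similarity scores could be combined non-linearly by rank. The rank-product rule, the function names `ranks` and `combine_by_rank`, and the template identifiers are all assumptions made for illustration; they are not taken from the paper and should not be read as the MANIFOLD scheme.

```python
from typing import Dict, List


def ranks(scores: Dict[str, float]) -> Dict[str, int]:
    """Rank templates within one similarity measure; 1 = highest score."""
    ordered = sorted(scores, key=lambda t: scores[t], reverse=True)
    return {template: i + 1 for i, template in enumerate(ordered)}


def combine_by_rank(measures: List[Dict[str, float]]) -> List[str]:
    """Order templates by the product of their per-measure ranks.

    The rank product is a hypothetical non-linear combination rule chosen
    for this sketch. All measures must score the same set of templates.
    """
    per_measure = [ranks(m) for m in measures]
    templates = list(per_measure[0])
    combined = {t: 1 for t in templates}
    for measure_ranks in per_measure:
        for t in templates:
            combined[t] *= measure_ranks[t]
    # Smallest rank product first; ties are broken by template name.
    return sorted(combined, key=lambda t: (combined[t], t))


# Hypothetical scores for three templates under the three measures.
secondary_structure = {"A": 0.9, "B": 0.7, "C": 0.2}
sequence = {"A": 0.1, "B": 0.8, "C": 0.3}
enzyme_code = {"A": 0.0, "B": 1.0, "C": 0.5}

best_first = combine_by_rank([secondary_structure, sequence, enzyme_code])
print(best_first)
```

Working on ranks rather than raw scores avoids having to put the three measures on a common scale, which is one plausible motivation for a rank-based combination.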
Pages: 785-789
Page count: 5