Development and large scale benchmark testing of the PROSPECTOR_3 threading algorithm

Cited by: 124
Authors
Skolnick, J [1]
Kihara, D [1]
Zhang, Y [1]
Affiliations
[1] SUNY Buffalo, Ctr Excellence Bioinformat, Buffalo, NY 14203 USA
Keywords
protein structure prediction; fold recognition; structural alignment; weakly homologous/analogous proteins; M. genitalium; E. coli; S. cerevisiae; genomes
DOI
10.1002/prot.20106
Chinese Library Classification number
Q5 [Biochemistry]; Q7 [Molecular Biology]
Discipline classification code
071010; 081704
Abstract
This article describes the PROSPECTOR_3 threading algorithm, which combines various scoring functions designed to match structurally related target/template pairs. Each variant described was found to have a Z-score above which most identified templates have good structural (threading) alignments, Z(struct) (Z(good)). 'Easy' targets with accurate threading alignments are identified as single templates with Z > Z(good) or two templates, each with Z > Z(struct), having a good consensus structure in mutually aligned regions. 'Medium' targets have a pair of templates lacking a consensus structure, or a single template for which Z(struct) < Z < Z(good). PROSPECTOR_3 was applied to a comprehensive Protein Data Bank (PDB) benchmark composed of 1491 single-domain proteins, 41-200 residues long and no more than 30% identical to any threading template. Of the proteins, 878 were found to be easy targets, with 761 having a root mean square deviation (RMSD) from native of less than 6.5 Å. The average contact prediction accuracy was 46%, and on average, 17.6-residue continuous fragments were predicted with RMSD values of 2.0 Å. There were 606 medium targets identified, 87% (31%) of which had good structural (threading) alignments. On average, 9.1-residue continuous fragments with RMSD of 2.5 Å were predicted. Combining easy and medium sets, 63% (91%) of the targets had good threading (structural) alignments compared to native; the average target/template sequence identity was 22%. Only nine targets lacked matched templates. Moreover, PROSPECTOR_3 consistently outperforms PSI-BLAST. Similar results were predicted for open reading frames (ORFs) ≤200 residues in the M. genitalium, E. coli and S. cerevisiae genomes. Thus, progress has been made in identification of weakly homologous/analogous proteins, with very high alignment coverage, both in a comprehensive PDB benchmark and in genomes. © 2004 Wiley-Liss, Inc.
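The easy/medium decision rule stated in the abstract can be sketched in a few lines of Python. This is only an illustration of the rule as worded there, not the PROSPECTOR_3 implementation: the names `TemplateHit` and `classify_target` are hypothetical, the threshold values in the usage lines are made up (the abstract gives none, and says they differ per scoring variant), the consensus check is passed in as a boolean because the abstract does not say how it is computed, and "two templates" is read here as "at least two".

```python
from dataclasses import dataclass
from typing import List


@dataclass
class TemplateHit:
    """One threading hit for a target (hypothetical container)."""
    template_id: str
    z_score: float


def classify_target(hits: List[TemplateHit], z_struct: float, z_good: float,
                    has_consensus: bool) -> str:
    """Label a target 'easy', 'medium', or 'unassigned'.

    Assumptions not taken from the abstract: a single pair of thresholds is
    used, Z == z_good is treated as below z_good, and targets with no hit
    above z_struct are labeled 'unassigned'.
    """
    # Keep only templates whose Z-score clears the structural threshold.
    significant = [h for h in hits if h.z_score > z_struct]

    # A single template with Z > Z(good) is enough for an easy target.
    if any(h.z_score > z_good for h in significant):
        return "easy"

    # Two templates above Z(struct): easy with a good consensus structure
    # in mutually aligned regions, medium without one.
    if len(significant) >= 2:
        return "easy" if has_consensus else "medium"

    # A single template with Z(struct) < Z < Z(good) is a medium target.
    if len(significant) == 1:
        return "medium"

    return "unassigned"


# Usage with made-up thresholds and scores.
pair = [TemplateHit("tmplA", 7.0), TemplateHit("tmplB", 6.0)]
print(classify_target(pair, z_struct=5.0, z_good=10.0, has_consensus=True))
print(classify_target(pair, z_struct=5.0, z_good=10.0, has_consensus=False))
```

Passing the consensus result in as a flag keeps the sketch limited to what the abstract actually specifies; a real pipeline would have to compute it from the aligned template structures.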
Pages: 502-518
Page count: 17