Automated prediction of domain boundaries in CASP6 targets using Ginzu and RosettaDOM

Cited by: 63
Authors
Kim, DE [1]
Chivian, D [1]
Malmström, L [1]
Baker, D [1]
Affiliations
[1] Univ Washington, Dept Biochem, Seattle, WA 98195 USA
Keywords
domain prediction; domain parsing; domain identification; CASP; CAFASP; Rosetta; Robetta; protein structure prediction; ab initio modeling; de novo modeling; template-based modeling; comparative modeling; homology modeling
DOI
10.1002/prot.20737
Chinese Library Classification
Q5 [Biochemistry]; Q7 [Molecular Biology]
Subject classification codes
071010; 081704
Abstract
Domain boundary prediction is an important step in both experimental and computational protein structure characterization. We have developed two fully automated domain parsing methods: the first, Ginzu, which we have described previously, utilizes information from homologous sequences and structures, while the second, RosettaDOM, which has not been described previously, uses only information in the query sequence. Ginzu iteratively assigns domains by homology to structures and sequence families using successively less confident methods. RosettaDOM uses the Rosetta de novo structure prediction method to build three-dimensional models, and then applies Taylor's structure-based domain assignment method to parse the models into domains. Domain boundaries observed repeatedly in the models are predicted to be domain boundaries for the protein. Interestingly, RosettaDOM produced quite good domain predictions for proteins of a size typically considered to be beyond the reach of de novo structure prediction methods. For remote fold recognition targets and new folds, both Ginzu and RosettaDOM produced promising results, and in some cases where one method failed to detect the correct domain boundary, it was correctly identified by the other method. We describe here the successes and failures using both methods, and address the possibility of incorporating both protocols into an improved hybrid method.
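
The consensus step described in the abstract (boundaries that recur across many models are taken as predictions) can be illustrated with a minimal sketch. This is not the RosettaDOM implementation: it does not run Rosetta or Taylor's method, and it assumes each model has already been parsed into a list of boundary residue positions. The function name `consensus_boundaries` and the parameters `window` and `min_fraction` are hypothetical, and their default values are illustrative choices, not values taken from the paper.

```python
from statistics import median_low


def _support(pos, per_model, window):
    """Count models with at least one boundary within `window` residues of `pos`."""
    return sum(any(abs(b - pos) <= window for b in bounds) for bounds in per_model)


def consensus_boundaries(model_boundaries, window=10, min_fraction=0.5):
    """Return boundary positions that recur across models.

    model_boundaries: one list of boundary residue positions per model
                      (an empty list means the model was parsed as one domain).
    window:           hypothetical tolerance, in residues, for treating two
                      boundaries as the same one.
    min_fraction:     hypothetical minimum fraction of models that must
                      support a boundary for it to be reported.
    """
    n_models = len(model_boundaries)
    if n_models == 0:
        return []

    remaining = [set(bounds) for bounds in model_boundaries]
    consensus = []
    while True:
        candidates = sorted({b for bounds in remaining for b in bounds})
        if not candidates:
            break
        # Pick the best-supported candidate; ties go to the lowest position.
        best = max(candidates, key=lambda p: (_support(p, remaining, window), -p))
        if _support(best, remaining, window) / n_models < min_fraction:
            break
        # Report the median of the clustered positions, then drop the cluster.
        cluster = [b for bounds in remaining for b in bounds if abs(b - best) <= window]
        consensus.append(median_low(cluster))
        remaining = [{b for b in bounds if abs(b - best) > window} for bounds in remaining]
    return sorted(consensus)


if __name__ == "__main__":
    # Five hypothetical models: four agree on a boundary near residue 100.
    models = [[98], [102], [100, 200], [100], []]
    print(consensus_boundaries(models))
```

In this toy input the boundary near residue 100 is supported by four of five models and is kept, while the boundary at 200 appears in only one model and falls below the threshold. A real pipeline would need to tune both parameters against known domain assignments.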
Pages: 193-200
Number of pages: 8