Sampling Bottlenecks in De novo Protein Structure Prediction

Cited by: 89
Authors
Kim, David E. [1]
Blum, Ben [2]
Bradley, Philip [3]
Baker, David [1]
Affiliations
[1] Univ Washington, Howard Hughes Med Inst, Dept Biochem, Seattle, WA 98195 USA
[2] Univ Calif Berkeley, Dept Elect Engn & Comp Sci, Berkeley, CA 94720 USA
[3] Fred Hutchinson Canc Res Ctr, Program Computat Biol, Seattle, WA 98109 USA
Keywords
protein structure prediction; Rosetta; full-atom refinement; distributed computing; HIGH-RESOLUTION; SECONDARY STRUCTURE; TRANSITION-STATE; ROSETTA; CLASSIFICATION; RECOGNITION; REFINEMENT; FEATURES; PROGRESS; CRYSTAL
DOI
10.1016/j.jmb.2009.07.063
Chinese Library Classification number
Q5 [Biochemistry]; Q7 [Molecular Biology]
Discipline classification code
071010; 081704
Abstract
The primary obstacle to de novo protein structure prediction is conformational sampling: the native state generally has lower free energy than nonnative structures but is exceedingly difficult to locate. Structure predictions with atomic level accuracy have been made for small proteins using the Rosetta structure prediction method, but for larger and more complex proteins, the native state is virtually never sampled, and it has been unclear how much of an increase in computing power would be required to successfully predict the structures of such proteins. In this paper, we develop an approach to determining how much computer power is required to accurately predict the structure of a protein, based on a reformulation of the conformational search problem as a combinatorial sampling problem in a discrete feature space. We find that conformational sampling for many proteins is limited by critical "linchpin" features, often the backbone torsion angles of individual residues, which are sampled very rarely in unbiased trajectories and, when constrained, dramatically increase the sampling of the native state. These critical features frequently occur in less regular and likely strained regions of proteins that contribute to protein function. In a number of proteins, the linchpin features are in regions found experimentally to form late in folding, suggesting a correspondence between folding in silico and in reality. Published by Elsevier Ltd.
Pages: 249-260
Number of pages: 12