Beyond rotamers: a generative, probabilistic model of side chains in proteins

Cited by: 43
Authors
Harder, Tim [1 ]
Boomsma, Wouter [2 ,3 ]
Paluszewski, Martin [1 ]
Frellsen, Jes [1 ]
Johansson, Kristoffer E. [2 ,4 ]
Hamelryck, Thomas [1 ]
Affiliations
[1] Univ Copenhagen, Bioinformat Sect, Dept Biol, Copenhagen, Denmark
[2] Tech Univ Denmark, DTU Elektro, DK-2800 Lyngby, Denmark
[3] Univ Cambridge, Dept Chem, Cambridge CB2 1EW, England
[4] Univ Copenhagen, Sect Biomol Sci, Dept Biol, Copenhagen, Denmark
Source
BMC BIOINFORMATICS | 2010 / Volume 11
Keywords
FORCE-FIELD; CONFORMATIONS; ALGORITHM; PREDICTION; ENERGETICS;
DOI
10.1186/1471-2105-11-306
Chinese Library Classification number
Q5 [Biochemistry];
Discipline classification code
071010 ; 081704 ;
Abstract
Background: Accurately covering the conformational space of amino acid side chains is essential for important applications such as protein design, docking and high-resolution structure prediction. Today, the most common way to capture this conformational space is through rotamer libraries: discrete collections of side chain conformations derived from experimentally determined protein structures. The discretization can be exploited to search the conformational space efficiently. However, discretizing this naturally continuous space comes at the cost of losing detailed information that is crucial for certain applications. For example, rigorously combining rotamers with physical force fields is associated with numerous problems. Results: In this work we present BASILISK: a generative, probabilistic model of the conformational space of side chains that makes it possible to sample in continuous space. In addition, sampling can be conditional upon the protein's detailed backbone conformation, again in continuous space and without involving discretization. Conclusions: A careful analysis of the model and a comparison with various rotamer libraries indicate that the model forms an excellent, fully continuous model of side chain conformational space. We also illustrate how the model can be used for rigorous, unbiased sampling with a physical force field, and how it improves side chain prediction when used as a pseudo-energy term. In conclusion, BASILISK is an important step forward on the way to a rigorous probabilistic description of protein structure in continuous space and in atomic detail.
Pages: 13
Related papers
60 in total
[1]  
[Anonymous], 2006, Pattern recognition and machine learning
[2]   The Protein Data Bank [J].
Berman, HM ;
Westbrook, J ;
Feng, Z ;
Gilliland, G ;
Bhat, TN ;
Weissig, H ;
Shindyalov, IN ;
Bourne, PE .
NUCLEIC ACIDS RESEARCH, 2000, 28 (01) :235-242
[3]   A generative, probabilistic model of local protein structure [J].
Boomsma, Wouter ;
Mardia, Kanti V. ;
Taylor, Charles C. ;
Ferkinghoff-Borg, Jesper ;
Krogh, Anders ;
Hamelryck, Thomas .
PROCEEDINGS OF THE NATIONAL ACADEMY OF SCIENCES OF THE UNITED STATES OF AMERICA, 2008, 105 (26) :8932-8937
[4]  
Burnham KP., 2002, MODEL SELECTION MULT, DOI 10.1007/B97636
[5]   A graph-theory algorithm for rapid protein side-chain prediction [J].
Canutescu, AA ;
Shelenkov, AA ;
Dunbrack, RL .
PROTEIN SCIENCE, 2003, 12 (09) :2001-2014
[6]   Multitask learning [J].
Caruana, R .
MACHINE LEARNING, 1997, 28 (01) :41-75
[7]   HMM sampling and applications to gene finding and alternative splicing [J].
Cawley, Simon L. ;
Pachter, Lior .
BIOINFORMATICS, 2003, 19 :II36-II41
[8]  
CHANDRASEKARAN R, 1970, INT J PROT RES, V2, P223
[9]   Biopython: freely available Python tools for computational molecular biology and bioinformatics [J].
Cock, Peter J. A. ;
Antao, Tiago ;
Chang, Jeffrey T. ;
Chapman, Brad A. ;
Cox, Cymon J. ;
Dalke, Andrew ;
Friedberg, Iddo ;
Hamelryck, Thomas ;
Kauff, Frank ;
Wilczynski, Bartek ;
de Hoon, Michiel J. L. .
BIOINFORMATICS, 2009, 25 (11) :1422-1423
[10]   THE DEAD-END ELIMINATION THEOREM AND ITS USE IN PROTEIN SIDE-CHAIN POSITIONING [J].
DESMET, J ;
DEMAEYER, M ;
HAZES, B ;
LASTERS, I .
NATURE, 1992, 356 (6369) :539-542