Automatic Prediction of Protein 3D Structures by Probabilistic Multi-template Homology Modeling

Cited by: 95
Authors
Meier, Armin [1 ,2 ]
Soeding, Johannes [1 ,2 ]
Affiliations
[1] Max Planck Inst Biophys Chem, Quantitat & Computat Biol, D-37077 Gottingen, Germany
[2] Univ Munich, Gene Ctr, Munich, Germany
Keywords
FOLD RECOGNITION; ALIGNMENT;
DOI
10.1371/journal.pcbi.1004343
Chinese Library Classification (CLC) number
Q5 [Biochemistry];
Subject classification code
071010 ; 081704 ;
Abstract
Homology modeling predicts the 3D structure of a query protein based on the sequence alignment with one or more template proteins of known structure. Its great importance for biological research is owed to its speed, simplicity, reliability and wide applicability, covering more than half of the residues in protein sequence space. Although multiple templates have been shown to generally increase model quality over single templates, the information from multiple templates has so far been combined using empirically motivated, heuristic approaches. We present here a rigorous statistical framework for multi-template homology modeling. First, we find that the query proteins' atomic distance restraints can be accurately described by two-component Gaussian mixtures. This insight allowed us to apply the standard laws of probability theory to combine restraints from multiple templates. Second, we derive theoretically optimal weights to correct for the redundancy among related templates. Third, a heuristic template selection strategy is proposed. We improve the average GDT-HA model quality score by 11% over single template modeling and by 6.5% over a conventional multi-template approach on a set of 1000 query proteins. Robustness with respect to wrong constraints is likewise improved. We have integrated our multi-template modeling approach with the popular MODELLER homology modeling software in our free HHpred server http://toolkit.tuebingen.mpg.de/hhpred and also offer open source software for running MODELLER with the new restraints at https://bitbucket.org/soedinglab/hh-suite.
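The idea of combining distance restraints from several templates can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract gives no formulas, so the mixture parameterization (a narrow and a broad Gaussian sharing the template distance as their mean), the weighted sum of log densities, and all names and numeric values below are assumptions chosen for illustration only.

```python
import math

def gaussian_pdf(x, mean, sd):
    """Density of a normal distribution N(mean, sd^2) at x."""
    z = (x - mean) / sd
    return math.exp(-0.5 * z * z) / (sd * math.sqrt(2.0 * math.pi))

def mixture_pdf(d, template_dist, w_narrow, sd_narrow, sd_broad):
    """Two-component Gaussian mixture for a query distance d.

    Assumption: both components are centered on the distance observed
    in the template; the narrow one models a reliable restraint, the
    broad one absorbs alignment errors.
    """
    return (w_narrow * gaussian_pdf(d, template_dist, sd_narrow)
            + (1.0 - w_narrow) * gaussian_pdf(d, template_dist, sd_broad))

def combined_neg_log_density(d, restraints, weights):
    """Weighted sum of negative log mixture densities over templates.

    Assumption: template weights (hypothetical values here) scale each
    template's log density to discount redundant templates.
    """
    total = 0.0
    for params, w in zip(restraints, weights):
        total -= w * math.log(mixture_pdf(d, *params))
    return total

def best_distance(restraints, weights, lo=0.0, hi=30.0, step=0.01):
    """Grid search for the distance minimizing the combined score."""
    n = int(round((hi - lo) / step))
    grid = [lo + i * step for i in range(n + 1)]
    return min(grid, key=lambda d: combined_neg_log_density(d, restraints, weights))

# Hypothetical restraints: (template_dist, w_narrow, sd_narrow, sd_broad).
restraints = [
    (8.0, 0.7, 0.5, 5.0),   # template A
    (8.2, 0.7, 0.5, 5.0),   # template B, agrees with A
    (20.0, 0.7, 0.5, 5.0),  # template C, an outlier (e.g. a wrong alignment)
]
weights = [1.0, 1.0, 1.0]
consensus = best_distance(restraints, weights)
```

In this toy setting the broad component keeps the outlier template from dominating, so the minimum stays close to the distance supported by the two agreeing templates, which mirrors the robustness to wrong restraints that the abstract reports.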
Pages: 20