Trading accuracy for speed: A quantitative comparison of search algorithms in protein sequence design

Cited by: 139
Authors
Voigt, CA
Gordon, DB
Mayo, SL
Affiliations
[1] CALTECH, Howard Hughes Med Inst, Pasadena, CA 91125 USA
[2] CALTECH, Div Biol, Pasadena, CA 91125 USA
[3] CALTECH, Biochem Opt Div Biol, Pasadena, CA 91125 USA
[4] CALTECH, Biochem Opt Div Chem, Pasadena, CA 91125 USA
[5] CALTECH, Biochem Opt Div Chem Engn, Pasadena, CA 91125 USA
[6] CALTECH, Div Chem & Chem Engn, Pasadena, CA 91125 USA
Keywords
protein design; combinatorial optimization
DOI
10.1006/jmbi.2000.3758
Chinese Library Classification number
Q5 [Biochemistry]; Q7 [Molecular Biology]
Discipline classification code
071010; 081704
Abstract
Finding the minimum energy amino acid side-chain conformation is a fundamental problem in both homology modeling and protein design. To address this issue, numerous computational algorithms have been proposed. However, there have been few quantitative comparisons between methods and there is very little general understanding of the types of problems that are appropriate for each algorithm. Here, we study four common search techniques: Monte Carlo (MC) and Monte Carlo plus quench (MCQ); genetic algorithms (GA); self-consistent mean field (SCMF); and dead-end elimination (DEE). Both SCMF and DEE are deterministic, and if DEE converges, it is guaranteed that its solution is the global minimum energy conformation (GMEC). This provides a means to compare the accuracy of SCMF and the stochastic methods. For the side-chain placement calculations, we find that DEE rapidly converges to the GMEC in all the test cases. The other algorithms converge on significantly incorrect solutions; the average fraction of incorrect rotamers for SCMF is 0.12, GA 0.09, and MCQ 0.05. For the protein design calculations, design positions are progressively added to the side-chain placement calculation until the time required for DEE diverges sharply. As the complexity of the problem increases, the accuracy of each method is determined so that the results can be extrapolated into the region where DEE is no longer tractable. We find that both SCMF and MCQ perform reasonably well on core calculations (fraction of amino acids incorrect: SCMF 0.07, MCQ 0.04), but fail considerably on the boundary (SCMF 0.28, MCQ 0.32) and surface calculations (SCMF 0.37, MCQ 0.44). © 2000 Academic Press.
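The comparison described in the abstract (a stochastic search scored against a known GMEC by the fraction of incorrect rotamers) can be illustrated with a minimal Python sketch. Everything here is an assumption made for illustration and is not taken from the paper: the energies are random toy numbers, exhaustive enumeration stands in for DEE as the source of the exact GMEC (feasible only for tiny problems), and the function names, step count, and fixed temperature are hypothetical choices.

```python
import itertools
import math
import random


def random_problem(n_pos, n_rot, seed=0):
    """Toy energies (hypothetical): self_e[i][a] and pair_e[i][j][a][b], used for i < j."""
    rng = random.Random(seed)
    self_e = [[rng.uniform(-1, 1) for _ in range(n_rot)] for _ in range(n_pos)]
    pair_e = [[[[rng.uniform(-1, 1) for _ in range(n_rot)] for _ in range(n_rot)]
               for _ in range(n_pos)] for _ in range(n_pos)]
    return self_e, pair_e


def total_energy(conf, self_e, pair_e):
    """Sum of self energies plus pairwise energies for one rotamer assignment."""
    n = len(conf)
    e = sum(self_e[i][conf[i]] for i in range(n))
    for i in range(n):
        for j in range(i + 1, n):
            e += pair_e[i][j][conf[i]][conf[j]]
    return e


def brute_force_gmec(self_e, pair_e):
    """Exact minimum by enumeration; a stand-in for DEE on tiny problems only."""
    choices = [range(len(rots)) for rots in self_e]
    return min(itertools.product(*choices),
               key=lambda c: total_energy(c, self_e, pair_e))


def quench(conf, self_e, pair_e):
    """Greedy descent: move each position to its best rotamer until nothing improves."""
    conf = list(conf)
    improved = True
    while improved:
        improved = False
        for i in range(len(conf)):
            current = total_energy(conf, self_e, pair_e)
            for r in range(len(self_e[i])):
                trial = conf[:i] + [r] + conf[i + 1:]
                e = total_energy(trial, self_e, pair_e)
                if e < current - 1e-12:
                    conf, current, improved = trial, e, True
    return tuple(conf)


def monte_carlo_quench(self_e, pair_e, steps=2000, temperature=1.0, seed=0):
    """Metropolis sampling at a fixed temperature, then a quench of the best state seen."""
    rng = random.Random(seed)
    n = len(self_e)
    conf = [rng.randrange(len(self_e[i])) for i in range(n)]
    e = total_energy(conf, self_e, pair_e)
    best, best_e = list(conf), e
    for _ in range(steps):
        i = rng.randrange(n)
        trial = list(conf)
        trial[i] = rng.randrange(len(self_e[i]))
        te = total_energy(trial, self_e, pair_e)
        # Accept downhill moves always, uphill moves with Boltzmann probability.
        if te <= e or rng.random() < math.exp(-(te - e) / temperature):
            conf, e = trial, te
            if e < best_e:
                best, best_e = list(conf), e
    return quench(best, self_e, pair_e)


def fraction_incorrect(conf, reference):
    """Fraction of positions whose rotamer differs from the reference (e.g. the GMEC)."""
    return sum(a != b for a, b in zip(conf, reference)) / len(reference)


if __name__ == "__main__":
    self_e, pair_e = random_problem(n_pos=6, n_rot=3, seed=1)
    gmec = brute_force_gmec(self_e, pair_e)
    mcq = monte_carlo_quench(self_e, pair_e, seed=1)
    print("fraction incorrect:", fraction_incorrect(mcq, gmec))
```

On a problem this small the stochastic search will often recover the GMEC; the accuracy gap the abstract reports appears as the search space grows beyond what an exact method can handle.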
Pages: 789-803
Page count: 15