Engineering proteinase K using machine learning and synthetic genes

Cited by: 79
Authors
Liao, Jun
Warmuth, Manfred K.
Govindarajan, Sridhar
Ness, Jon E.
Wang, Rebecca P.
Gustafsson, Claes
Minshull, Jeremy
Affiliations
[1] DNA 2.0, Menlo Pk, CA 94025 USA
[2] Univ Calif Santa Cruz, Dept Comp Sci, Santa Cruz, CA 95064 USA
DOI
10.1186/1472-6750-7-16
Chinese Library Classification (CLC) number
Q81 [Bioengineering (Biotechnology)]; Q93 [Microbiology]
Discipline classification codes
071005; 0836; 090102; 100705
Abstract
Background: Altering a protein's function by changing its sequence allows natural proteins to be converted into useful molecular tools. Current protein engineering methods are limited by a lack of high-throughput physical or computational tests that can accurately predict protein activity under conditions relevant to its final application. Here we describe a new synthetic biology approach to protein engineering that avoids these limitations by combining high-throughput gene synthesis with machine learning-based design algorithms.
Results: We selected 24 amino acid substitutions to make in proteinase K from alignments of homologous sequences. We then designed and synthesized 59 specific proteinase K variants containing different combinations of the selected substitutions. The 59 variants were tested for their ability to hydrolyze a tetrapeptide substrate after the enzyme was first heated to 68 °C for 5 minutes. Sequence and activity data were analyzed using machine learning algorithms. This analysis was used to design a new set of variants predicted to have increased activity over the training set; these were then synthesized and tested. By performing two cycles of machine learning analysis and variant design, we obtained 20-fold improved proteinase K variants while testing a total of only 95 variant enzymes.
Conclusion: The number of protein variants that must be tested to obtain significant functional improvements determines the type of tests that can be performed. Protein engineers wishing to modify a property of a protein, for example to shrink tumours or to catalyze chemical reactions under industrial conditions, have until now been forced to accept high-throughput surrogate screens that measure protein properties they hope will correlate with the functionalities they intend to modify. By reducing the number of variants that must be tested to fewer than 100, machine learning algorithms make it possible to use more complex and expensive tests, so that only protein properties directly relevant to the desired application need to be measured. Protein design algorithms that require the testing of only a small number of variants represent a significant step towards a generic, resource-optimized protein engineering process.
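The design cycle summarized in the abstract (encode each variant by the substitutions it carries, fit a model to measured activity, then propose untested combinations predicted to be more active) can be illustrated with a minimal sketch. Everything below is an assumption made for illustration: the abstract does not name the learning algorithms used, so a ridge-regularized additive linear model stands in for them; the substitution labels S1 to S4 are hypothetical; and the activity values are synthetic toy numbers, not measurements from the paper.

```python
from itertools import combinations

SUBS = ["S1", "S2", "S3", "S4"]  # hypothetical substitution labels


def encode(variant):
    """Binary feature vector: 1.0 if the variant carries the substitution."""
    return [1.0 if s in variant else 0.0 for s in SUBS]


def fit_ridge(X, y, l2=0.001):
    """Solve (A^T A + l2*D) w = A^T y, where A is X with a leading bias column.

    D penalizes the substitution weights but not the bias term (index 0).
    """
    A = [[1.0] + list(row) for row in X]
    n = len(A[0])
    M = [[sum(r[i] * r[j] for r in A) + (l2 if i == j and i > 0 else 0.0)
          for j in range(n)] for i in range(n)]
    b = [sum(r[i] * t for r, t in zip(A, y)) for i in range(n)]
    # Gauss-Jordan elimination with partial pivoting.
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(M[r][c]))
        M[c], M[p] = M[p], M[c]
        b[c], b[p] = b[p], b[c]
        for r in range(n):
            if r != c:
                f = M[r][c] / M[c][c]
                M[r] = [a - f * q for a, q in zip(M[r], M[c])]
                b[r] -= f * b[c]
    return [b[i] / M[i][i] for i in range(n)]


def predict(w, variant):
    """Predicted activity: bias plus the weights of the substitutions present."""
    return w[0] + sum(wi * xi for wi, xi in zip(w[1:], encode(variant)))


def propose(w, tested, top=3):
    """Rank every untested combination by predicted activity.

    Exhaustive enumeration is only practical for a handful of substitutions;
    with 24 substitutions a search heuristic would be needed instead.
    """
    candidates = [frozenset(c) for k in range(len(SUBS) + 1)
                  for c in combinations(SUBS, k)]
    untested = [c for c in candidates if c not in tested]
    return sorted(untested, key=lambda c: predict(w, c), reverse=True)[:top]


# Synthetic toy "measurements" (NOT data from the paper).
training = {
    frozenset(): 2.0,
    frozenset({"S1"}): 4.0,
    frozenset({"S2"}): 1.0,
    frozenset({"S3"}): 2.5,
    frozenset({"S4"}): 1.5,
    frozenset({"S1", "S2"}): 3.0,
    frozenset({"S3", "S4"}): 2.0,
}

w = fit_ridge([encode(v) for v in training], list(training.values()))
best = propose(w, set(training))
print("next variants to synthesize:", [sorted(v) for v in best])
```

In a real campaign the proposed variants would be synthesized, assayed, added to `training`, and the fit repeated; the abstract reports two such cycles. An additive model ignores interactions between substitutions, which is a simplification of this sketch and not a claim about the published method.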
Pages: 19