Machine-learning-guided directed evolution for protein engineering

Cited by: 628
Authors
Yang, Kevin K. [1]
Wu, Zachary [1]
Arnold, Frances H. [1]
Affiliations
[1] CALTECH, Div Chem & Chem Engn, Pasadena, CA 91125 USA
Funding
National Science Foundation (USA);
Keywords
STABILITY CHANGES; SEQUENCE; MUTATIONS; PREDICTION; KERNEL; MODEL;
DOI
10.1038/s41592-019-0496-6
Chinese Library Classification (CLC) number
Q5 [Biochemistry];
Discipline classification codes
071010 ; 081704 ;
Abstract
Protein engineering through machine-learning-guided directed evolution enables the optimization of protein functions. Machine-learning approaches predict how sequence maps to function in a data-driven manner without requiring a detailed model of the underlying physics or biological pathways. Such methods accelerate directed evolution by learning from the properties of characterized variants and using that information to select sequences that are likely to exhibit improved properties. Here we introduce the steps required to build machine-learning sequence-function models and to use those models to guide engineering, making recommendations at each stage. This review covers basic concepts relevant to the use of machine learning for protein engineering, as well as the current literature and applications of this engineering paradigm. We illustrate the process with two case studies. Finally, we look to future opportunities for machine learning to enable the discovery of unknown protein functions and uncover the relationship between protein sequence and function.
Pages: 687-694
Page count: 8