Improved protein structure prediction using potentials from deep learning

Times cited: 1969
Authors
Senior, Andrew W. [1 ]
Evans, Richard [1 ]
Jumper, John [1 ]
Kirkpatrick, James [1 ]
Sifre, Laurent [1 ]
Green, Tim [1 ]
Qin, Chongli [1 ]
Zidek, Augustin [1 ]
Nelson, Alexander W. R. [1 ]
Bridgland, Alex [1 ]
Penedones, Hugo [1 ]
Petersen, Stig [1 ]
Simonyan, Karen [1 ]
Crossan, Steve [1 ]
Kohli, Pushmeet [1 ]
Jones, David T. [2 ,3 ]
Silver, David [1 ]
Kavukcuoglu, Koray [1 ]
Hassabis, Demis [1 ]
Affiliations
[1] DeepMind, London, England
[2] Francis Crick Inst, London, England
[3] UCL, London, England
Keywords
NEURAL-NETWORKS; CONTACTS; COEVOLUTION; SEQUENCES;
DOI
10.1038/s41586-019-1923-7
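Chinese Library Classification (CLC) number
O [Mathematical sciences and chemistry]; P [Astronomy and earth sciences]; Q [Biological sciences]; N [General natural sciences];
Discipline classification code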
07 ; 0710 ; 09 ;
Abstract
Protein structure prediction can be used to determine the three-dimensional shape of a protein from its amino acid sequence(1). This problem is of fundamental importance as the structure of a protein largely determines its function(2); however, protein structures can be difficult to determine experimentally. Considerable progress has recently been made by leveraging genetic information. It is possible to infer which amino acid residues are in contact by analysing covariation in homologous sequences, which aids in the prediction of protein structures(3). Here we show that we can train a neural network to make accurate predictions of the distances between pairs of residues, which convey more information about the structure than contact predictions. Using this information, we construct a potential of mean force(4) that can accurately describe the shape of a protein. We find that the resulting potential can be optimized by a simple gradient descent algorithm to generate structures without complex sampling procedures. The resulting system, named AlphaFold, achieves high accuracy, even for sequences with fewer homologous sequences. In the recent Critical Assessment of Protein Structure Prediction(5) (CASP13), a blind assessment of the state of the field, AlphaFold created high-accuracy structures (with template modelling (TM) scores(6) of 0.7 or higher) for 24 out of 43 free modelling domains, whereas the next best method, which used sampling and contact information, achieved such accuracy for only 14 out of 43 domains. AlphaFold represents a considerable advance in protein-structure prediction. We expect this increased accuracy to enable insights into the function and malfunction of proteins, especially in cases for which no structures for homologous proteins have been experimentally determined(7).
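The idea in the abstract of turning predicted inter-residue distances into a potential and minimizing it by gradient descent can be illustrated with a toy sketch. This is not the AlphaFold implementation: the abstract does not specify the form of the potential, so the sketch substitutes a simple quadratic penalty on the mismatch between current and target distances, and it moves Cartesian coordinates directly. The function names `pairwise_distances`, `potential_and_grad` and `descend`, the step size and the step count are all hypothetical choices made for illustration.

```python
import numpy as np


def pairwise_distances(coords):
    """Return the (N, N) distance matrix and the (N, N, 3) difference tensor."""
    diff = coords[:, None, :] - coords[None, :, :]
    # The small constant keeps the square root differentiable on the diagonal.
    dist = np.sqrt((diff ** 2).sum(axis=-1) + 1e-12)
    return dist, diff


def potential_and_grad(coords, target):
    """Toy potential: sum over pairs i < j of (d_ij - target_ij)^2.

    Assumption: a quadratic restraint stands in for a learned potential.
    """
    dist, diff = pairwise_distances(coords)
    err = dist - target
    np.fill_diagonal(err, 0.0)
    # Every unordered pair appears twice in the full matrix, hence the 0.5.
    potential = 0.5 * (err ** 2).sum()
    # dV/dx_i = sum_j 2 * err_ij * (x_i - x_j) / d_ij
    grad = 2.0 * ((err / dist)[:, :, None] * diff).sum(axis=1)
    return potential, grad


def descend(coords, target, steps=500, lr=0.01):
    """Plain fixed-step gradient descent on the toy potential."""
    coords = coords.copy()
    for _ in range(steps):
        _, grad = potential_and_grad(coords, target)
        coords -= lr * grad
    return coords


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    reference = rng.normal(size=(6, 3)) * 3.0      # stand-in "true" structure
    target, _ = pairwise_distances(reference)      # stand-in for predicted distances
    start = reference + 0.1 * rng.normal(size=reference.shape)
    before, _ = potential_and_grad(start, target)
    after, _ = potential_and_grad(descend(start, target), target)
    print(f"potential before: {before:.4f}, after: {after:.4f}")
```

Because the toy potential depends only on distances, any structure it recovers is defined only up to rotation, translation and mirror image.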
Pages: 706 / +
Page count: 22
Related papers
55 in total
  • [1] Abadi M, 2016, PROCEEDINGS OF OSDI'16: 12TH USENIX SYMPOSIUM ON OPERATING SYSTEMS DESIGN AND IMPLEMENTATION, P265
  • [2] A further leap of improvement in tertiary structure prediction in CASP13 prompts new routes for future assessments
    Abriata, Luciano A.
    Tamo, Giorgio E.
    Dal Peraro, Matteo
    [J]. PROTEINS-STRUCTURE FUNCTION AND BIOINFORMATICS, 2019, 87 (12) : 1100 - 1112
  • [3] CORRELATION OF COORDINATED AMINO-ACID SUBSTITUTIONS WITH FUNCTION IN VIRUSES RELATED TO TOBACCO MOSAIC-VIRUS
    ALTSCHUH, D
    LESK, AM
    BLOOMER, AC
    KLUG, A
    [J]. JOURNAL OF MOLECULAR BIOLOGY, 1987, 193 (04) : 693 - 707
  • [4] Gapped BLAST and PSI-BLAST: a new generation of protein database search programs
    Altschul, SF
    Madden, TL
    Schaffer, AA
    Zhang, JH
    Zhang, Z
    Miller, W
    Lipman, DJ
    [J]. NUCLEIC ACIDS RESEARCH, 1997, 25 (17) : 3389 - 3402
  • [5] ESTIMATING POLYPEPTIDE ALPHA-CARBON DISTANCES FROM MULTIPLE SEQUENCE ALIGNMENTS
    ASZODI, A
    TAYLOR, WR
    [J]. JOURNAL OF MATHEMATICAL CHEMISTRY, 1995, 17 (2-3) : 167 - 184
  • [6] GLOBAL FOLD DETERMINATION FROM A SMALL NUMBER OF DISTANCE RESTRAINTS
    ASZODI, A
    GRADWELL, MJ
    TAYLOR, WR
    [J]. JOURNAL OF MOLECULAR BIOLOGY, 1995, 251 (02) : 308 - 326
  • [7] Crystal structure of misoprostol bound to the labor inducer prostaglandin E2 receptor
    Audet, Martin
    White, Kate L.
    Breton, Billy
    Zarzycka, Barbara
    Han, Gye Won
    Lu, Yan
    Gati, Cornelius
    Batyuk, Alexander
    Popov, Petr
    Velasquez, Jeffrey
    Manahan, David
    Hu, Hao
    Weierstall, Uwe
    Liu, Wei
    Shui, Wenqing
    Katritch, Vsevolod
    Cherezov, Vadim
    Hanson, Michael A.
    Stevens, Raymond C.
    [J]. NATURE CHEMICAL BIOLOGY, 2019, 15 (01) : 11 - +
  • [8] The Protein Data Bank
    Berman, HM
    Westbrook, J
    Feng, Z
    Gilliland, G
    Bhat, TN
    Weissig, H
    Shindyalov, IN
    Bourne, PE
    [J]. NUCLEIC ACIDS RESEARCH, 2000, 28 (01) : 235 - 242
  • [9] Clevert Djork-Arné, 2015, ARXIV
  • [10] An automatic method for CASP9 free modeling structure prediction assessment
    Cong, Qian
    Kinch, Lisa N.
    Pei, Jimin
    Shi, Shuoyong
    Grishin, Vyacheslav N.
    Li, Wenlin
    Grishin, Nick V.
    [J]. BIOINFORMATICS, 2011, 27 (24) : 3371 - 3378