Scientific discovery in the age of artificial intelligence

Cited by: 627
Authors
Wang, Hanchen [1, 2, 6, 37]
Fu, Tianfan [3]
Du, Yuanqi [4]
Gao, Wenhao [5]
Huang, Kexin [6]
Liu, Ziming [7]
Chandak, Payal [8]
Liu, Shengchao [9, 10]
Van Katwyk, Peter [11, 12]
Deac, Andreea [9, 10]
Anandkumar, Anima [2, 13]
Bergen, Karianne [11, 12]
Gomes, Carla P. [4]
Ho, Shirley [14, 15, 16, 17]
Kohli, Pushmeet [18]
Lasenby, Joan [1]
Leskovec, Jure [6]
Liu, Tie-Yan [19]
Manrai, Arjun [20]
Marks, Debora [21, 22]
Ramsundar, Bharath [23]
Song, Le [24, 25]
Sun, Jimeng [26]
Tang, Jian [9, 27, 28]
Velickovic, Petar [17, 29]
Welling, Max [30, 31]
Zhang, Linfeng [32, 33]
Coley, Connor W. [5, 34]
Bengio, Yoshua [9, 10]
Zitnik, Marinka [20, 22, 35, 36]
Affiliations
[1] Univ Cambridge, Dept Engn, Cambridge, England
[2] CALTECH, Dept Comp & Math Sci, Pasadena, CA USA
[3] Georgia Inst Technol, Dept Comp Sci & Engn, Atlanta, GA USA
[4] Cornell Univ, Dept Comp Sci, Ithaca, NY USA
[5] MIT, Dept Chem Engn, Cambridge, MA USA
[6] Stanford Univ, Dept Comp Sci, Stanford, CA USA
[7] MIT, Dept Phys, Cambridge, MA USA
[8] Harvard MIT Program Hlth Sci & Technol, Cambridge, MA USA
[9] Mila Quebec AI Inst, Montreal, PQ, Canada
[10] Univ Montreal, Montreal, PQ, Canada
[11] Brown Univ, Dept Earth Environm & Planetary Sci, Providence, RI USA
[12] Brown Univ, Data Sci Inst, Providence, RI USA
[13] NVIDIA, Santa Clara, CA USA
[14] Flatiron Inst, Ctr Comp Astrophys, New York, NY USA
[15] Princeton Univ, Dept Astrophys Sci, Princeton, NJ USA
[16] Carnegie Mellon Univ, Dept Phys, Pittsburgh, PA USA
[17] New York Univ, Ctr Data Sci, Dept Phys, New York, NY USA
[18] Google DeepMind, London, England
[19] Microsoft Res, Beijing, Peoples R China
[20] Harvard Med Sch, Dept Biomed Informat, Boston, MA 02115 USA
[21] Harvard Med Sch, Dept Syst Biol, Boston, MA USA
[22] Broad Inst MIT & Harvard, Cambridge, MA 02142 USA
[23] Deep Forest Sci, Palo Alto, CA USA
[24] BioMap, Beijing, Peoples R China
[25] Mohamed bin Zayed Univ Artificial Intelligence, Abu Dhabi, U Arab Emirates
[26] Univ Illinois, Champaign, IL USA
[27] HEC Montreal, Montreal, PQ, Canada
[28] CIFAR AI Chair, Toronto, ON, Canada
[29] Univ Cambridge, Dept Comp Sci & Technol, Cambridge, England
[30] Univ Amsterdam, Amsterdam, Netherlands
[31] Microsoft Res Amsterdam, Amsterdam, Netherlands
[32] DP Technol, Beijing, Peoples R China
[33] AI Sci Inst, Beijing, Peoples R China
[34] MIT, Dept Elect Engn & Comp Sci, Cambridge, MA USA
[35] Harvard Data Sci Initiat, Cambridge, MA 02138 USA
[36] Harvard Univ, Kempner Inst Study Nat & Artificial Intelligence, Cambridge, MA 02138 USA
[37] Genentech Inc, Dept Res & Early Dev, South San Francisco, CA USA
Funding
National Institutes of Health (USA)
Keywords
NEURAL-NETWORK; PROTEIN DESIGN; DEEP; REPRESENTATION; LANGUAGE; MODELS; IDENTIFICATION; INFORMATION; FRAMEWORK; SYSTEM
DOI
10.1038/s41586-023-06221-2
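Chinese Library Classification (CLC) codes
O [Mathematical sciences and chemistry]; P [Astronomy and earth sciences]; Q [Biological sciences]; N [General natural sciences]
Discipline classification codes
07; 0710; 09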
Abstract
Artificial intelligence (AI) is being increasingly integrated into scientific discovery to augment and accelerate research, helping scientists to generate hypotheses, design experiments, collect and interpret large datasets, and gain insights that might not have been possible using traditional scientific methods alone. Here we examine breakthroughs over the past decade that include self-supervised learning, which allows models to be trained on vast amounts of unlabelled data, and geometric deep learning, which leverages knowledge about the structure of scientific data to enhance model accuracy and efficiency. Generative AI methods can create designs, such as small-molecule drugs and proteins, by analysing diverse data modalities, including images and sequences. We discuss how these methods can help scientists throughout the scientific process and the central issues that remain despite such advances. Both developers and users of AI tools need a better understanding of when such approaches need improvement, and challenges posed by poor data quality and stewardship remain. These issues cut across scientific disciplines and require developing foundational algorithmic approaches that can contribute to scientific understanding or acquire it autonomously, making them critical areas of focus for AI innovation.
Pages: 47-60
Page count: 14