Assessment and Validation of Machine Learning Methods for Predicting Molecular Atomization Energies

Cited by: 485
Authors
Hansen, Katja [1]
Montavon, Gregoire [2]
Biegler, Franziska [2]
Fazli, Siamac [2]
Rupp, Matthias [3]
Scheffler, Matthias [1]
von Lilienfeld, O. Anatole [4]
Tkatchenko, Alexandre [1]
Mueller, Klaus-Robert [2, 5]
Affiliations
[1] Max Planck Gesell, Fritz Haber Inst, Berlin, Germany
[2] TU Berlin, Machine Learning Grp, Berlin, Germany
[3] Swiss Fed Inst Technol, Inst Pharmaceut Sci, Zurich, Switzerland
[4] Argonne Natl Lab, Argonne Leadership Comp Facil, Lemont, IL USA
[5] Korea Univ, Dept Brain & Cognit Engn, Seoul, South Korea
Funding
Natural Sciences and Engineering Research Council of Canada; European Research Council; National Research Foundation of Singapore;
Keywords
MIXED-EFFECTS MODELS; DEEP; SURFACES; REGRESSION; SELECTION; BIAS;
DOI
10.1021/ct400195d
Chinese Library Classification number
O64 [Physical chemistry (theoretical chemistry), chemical physics];
Subject classification code
070304 ; 081704 ;
Abstract
The accurate and reliable prediction of properties of molecules typically requires computationally intensive quantum-chemical calculations. Recently, machine learning techniques applied to ab initio calculations have been proposed as an efficient approach for describing the energies of molecules in their given ground-state structure throughout chemical compound space (Rupp et al. Phys. Rev. Lett. 2012, 108, 058301). In this paper we outline a number of established machine learning techniques and investigate the influence of the molecular representation on the methods' performance. The best methods achieve prediction errors of 3 kcal/mol for the atomization energies of a wide variety of molecules. Rationales for this performance improvement are given together with pitfalls and challenges when applying machine learning approaches to the prediction of quantum-mechanical observables.
Pages: 3404-3419
Page count: 16