PyEMMA 2: A Software Package for Estimation, Validation, and Analysis of Markov Models

Cited by: 816
Authors
Scherer, Martin K. [1]
Trendelkamp-Schroer, Benjamin [1]
Paul, Fabian [1]
Perez-Hernandez, Guillermo [1]
Hoffmann, Moritz [1]
Plattner, Nuria [1]
Wehmeyer, Christoph [1]
Prinz, Jan-Hendrik [1]
Noe, Frank [1]
Affiliation
[1] Freie Univ, Dept Math & Comp Sci, D-14195 Berlin, Germany
Keywords
MOLECULAR-DYNAMICS SIMULATIONS; FREE-ENERGY LANDSCAPE; STATE MODELS; CONFORMATIONAL DYNAMICS; VARIATIONAL APPROACH; HIGH-THROUGHPUT; BIOMOLECULAR DYNAMICS; CRYSTAL-STRUCTURE; LIGAND-BINDING; PROTEIN;
DOI
10.1021/acs.jctc.5b00743
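Chinese Library Classification (CLC) number
O64 [Physical chemistry (theoretical chemistry), chemical physics];
Discipline classification code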
070304 ; 081704 ;
Abstract
Markov (state) models (MSMs) and related models of molecular kinetics have recently received a surge of interest as they can systematically reconcile simulation data from either a few long or many short simulations and allow us to analyze the essential metastable structures, thermodynamics, and kinetics of the molecular system under investigation. However, the estimation, validation, and analysis of such models is far from trivial and involves sophisticated and often numerically sensitive methods. In this work we present the open-source Python package PyEMMA (http://pyemma.org) that provides accurate and efficient algorithms for kinetic model construction. PyEMMA can read all common molecular dynamics data formats, helps in the selection of input features, provides easy access to dimension reduction algorithms such as principal component analysis (PCA) and time-lagged independent component analysis (TICA) and clustering algorithms such as k-means, and contains estimators for MSMs, hidden Markov models, and several other models. Systematic model validation and error calculation methods are provided. PyEMMA offers a wealth of analysis functions such that the user can conveniently compute molecular observables of interest. We have derived a systematic and accurate way to coarse-grain MSMs to few states and to illustrate the structures of the metastable states of the system. Plotting functions to produce a manuscript-ready presentation of the results are available. In this work, we demonstrate the features of the software and show new methodological concepts and results produced by PyEMMA.
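As a rough illustration of the core estimation step the abstract refers to, the following sketch builds a Markov state model from a discrete trajectory using plain NumPy. It is not PyEMMA code and does not reproduce PyEMMA's estimators: the function names, the two-state toy system, and the simple row-normalized (non-reversible) maximum-likelihood estimate are all assumptions chosen for brevity.

```python
import numpy as np

def count_matrix(dtraj, n_states, lag):
    """Count transitions i -> j separated by `lag` steps (sliding window)."""
    counts = np.zeros((n_states, n_states))
    for i, j in zip(dtraj[:-lag], dtraj[lag:]):
        counts[i, j] += 1
    return counts

def transition_matrix(counts):
    """Row-normalize the counts (non-reversible maximum-likelihood estimate)."""
    row_sums = counts.sum(axis=1, keepdims=True)
    if np.any(row_sums == 0):
        raise ValueError("every state needs at least one outgoing transition")
    return counts / row_sums

def stationary_distribution(T):
    """Left eigenvector of T for the largest eigenvalue, normalized to sum to 1."""
    eigvals, eigvecs = np.linalg.eig(T.T)
    idx = np.argmax(eigvals.real)
    pi = np.abs(eigvecs[:, idx].real)
    return pi / pi.sum()

def implied_timescales(T, lag):
    """Relaxation timescales t_k = -lag / ln|lambda_k| for k >= 2."""
    eigvals = np.sort(np.abs(np.linalg.eigvals(T)))[::-1]
    return -lag / np.log(eigvals[1:])

# Hypothetical two-state toy system used only to generate test data.
T_true = np.array([[0.9, 0.1],
                   [0.2, 0.8]])
rng = np.random.default_rng(0)
n_steps = 50000
cumulative = np.cumsum(T_true, axis=1)
uniform = rng.random(n_steps)
dtraj = np.zeros(n_steps, dtype=int)
for t in range(1, n_steps):
    # Draw the next state from the row of T_true for the current state.
    nxt = np.searchsorted(cumulative[dtraj[t - 1]], uniform[t])
    dtraj[t] = min(nxt, T_true.shape[0] - 1)

lag = 1
T_est = transition_matrix(count_matrix(dtraj, n_states=2, lag=lag))
pi = stationary_distribution(T_est)
its = implied_timescales(T_est, lag)

print("estimated transition matrix:\n", T_est)
print("stationary distribution:", pi)
print("implied timescales (steps):", its)
```

In a real workflow the discrete trajectory would come from featurization, dimension reduction, and clustering of molecular dynamics data rather than from a synthetic chain, and a reversible estimator with validation and error estimates would usually be preferred.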
Pages: 5525-5542
Page count: 18