Abstract
The weighted ensemble (WE) algorithm is gaining popularity as a rare-event method for studying long-timescale processes with molecular dynamics (MD). WE is particularly useful for determining kinetic properties, such as rates of protein (un)folding and ligand (un)binding, where transition rates can be calculated from the flux of trajectories into a target basin of interest. However, this flux depends exponentially on the number of splitting events that a given trajectory experiences before reaching the target state, and it can vary by orders of magnitude between WE replicates. Markov state models (MSMs) are helpful tools for aggregating information across multiple WE simulations and have previously been shown to provide more accurate transition rates than WE alone. Discrete-time MSMs coarsely describe the evolution of the system from one discrete state to the next using a discrete lagtime, τ. When building an MSM from conventional MD data, longer values of τ typically provide more accurate results. Combining WE simulations with Markov state modeling presents additional challenges, especially when using a value of τ that exceeds the lagtime between resampling steps in the WE algorithm, τ_WE. Here we identify a source of bias that occurs when τ > τ_WE, which we refer to as "merging bias". We also propose an algorithm to eliminate the merging bias, which results in merge-bias-corrected MSMs, or "MBC-MSMs". Using a simple model system as well as a complex biomolecular example, we show that MBC-MSMs significantly outperform both τ = τ_WE MSMs and uncorrected MSMs at longer lagtimes.
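For readers unfamiliar with discrete-time MSMs, the role of the lagtime τ can be illustrated with a minimal sketch. It assumes a single unweighted discrete trajectory and a lagtime given in frames, and it uses plain sliding-window counting with row normalization. The function names and the toy trajectory are hypothetical; this is not the merge-bias correction proposed here, nor the code in the MBC_MSM repository.

```python
import numpy as np

def count_matrix(dtraj, n_states, lag):
    """Count transitions i -> j separated by `lag` frames (sliding window)."""
    if lag < 1:
        raise ValueError("lag must be a positive number of frames")
    counts = np.zeros((n_states, n_states))
    # Pair every frame t with frame t + lag.
    for i, j in zip(dtraj[:-lag], dtraj[lag:]):
        counts[i, j] += 1.0
    return counts

def transition_matrix(counts):
    """Row-normalize counts; rows with no observed transitions stay zero."""
    row_sums = counts.sum(axis=1, keepdims=True)
    return np.divide(counts, row_sums,
                     out=np.zeros_like(counts), where=row_sums > 0)

# Toy discrete trajectory over three states (illustrative data only).
dtraj = [0, 0, 1, 1, 0, 1, 2, 2, 1, 0]

T_lag1 = transition_matrix(count_matrix(dtraj, n_states=3, lag=1))
T_lag2 = transition_matrix(count_matrix(dtraj, n_states=3, lag=2))
```

With WE data, each trajectory segment also carries a statistical weight, and segments can be merged between resampling steps, so counting transitions across τ > τ_WE requires more care than this unweighted sketch suggests.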
Supplementary materials
Supplementary Information: Contains extended methods for the setup of the sEH ligand unbinding system, as well as two supplemental tables and three supplemental figures.
Supplementary weblinks
GitHub repository for MBC_MSM code: Contains scripts and libraries for building merge-bias-corrected Markov state models.