Abstract
Accurate and efficient simulation of the thermodynamics and kinetics of protein-ligand interactions is crucial for computational drug discovery. Multiensemble Markov Model (MEMM) estimators can provide estimates of both binding rates and affinities from collections of short trajectories, but have not been systematically explored for situations when a ligand is decoupled through scaling of non-bonded interactions. In this work, we compare the performance of two MEMM approaches for estimating ligand binding affinities and rates: (1) the transition-based reweighting analysis method (TRAM) and (2) a Maximum Caliber (MaxCal) based method. As a test system, we construct a small host-guest system where the ligand is a single uncharged Lennard-Jones (LJ) particle, and the receptor is an 11-particle icosahedral pocket made from the same atom type. To realistically mimic a protein-ligand binding system, the LJ ε parameter was tuned, and the system placed in a periodic box with 860 TIP3P water molecules. A benchmark was performed using over 80 μs of unbiased simulation, and an 18-state Markov state model used to estimate reference binding affinities and rates. We then tested the performance of TRAM and MaxCal when challenged with limited data. Both TRAM and MaxCal approaches perform better than conventional MSMs, with TRAM showing better convergence and accuracy. We find that subsampling of trajectories to remove time correlation improves the accuracy of both TRAM and MaxCal, and that in most cases only a single biased ensemble to enhance sampled transitions is required to make accurate estimates.