Abstract
Host-guest binding, despite the relatively simple structural and chemical features of the individual components, still poses a challenge in computational modelling. The problems lie in both the accuracy of the employed Hamiltonian (often a fixed-charge force field) and the exhaustiveness of conformational sampling. End-point free energy calculations, as fast alternatives to rigorous but costly methods, are widely applied in virtual screening of protein-ligand and host-guest systems. However, the extreme underperformance of standard end-point methods makes them practically useless. Modifications of the end-point procedure, e.g., the regression considered in our previous work, could bring these methods back into the pool of usable tools. In the current work, we explore a potentially promising modification, the three-trajectory realization of the end-point simulation protocol. This alteration incorporates the binding-induced structural reorganization into the free energy estimate, but it suffers from dramatic fluctuations of internal energies in protein-ligand systems. Fortunately, the relatively small size of host-guest systems limits the magnitude of these internal fluctuations and makes the three-trajectory realization practically suitable. Because intra-molecular interactions enter the free energy estimate, the results could depend strongly on the force-field parameters. Thus, a term-specific investigation of transferable GAFF derivatives is presented, and noticeable differences in many aspects are identified between the commonly applied GAFF and GAFF2. These force-field differences lead to different dynamic behaviors of the macrocyclic host, which would ultimately influence the end-point sampling and binding thermodynamics. Therefore, the three-trajectory end-point free energy calculations are performed with both GAFF versions to investigate the force-field dependence of the computed binding affinities.
Also, due to the noticeable differences between host dynamics under GAFF and GAFF2, we add benchmarks of the single-trajectory end-point calculations. Numerical results suggest that the single-trajectory realization, regardless of the GAFF version, is still not useful in host-guest binding, although the prediction quality of the GAFF2 parameter set is slightly better than that of GAFF. As for the three-trajectory realization, the absolute values of the computed binding thermodynamics exhibit a pronounced force-field dependence, which is less significant for the ranking of binding affinities. When only the ranks of binding affinities are pursued, the three-trajectory realization performs very well, comparable to and even better than the regressed PBSA_E scoring function and the dielectric-constant-variable regime. With the GAFF parameter set, TIP3P water in explicit-solvent sampling, and either the PB or GB implicit-solvent model in free energy estimation, the predictive power of the three-trajectory realization in ranking calculations surpasses all existing end-point methods on this dataset. We further combine the three-trajectory realization with another promising modified end-point regime, in which the interior dielectric constant is varied. The predicted binding affinities respond monotonically to the variation of the interior dielectric constant, but the deviations from experiment vary non-monotonically, which is related to the systematic overestimation of the binding strength under the original three-trajectory realization. By contrast, the combined regime does not yield sizable improvements in ranking, although for most systems a dielectric constant of 2 seems to be the best option.
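The distinction between the single- and three-trajectory realizations discussed above can be illustrated with a minimal sketch. It assumes that per-frame end-point free energies (for example, gas-phase energy plus implicit-solvent solvation free energy) are already available as plain lists of numbers in kcal/mol. The function names and all numerical values are hypothetical and are not taken from this work; entropy corrections are omitted.

```python
from statistics import mean


def single_trajectory_dg(g_complex, g_host_bound, g_guest_bound):
    """Single-trajectory estimate: host and guest energies are evaluated on
    coordinates extracted from the same complex frames, so intra-molecular
    terms cancel frame by frame."""
    per_frame = [c - h - g for c, h, g in zip(g_complex, g_host_bound, g_guest_bound)]
    return mean(per_frame)


def three_trajectory_dg(g_complex, g_host_free, g_guest_free):
    """Three-trajectory estimate: complex, free host, and free guest are each
    sampled in an independent simulation and averaged separately, so the
    binding-induced reorganization of host and guest enters the result."""
    return mean(g_complex) - mean(g_host_free) - mean(g_guest_free)


# Synthetic per-frame energies (kcal/mol), invented purely for illustration.
g_complex = [-310.0, -312.0, -308.0]
g_host_bound = [-250.0, -251.0, -249.0]   # host extracted from complex frames
g_guest_bound = [-40.0, -41.0, -39.0]     # guest extracted from complex frames
g_host_free = [-252.0, -254.0, -256.0]    # host from its own simulation
g_guest_free = [-41.0, -42.0, -43.0]      # guest from its own simulation

dg_single = single_trajectory_dg(g_complex, g_host_bound, g_guest_bound)
dg_three = three_trajectory_dg(g_complex, g_host_free, g_guest_free)

# The difference between the two estimates is the reorganization contribution
# that only the three-trajectory realization captures.
reorganization = dg_three - dg_single
print(dg_single, dg_three, reorganization)
```

In this sketch the two estimates share the complex trajectory and differ only in where the host and guest averages come from, which is why the three-trajectory result is sensitive to intra-molecular force-field terms and to fluctuations of internal energies.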