Abstract
In drug discovery, in-silico prediction of binding affinity is one of the main ways to prioritize compounds for synthesis. Alchemical relative binding free energy (RBFE) calculations based on molecular dynamics (MD) simulations are now a popular approach for accurate affinity ranking of compounds. MD simulations rely on empirical force field parameters, which strongly influence the accuracy of the predicted affinities. Here, we evaluate the ability of six different small-molecule force fields to predict experimental protein-ligand binding affinities in RBFE calculations on a set of 598 ligands and 22 protein targets. The public force fields OpenFF Parsley and Sage, GAFF and CGenFF show comparable accuracy, while OPLS3e is significantly more accurate. However, a Consensus approach using Sage, GAFF and CGenFF leads to accuracies comparable to OPLS3e. While Parsley and Sage perform comparably based on aggregated statistics across the whole dataset, they differ in terms of outliers. Analysis of the force fields reveals that improved parameters lead to a significant improvement in the accuracy of affinity predictions on subsets of the dataset involving those parameters. Lower accuracy cannot be attributed solely to the force field parameters; it also depends on input preparation and on the sampling convergence of the calculations. In particular, large perturbations and non-converged simulations lead to less accurate predictions. The input structures, GROMACS force field files and the Python analysis notebooks are available on GitHub.
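As an illustration of the Consensus idea, the following minimal sketch averages the relative binding free energies predicted by several force fields for each ligand pair and compares the result with experiment. It assumes that the consensus is a simple arithmetic mean, which the abstract does not specify. The function names, the edge labels and all numerical values are hypothetical placeholders and are not taken from the study.

```python
import math

def consensus(predictions):
    """Average the predicted ddG (kcal/mol) per ligand pair across force fields.

    `predictions` maps a force field name to a dict {ligand_pair: ddG}.
    Only ligand pairs present for every force field are averaged.
    Assumption: the consensus is an unweighted arithmetic mean.
    """
    per_ff = list(predictions.values())
    shared = set.intersection(*(set(p) for p in per_ff))
    return {pair: sum(p[pair] for p in per_ff) / len(per_ff)
            for pair in sorted(shared)}

def rmse(pred, exp):
    """Root-mean-square error over the ligand pairs in `pred`."""
    return math.sqrt(sum((pred[k] - exp[k]) ** 2 for k in pred) / len(pred))

def mae(pred, exp):
    """Mean absolute error over the ligand pairs in `pred`."""
    return sum(abs(pred[k] - exp[k]) for k in pred) / len(pred)

# Synthetic placeholder values (kcal/mol), NOT results from the paper.
predictions = {
    "Sage":   {"A->B": -1.0, "B->C": 0.5},
    "GAFF":   {"A->B": -0.4, "B->C": 1.1},
    "CGenFF": {"A->B": -1.6, "B->C": 0.2},
}
experiment = {"A->B": -1.2, "B->C": 0.6}

cons = consensus(predictions)
print("consensus ddG:", cons)
print("RMSE:", rmse(cons, experiment), "MAE:", mae(cons, experiment))
```

The same error metrics can be applied to each individual force field to compare it against the consensus on an equal footing.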
Supplementary materials
Title
Supplementary Information: Current state of open source force fields in protein-ligand binding affinity predictions
Description
The Supplementary Information lists details about the employed target set, shows additional graphs and tables containing aggregated statistics and correlations with experiment in greater detail, and shows various properties of the simulated perturbations.
Supplementary weblinks
Title
GitHub repository
Description
Analyses of protein-ligand free energy benchmark calculations. It includes analysis code, information about the data set and additional figures.