Abstract
In this work, we benchmark four open-source docking engines on the cytochrome P450 (CYP) protein family. These key enzymes of phase I metabolism accept a wide variety of substrates owing to their flexible active sites. We evaluate how well current rigid and induced-fit docking methods predict the correct heme-ligand orientation. To assess this, we use two distinct distances to the heme iron and the SuCOS score to quantify how well the ligand orientation and chemical features are reconstructed. We selected three rigid-protein docking engines (GNINA, AutoDock VINA, and GalaxyDock2 HEME) and one flexible docking model, RosettaFold-All-Atoms, and tested them on a dataset of 128 CYP-binding ligands.
For the key distance, measured to the atom closest to the heme iron in the experimental reference structure, we report a mean absolute error for RosettaFold-All-Atoms that is three times lower than that of AutoDock VINA in the same simulation. Our results indicate that the induced-fit method is a significant improvement over rigid methods for a flexible active site, but it still offers limited predictive power. In cross-docking, RosettaFold-All-Atoms reproduced over a quarter of the distances to within 20 percent of the experimental values. Further analysis, based on the SuCOS score, indicates low overlap in the distribution of ligand chemical features, which suggests room for further improvement.
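
As an illustration of the two distance-based summaries mentioned above (mean absolute error and the fraction of distances within 20 percent of experiment), a minimal sketch is given below. The function name `distance_metrics`, the `tol` parameter, and the sample values are hypothetical, and reading "within 20 percent" as a relative difference with respect to the experimental distance is an assumption rather than the definition used in the study.

```python
def distance_metrics(pred, ref, tol=0.20):
    """Summarize predicted vs. experimental iron-ligand distances.

    pred, ref: equal-length sequences of distances (same units).
    tol: relative tolerance (assumed: |pred - ref| <= tol * ref).
    Returns (mean absolute error, fraction of predictions within tol).
    """
    if len(pred) != len(ref) or len(ref) == 0:
        raise ValueError("pred and ref must be non-empty and of equal length")
    # Absolute error of each predicted distance
    abs_err = [abs(p - r) for p, r in zip(pred, ref)]
    mae = sum(abs_err) / len(abs_err)
    # Count predictions whose error is within tol of the reference distance
    hits = sum(1 for e, r in zip(abs_err, ref) if e <= tol * r)
    return mae, hits / len(ref)


# Hypothetical distances, for illustration only (not data from the study)
ref_dist = [4.0, 5.0, 4.5, 6.0]
pred_dist = [4.4, 8.0, 4.6, 9.0]
mae, frac_within = distance_metrics(pred_dist, ref_dist)
```

With these made-up values, two of the four predictions fall inside the 20 percent band, so `frac_within` is 0.5.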