Abstract
The size and complexity of large molecular systems, such as protein-ligand complexes, make accurate post-Hartree-Fock calculations computationally challenging. This study benchmarks the Molecules-in-Molecules (MIM) method and presents a clear, accessible strategy for selecting layers and levels of theory in post-Hartree-Fock computations on large molecular systems, notably protein-ligand complexes. An approach is described that improves computational efficiency by canceling the subsystem energy terms common to the complex and the protein within the supermolecular equation. Using DLPNO-based post-Hartree-Fock methods together with the three-layer MIM method (MIM3), the study obtains protein-ligand binding energies with errors below 1 kcal mol⁻¹ while significantly reducing computational cost. Furthermore, the computed interaction energies correlate well with their experimental counterparts, with R² values of approximately 0.90 and 0.78 for the CDK2 and BZT-ITK sets, respectively, supporting the use of the MIM method for calculating binding energies. By highlighting the crucial role of diffuse or small Pople-style basis sets in the middle layer for reducing energy errors, this work provides practical guidance for interaction energy computations in large molecular complexes and opens avenues for application to a diverse range of molecular systems.
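The cancellation of common subsystem terms can be illustrated with a minimal sketch. It assumes that each fragment-based total energy is a signed sum of subsystem energies, and that the interaction energy follows the supermolecular form E(complex) - E(protein) - E(ligand). The function name, the fragment labels, and the energy values below are hypothetical and chosen only for illustration; they are not taken from the study, and the sketch is not the authors' implementation.

```python
def interaction_energy(complex_terms, protein_terms, ligand_terms, energies):
    """Supermolecular interaction energy from signed fragment terms.

    Each *_terms argument maps a hypothetical fragment label to its signed
    coefficient in the fragment-based energy expression. Fragments whose
    net coefficient is zero cancel and never need to be computed.
    """
    net = {}
    # Complex terms enter with +1, protein and ligand terms with -1.
    for sign, terms in ((1, complex_terms), (-1, protein_terms), (-1, ligand_terms)):
        for frag, coeff in terms.items():
            net[frag] = net.get(frag, 0) + sign * coeff

    # Keep only the fragments that survive the cancellation.
    needed = {frag: c for frag, c in net.items() if c != 0}
    total = sum(c * energies[frag] for frag, c in needed.items())
    return total, needed


# Illustrative data (assumed): "P2" and "cap" lie far from the ligand and
# appear identically in the complex and in the protein, so they cancel.
complex_terms = {"P1L": 1, "P2": 1, "cap": -1}
protein_terms = {"P1": 1, "P2": 1, "cap": -1}
ligand_terms = {"L": 1}

# Only the surviving fragments need energies (arbitrary units).
energies = {"P1L": -150.75, "P1": -100.5, "L": -50.0}

delta_e, needed = interaction_energy(complex_terms, protein_terms, ligand_terms, energies)
print(sorted(needed))
print(delta_e)
```

In this toy setup only three of the five distinct subsystems require an energy evaluation, which is the source of the cost saving the abstract refers to.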
Supplementary materials
Title
Manuscript Supporting Information
Description
Bar plots depicting MIM errors with different levels of theory.