Abstract
Structure-based methods have become an integral part of modern drug discovery. The power of virtual screening (VS) lies in its ability to explore enormous chemical spaces rapidly and cost-effectively, selecting promising ligands for further experimental investigation. Relative Free Energy Perturbation (RFEP) and similar methods are the gold standard for binding affinity prediction in the hit-to-lead and lead optimization phases, but they are computationally expensive and require a structural analog with known activity. Absolute FEP (AFEP) does not require a reference molecule and should, in theory, offer better accuracy for hit identification, but in practice its low throughput is incompatible with VS, where fast docking with unreliable scoring functions remains the standard. Here, we present an integrated workflow for efficiently screening large and diverse chemical libraries, combining active learning with a physics-based scoring function built on a fast absolute free energy perturbation method. We validated the performance of the approach in three settings: ranking of structurally related ligands, hit rate enrichment in virtual screening, and chemical space exploration by active learning. In doing so, we disclose the largest reported collection of free energy simulations to date.