Abstract
Molecular docking, a structure-based virtual screening method, is a reliable tool for enriching bioactive molecules from molecular databases. As virtual libraries grow, existing molecular docking programs are no longer fast enough to screen ultra-large libraries containing tens of millions to billions of molecules. Here we propose Uni-Dock, a GPU-accelerated molecular docking program that supports several scoring functions, including vina, vinardo, and ad4. Uni-Dock achieves a more than 1000-fold speed-up over AutoDock Vina running on a single CPU core while retaining high accuracy, and it outperforms reported GPU-accelerated docking programs including AutoDock-GPU and Vina-GPU. Uni-Dock divides molecules into batches and docks each batch of molecules simultaneously, using hundreds of concurrent threads per molecule. The data flow between GPU and CPU is optimized to eliminate CPU hotspots and maximize GPU utilization. We demonstrate and analyze the improved performance of Uni-Dock on the CASF-2016 and DUD-E datasets and recommend three combinations of hyperparameters for different docking scenarios. To demonstrate that Uni-Dock can routinely screen ultra-large libraries, we performed hierarchical virtual screening of the Enamine Diverse REAL drug-like set, which contains 38.2 million molecules, against a popular target, KRAS G12D, in 12 hours using 100 NVIDIA V100 GPUs.
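
For a rough sense of scale, the screening figures above imply an average library-level throughput per GPU. The back-of-the-envelope calculation below uses only the numbers stated in the abstract (38.2 million molecules, 12 hours, 100 GPUs). The function name is illustrative and not part of Uni-Dock, and because the screen is hierarchical, the result is an average over the whole campaign rather than the docking speed of any single stage.

```python
def average_throughput(n_molecules: float, hours: float, n_gpus: int) -> float:
    """Average number of molecules processed per GPU per second.

    Illustrative helper (an assumption, not a Uni-Dock function): it divides
    the library size by the total GPU time spent on the whole campaign.
    """
    gpu_seconds = hours * 3600 * n_gpus  # total GPU time in seconds
    return n_molecules / gpu_seconds


# Figures taken from the abstract.
gpu_hours = 12 * 100                      # total GPU-hours for the campaign
rate = average_throughput(38.2e6, 12, 100)

print(f"{gpu_hours} GPU-hours in total")
print(f"{rate:.2f} molecules per GPU per second on average")
```

Under these assumptions the campaign corresponds to 1200 GPU-hours, or on the order of nine molecules per GPU per second averaged over all stages.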