Abstract
We report a high-performance multi-GPU (graphics processing unit) implementation of Kohn-Sham time-dependent density functional theory (TDDFT)
within the Tamm-Dancoff approximation. Our newly developed GPU algorithm, which combines multiple parallel models in tandem on massively parallel computing systems,
scales optimally with material size and considerably reduces the computational wall time. A benchmark TDDFT study was performed on a green fluorescent protein complex
composed of 4,353 atoms with 40,518 atomic orbitals represented by Gaussian-type functions. To the best of our knowledge, this is the largest molecule attempted to date.
For this system, the proposed strategy demonstrated reasonably high efficiencies on up to 256 GPUs on a custom-built, state-of-the-art GPU computing system with Nvidia A100 GPUs. We believe that our GPU-oriented algorithms, which empower first-principles simulation for very large-scale applications,
may yield a deeper understanding of the molecular basis of material behaviors, eventually revealing new possibilities for breakthrough designs of new material systems.
Supplementary materials
Supporting Texts, Figures and Tables: Detailed mathematical expressions and other details on computational data