GDB-9-Ex and ORNL_AISD-Ex: Two open-source datasets for quantum chemical UV-vis electronic excitation spectra of organic molecules

07 March 2023, Version 2
This content is a preprint and has not undergone peer review at the time of posting.

Abstract

We present two open-source datasets that provide time-dependent density-functional tight-binding (TD-DFTB) electronic excitation spectra of organic molecules. The spectra are predictions of UV-vis absorption computed on the optimized ground-state geometries of the molecules. The GDB-9-Ex dataset contains a subset of 96,766 organic molecules from the original open-source GDB-9 dataset. The ORNL_AISD-Ex dataset was created from GDB-9 molecular structures using a generative algorithm and consists of 10,502,904 organic molecules that contain between 5 and 71 non-hydrogen atoms. The data reveal a close quantitative correlation between the gap separating the highest occupied molecular orbital (HOMO) from the lowest unoccupied molecular orbital (LUMO) and the excitation energy of the lowest singlet excited state. The chemical variability of the large number of molecules was examined with a topological fingerprint estimation based on extended-connectivity fingerprints (ECFPs), followed by uniform manifold approximation and projection (UMAP) for dimension reduction. Both datasets were generated with a high-throughput workflow that ran the DFTB+ software on the Andes cluster of the Oak Ridge Leadership Computing Facility (OLCF) at Oak Ridge National Laboratory (ORNL).
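
As an illustration of how a correlation between HOMO-LUMO gaps and lowest singlet excitation energies might be quantified, the following minimal Python sketch computes a Pearson correlation coefficient. The function name `pearson_r` and the variable names are hypothetical, and the numerical values are invented placeholders in eV, not entries from GDB-9-Ex or ORNL_AISD-Ex; the abstract does not state which correlation measure the authors used.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Return the Pearson correlation coefficient of two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of products of deviations (unnormalized covariance).
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    # Sums of squared deviations (unnormalized variances).
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / sqrt(var_x * var_y)


# Placeholder values in eV, invented for illustration only.
# They are NOT taken from GDB-9-Ex or ORNL_AISD-Ex.
homo_lumo_gap_ev = [3.0, 4.0, 5.0, 6.0, 7.0]
lowest_singlet_ev = [2.8, 3.7, 4.9, 5.6, 6.8]

r = pearson_r(homo_lumo_gap_ev, lowest_singlet_ev)
print(f"Pearson r = {r:.3f}")
```

With real data, the two lists would be replaced by the per-molecule HOMO-LUMO gaps and lowest singlet excitation energies read from the dataset files.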

Keywords

Electronic Excitation
Ultraviolet-Visible Spectroscopy
Organic Molecules
Quantum Chemistry
Time-Dependent Density-Functional Tight-Binding
High-Performance Computing
