Abstract
We introduce the third installment of the COMPAS Project, a COMputational database of Polycyclic Aromatic Systems, focused on peri-condensed polybenzenoid hydrocarbons (PBHs). In this installment, we develop two data sets containing the optimized ground-state structures and a selection of molecular properties of ∼39k and ∼9k peri-condensed PBHs (at the GFN2-xTB and CAM-B3LYP-D3BJ/cc-pVDZ//CAM-B3LYP-D3BJ/def2-SVP levels, respectively). The manuscript details the enumeration and data generation processes and describes the information available within the data sets. An in-depth comparison between the two levels of theory shows that the geometric disagreement is largest for slightly distorted molecules. In addition, a data-driven analysis of the structure-property trends of peri-condensed PBHs highlights the effect of the size of peri-condensed islands and linearly annulated rings on the HOMO-LUMO gap. The insights described herein are important for the rational design of novel functional aromatic molecules for use in, e.g., organic electronics. The generated data sets provide a basis for additional data-driven machine- and deep-learning studies in chemistry.
Supplementary materials
Title
Supporting Information for COMPAS-3
Description
Details of general computational methods, templates for xTB and DFT calculations, benchmarking procedure for choosing the DFT level of theory, comparison of the COMPAS-1 data set using the two levels of theory (this report versus the original publication), extended discussion of the outliers in the aIP and aEA plot, comparison of D3 and D4 dispersion corrections, and additional discussion of the relative energy and structure-property analyses.
Supplementary weblinks
Title
Repository of the COMPAS Project
Description
All data sets included in the COMPAS Project.