Abstract
Density functional theory (DFT) is the workhorse of computational quantum chemistry. One of its main limitations, if not the main one, is that choosing the right functional is a non-trivial task left to human experts. The choice is particularly hard for excited-state calculations that use the time-dependent formulation of DFT (TD-DFT). This is due not only to the approximations and limitations of the method, but also to the fact that the photophysical properties of a molecule are defined by a manifold of states, all of which must be described in a balanced manner to obtain an accurate photochemical picture. This means capturing not only the relative energies of the states, but also the correct character, order, and intensity of the transitions. In this work, we developed a scoring system that quantifies the accuracy of an excited-state calculation by considering all these properties of a manifold of states simultaneously. The scoring system is generalizable to any level of theory; here, we applied it to a large dataset of organic molecules, calculating 38 scores, one for each of 38 common functionals of different types and rungs, against a higher-accuracy method. We used these scores to train a graph attention neural network that predicts the 38 scores for molecules represented as 2D graphs. We call this oracle DELFI (Data-driven EvaLuation of Functionals by Inference); it can be used to predict the ranking of functionals for calculating the optical properties of organic molecules. A corresponding web application makes it easy to run DELFI and analyze the results, lowering the hurdle of choosing the right functional for TD-DFT calculations.