Abstract
Learning structure-scent relationships is challenging because of both the large chemical space of odorous molecules and the molecular biology of smell. We empirically fit structure-scent relationships by training an accurate graph neural network and then explaining its predictions. We use counterfactuals and descriptor attribution to generate explanations for the 112 scents in the Leffingwell Odor Dataset (Sanchez-Lengeling et al., 2019). We then use natural language processing to summarize these quantitative explanations as text. The complete process goes from data to natural language explanations, with the aim of determining structure-scent relationships.