Abstract
The structure-property relationships of polybenzenoid hydrocarbons (PBHs) were investigated using interpretable machine learning, for which two new tools were developed and applied. First, a novel textual molecular representation, based on the annulation sequence of PBHs, was defined and developed. This representation can be used either in its textual form or as the basis for a curated feature vector; both forms are more interpretable than the standard SMILES representation, and the textual form also gives higher predictive accuracy. Second, the recently developed CUSTODI model was applied for the first time as an interpretable model and identified important structural features that affect various electronic molecular properties. The resulting insights not only validate several well-known “rules of thumb” of organic chemistry but also reveal new behaviors and influential structural motifs, thus providing guiding principles for the rational design and fine-tuning of PBHs.
Supplementary materials
Title
Supporting Information for LALAS paper
Description
Details of computational methods and model construction. Full fit results. Additional comparison to other models.
Supplementary weblinks
Title
Repository for LALAS paper
Description
All data and code used in the described work.