Abstract
Introduction: Drug-induced photosensitivity is an adverse event of various agents used in all major specialties of clinical medicine. Apart from the acute condition, an association between photosensitive events and an increased risk of skin cancer has been repeatedly reported. However, the photosensitizing properties of drugs and chemical compounds are also deliberately utilized as a treatment modality, for example in photodynamic therapy in oncology. While certain chemical features have been shown to induce photosensitivity more frequently, the matter is still not conclusively understood, and commonly used photobiological assays are considered to be subject to several limitations. In the present work, we investigated the feasibility of predicting the photosensitizing effects of drugs and chemical compounds with state-of-the-art artificial intelligence-based workflows.
Methods: A dataset of 2,200 drugs was used to train three distinct models (logistic regression, XGBoost, and a deep learning model) to predict photosensitizing properties from SMILES strings. Labels were obtained from a previously published list of photosensitizers, yielding 205 drugs labeled as photosensitizing. The data were partitioned by molecular scaffold into an 80/10/10 training-validation-test split. External evaluation of the models was performed on the Tox21 dataset and included a technical interpretation of the prediction scores as well as a pharmacological interpretation.
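To illustrate what an 80/10/10 split by molecular scaffold involves, the following is a minimal sketch and not necessarily the procedure used in this work. It assumes that a scaffold identifier has already been computed for each molecule (for example with a cheminformatics toolkit) and that whole scaffold groups are assigned greedily, largest first. The function name `scaffold_split` and its parameters are hypothetical.

```python
from collections import defaultdict


def scaffold_split(scaffolds, frac_train=0.8, frac_valid=0.1):
    """Split molecule indices so that no scaffold spans two partitions.

    `scaffolds` holds one precomputed scaffold identifier per molecule
    (assumption: computed elsewhere). Returns (train, valid, test) index lists.
    """
    # Group molecule indices by their scaffold identifier.
    groups = defaultdict(list)
    for idx, scaffold in enumerate(scaffolds):
        groups[scaffold].append(idx)

    # Largest groups first; ties broken by first index for determinism.
    ordered = sorted(groups.values(), key=lambda g: (-len(g), g[0]))

    n = len(scaffolds)
    train_cap = frac_train * n
    valid_cap = (frac_train + frac_valid) * n

    train, valid, test = [], [], []
    for group in ordered:
        if len(train) + len(group) <= train_cap:
            train.extend(group)
        elif len(train) + len(valid) + len(group) <= valid_cap:
            valid.extend(group)
        else:
            test.extend(group)
    return train, valid, test


# Toy example: ten molecules with four hypothetical scaffold labels.
toy_scaffolds = ["A"] * 5 + ["B"] * 3 + ["C", "D"]
train_idx, valid_idx, test_idx = scaffold_split(toy_scaffolds)
```

Because whole scaffold groups are kept together, the resulting partition sizes only approximate the requested fractions, and structurally related molecules never appear in both the training and the test partition.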
Results: ROC-AUC ranged from 0.8939 (deep learning model) to 0.9525 (XGBoost) during training, and from 0.7785 (deep learning model) to 0.7927 (XGBoost) in the test partition. The models were then used to generate predictions for the external validation set. Comparing the 200 top-ranked compounds of each model yielded 55 overlapping molecules. Fifteen of those were fluoroquinolones, a class of commonly reported photosensitizers. Prediction scores in this subset corresponded well with the substructures suspected of mediating photosensitizing effects.
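The comparison of top-ranked compounds across models can be sketched as follows. This is an illustrative example only: it assumes that "overlapping" means the intersection of the top-k sets of all models and that each model's predictions are available as a mapping from compound identifier to prediction score. The names `top_k_ids` and `consensus_top_k`, and the toy scores, are hypothetical.

```python
def top_k_ids(scores, k):
    """Return the set of the k compound identifiers with the highest scores."""
    ranked = sorted(scores, key=scores.get, reverse=True)
    return set(ranked[:k])


def consensus_top_k(model_scores, k):
    """Return the compounds that appear in the top k of every model."""
    top_sets = [top_k_ids(scores, k) for scores in model_scores.values()]
    return set.intersection(*top_sets)


# Toy scores for three hypothetical models and three compounds.
toy_scores = {
    "logistic_regression": {"cpd_a": 0.9, "cpd_b": 0.8, "cpd_c": 0.1},
    "xgboost": {"cpd_a": 0.7, "cpd_b": 0.2, "cpd_c": 0.6},
    "deep_learning": {"cpd_a": 0.5, "cpd_b": 0.3, "cpd_c": 0.4},
}
shared = consensus_top_k(toy_scores, k=2)
```

In the setting described above, `k` would be 200 and the mappings would hold the prediction scores of each model on the external dataset.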
Discussion: All three models appeared capable of predicting the photosensitizing effects of chemical compounds. However, compared with the simpler model (logistic regression), the more complex models (XGBoost and the Chemprop deep learning model) appeared to be more confident in their predictions, as reflected in the distribution of their prediction scores. The evaluation of the models on external data further supported the feasibility of molecular property prediction for photosensitizing properties. A qualitative analysis of the fluoroquinolones in the external dataset, based on available photobiological evidence, showed that their prediction scores corresponded well with their chemical structure.