Abstract
New psychoactive substances (NPS) continue to emerge and threaten public health, so more effective methods for predicting and identifying them are of critical importance. In this study, we compared methods for predicting the pharmacological profile of unknown compounds from their chemical structures, as described by molecular fingerprints. We built predictive models on high-throughput screening (HTS) data sets using four machine learning algorithms for a total of 10 targets, and validated the models with in vitro bioassay data collected from the literature for an external NPS compound set. Clustering analysis revealed that the MACCS fingerprint may be more suitable for describing the similarity of pharmacological profiles of NPS, as indicated by the highest adjusted Rand index (0.46) between the two clustering trees. The SVM classifiers validated on the external NPS set achieved ROC AUC above 0.80 and MCC above 0.45 and were therefore used to generate the multi-target pharmacological profiles. The hit rate for retrieving pharmacologically similar compound pairs using the Tanimoto coefficient calculated from MACCS fingerprints was below 1.85%. In contrast, the models were more successful in identifying similar compound pairs (MCC = 0.72) that would otherwise be considered dissimilar by molecular fingerprints.
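For readers unfamiliar with the two quantities reported above, the following is a minimal illustrative sketch, not the study's implementation, of the Tanimoto coefficient between two binary fingerprints and of the Matthews correlation coefficient (MCC) computed from confusion-matrix counts. The function names, the representation of a fingerprint as a set of "on" bit indices, and all example values are assumptions made purely for illustration.

```python
from math import sqrt


def tanimoto(a: set[int], b: set[int]) -> float:
    """Tanimoto coefficient of two fingerprints given as sets of 'on' bit indices."""
    union = len(a | b)
    if union == 0:
        # Convention assumed here: two empty fingerprints have similarity 0.
        return 0.0
    return len(a & b) / union


def mcc(tp: int, tn: int, fp: int, fn: int) -> float:
    """Matthews correlation coefficient from confusion-matrix counts."""
    denom = sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denom == 0:
        # Common convention: MCC is reported as 0 when any marginal is empty.
        return 0.0
    return (tp * tn - fp * fn) / denom


# Hypothetical fingerprints: 3 shared bits out of 6 distinct bits in total.
fp_a = {1, 4, 7, 9}
fp_b = {1, 4, 8, 9, 12}
similarity = tanimoto(fp_a, fp_b)  # 3 / 6 = 0.5

# Hypothetical classifier outcome on 100 compound pairs.
score = mcc(tp=40, tn=45, fp=5, fn=10)
```

Under these definitions, two compounds can share few fingerprint bits (low Tanimoto coefficient) while a model still classifies the pair correctly, which is the situation the final sentence of the abstract describes.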