First report of q-RASAR modeling towards an approach of easy interpretability and efficient transferability

Arkaprava Banerjee; Kunal Roy

doi:10.26434/chemrxiv-2022-0qclt

Theoretical and Computational Chemistry

18 April 2022, Version 1

Working Paper

This content is a preprint and has not undergone peer review at the time of posting.

Abstract

Quantitative structure-activity relationship (QSAR) and read-across techniques have recently been merged into an emerging field, Read-across Structure-Activity Relationship (RASAR), which uses the chemical similarity concepts of read-across (an unsupervised step) to develop a supervised learning model (as in QSAR). The RASAR method has so far been used only for graded predictions or classification modeling. In this work, we attempt, for the first time, to apply RASAR to quantitative predictions (q-RASAR), using androgen receptor binding affinity data as a case study. We have computed a number of error-based and similarity-based measures, such as the weighted standard deviation of the predicted values, the coefficient of variation of the computed predictions, the average similarity level of close training compounds for each query molecule, the standard deviation and coefficient of variation of similarity levels, the maximum similarity levels to positive and negative close training compounds, and a concordance measure indicating similarity to positive, negative, or both classes of close training compounds. We have combined these additional measures with the selected chemical descriptors from the previously developed QSAR model, developed new partial least squares (PLS) models from the training set, and predicted the endpoint for the query data set. Interestingly, these new models outperform the original QSAR model in both internal and external validation quality. In this study, we have also introduced a new similarity-based concordance measure that can contribute significantly to model quality. A q-RASAR model also has an advantage over read-across predictions in that it provides easy interpretation and indicates the quantitative contributions of important chemical features. The strategy described here should be applicable to the modeling of other biological, toxicological, or property data for enhanced quality of predictions, easy interpretability, and efficient transferability.
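The abstract names the error-based and similarity-based measures but does not give their formulas, so the following Python sketch is only an illustration under stated assumptions, not the authors' implementation or the Read-Across v4.0 tool. The Gaussian-kernel similarity, the choice of `k` close training compounds, the `threshold` separating "positive" from "negative" compounds, and all function and key names are hypothetical. The concordance measure is omitted because the abstract does not define it.

```python
import math


def gaussian_similarity(a, b, sigma=1.0):
    """Assumed similarity: Gaussian kernel on squared Euclidean distance."""
    d2 = sum((x - y) ** 2 for x, y in zip(a, b))
    return math.exp(-d2 / (2.0 * sigma ** 2))


def rasar_descriptors(query, train_X, train_y, k=3, threshold=0.0, sigma=1.0):
    """Return illustrative similarity/error measures for one query compound.

    query     : descriptor vector of the query compound
    train_X   : descriptor vectors of the training compounds
    train_y   : observed responses of the training compounds
    k         : number of close training compounds to keep (assumption)
    threshold : response value separating positive from negative (assumption)
    """
    # Rank training compounds by similarity to the query and keep the k closest.
    scored = sorted(
        ((gaussian_similarity(query, x, sigma), y) for x, y in zip(train_X, train_y)),
        key=lambda pair: pair[0],
        reverse=True,
    )[:k]
    sims = [s for s, _ in scored]
    ys = [y for _, y in scored]
    total = sum(sims)

    # Similarity-weighted read-across prediction.
    pred = sum(s * y for s, y in zip(sims, ys)) / total
    # Similarity-weighted standard deviation of the neighbours' responses.
    sd_pred = math.sqrt(sum(s * (y - pred) ** 2 for s, y in zip(sims, ys)) / total)
    # Coefficient of variation; undefined when the prediction is zero.
    cv_pred = sd_pred / abs(pred) if pred != 0 else float("nan")

    # Average, standard deviation, and coefficient of variation of similarity.
    avg_sim = total / len(sims)
    sd_sim = math.sqrt(sum((s - avg_sim) ** 2 for s in sims) / len(sims))
    cv_sim = sd_sim / avg_sim

    # Maximum similarity to positive and to negative close training compounds
    # (0.0 when no neighbour of that class is present).
    max_sim_pos = max((s for s, y in scored if y >= threshold), default=0.0)
    max_sim_neg = max((s for s, y in scored if y < threshold), default=0.0)

    return {
        "ra_pred": pred,
        "sd_pred": sd_pred,
        "cv_pred": cv_pred,
        "avg_sim": avg_sim,
        "sd_sim": sd_sim,
        "cv_sim": cv_sim,
        "max_sim_pos": max_sim_pos,
        "max_sim_neg": max_sim_neg,
    }


if __name__ == "__main__":
    # Toy data, invented purely for illustration.
    toy_X = [[0.0, 0.0], [1.0, 0.0], [0.0, 2.0], [5.0, 5.0]]
    toy_y = [1.0, 3.0, -1.0, 10.0]
    for name, value in rasar_descriptors([0.0, 0.0], toy_X, toy_y).items():
        print(name, round(value, 4))
```

In a q-RASAR-style workflow, values like these would be appended as extra columns to the chemical descriptors of each compound before the PLS model is fitted; that regression step is not shown here.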

Supplementary materials

Data files

The .zip folder contains the original data set used for modeling with the SMILES of the compounds along with observed receptor binding affinity, the data files for best subset regression and intelligent consensus predictions, and all the reported models in the Excel format.

Supplementary Plots file

The file contains score plots, applicability domain plots, and randomization plots of individual PLS models.

Supplementary weblinks

Read-Across v4.0

The read-across tool for computing read-across predictions along with several error and similarity measures is available for download from this link

DTC Lab tools site

DTC Lab tools like best subset selection, PLS regression, and intelligent consensus predictions are available for download from this link

Now Published

First report of q-RASAR modeling toward an approach of easy interpretability and efficient transferability

Arkaprava Banerjee, Kunal Roy (journal article)

Molecular Diversity

Online publication date: Jun 29, 2022

Version History

Apr 18, 2022 Version 1

Metrics

Views: 1,056

Downloads: 468

License

The content is available under CC BY NC ND 4.0

Funding

Science and Engineering Research Board: MTR/2019/000008

Jadavpur University: Not available

Author’s competing interest statement

The author(s) have declared they have no conflict of interest with regard to this content

Ethics

The author(s) have declared ethics committee/IRB approval is not relevant to this content
