Drug-target affinity prediction using applicability domain based on data density

Shunya Sugita; Masahito Ohue

doi:10.26434/chemrxiv-2021-hp2p9-v2

Theoretical and Computational Chemistry

06 August 2021, Version 2

Working Paper

This content is a preprint and has not undergone peer review at the time of posting.

Abstract

In drug discovery research and development, computationally predicting the target affinity of a drug candidate is useful for screening compounds at an early stage and for verifying the binding potential to an unknown target. Chemogenomics-based methods have attracted increasing attention because they integrate information about both the drug and the target to predict drug-target affinity (DTA). However, the compound and target spaces are vast, and without sufficient training data, proper DTA prediction is not possible. A DTA prediction made in this situation can be a false prediction. In this study, we propose a DTA prediction method that can indicate when there are insufficient samples in the compound or target space, based on the concept of the applicability domain (AD) and the data density of the training dataset. The AD is the data region in which a machine learning model can make reliable predictions. By using the constructed AD to preclassify the samples to be predicted into those within the AD (In-AD) and those outside it (Out-AD), we can determine whether a reasonable prediction can be made for each sample. Evaluation experiments on three public datasets showed that the AD constructed by the k-nearest neighbor (k-NN) method worked well: prediction accuracy was low for samples classified as Out-AD and high for samples classified as In-AD.
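The In-AD/Out-AD preclassification described above can be illustrated with a small sketch. This is not the authors' implementation: the abstract does not give the distance measure, the value of k, or the threshold rule, so the Euclidean distance, `k=3`, the 95% quantile threshold, and the function names `mean_knn_distance`, `fit_ad_threshold`, and `classify_ad` are all assumptions made for illustration. Each sample is assumed to be a numeric feature vector standing in for a combined compound and target descriptor.

```python
import math


def mean_knn_distance(x, points, k):
    """Mean Euclidean distance from x to its k nearest neighbours in points."""
    dists = sorted(math.dist(x, p) for p in points)
    return sum(dists[:k]) / k


def fit_ad_threshold(train, k=3, quantile=0.95):
    """Assumed rule: take a quantile of the leave-one-out k-NN scores
    of the training samples as the In-AD/Out-AD boundary."""
    scores = []
    for i, x in enumerate(train):
        others = train[:i] + train[i + 1:]  # exclude the sample itself
        scores.append(mean_knn_distance(x, others, k))
    scores.sort()
    idx = min(len(scores) - 1, int(quantile * len(scores)))
    return scores[idx]


def classify_ad(x, train, threshold, k=3):
    """Label a query In-AD if the training data around it is dense enough."""
    score = mean_knn_distance(x, train, k)
    return "In-AD" if score <= threshold else "Out-AD"


# Toy training set: a 5 x 5 grid of 2-D feature vectors (hypothetical data).
train = [(float(i), float(j)) for i in range(5) for j in range(5)]
threshold = fit_ad_threshold(train, k=3, quantile=0.95)

near_query = (2.1, 1.9)    # lies inside the densely sampled region
far_query = (10.0, 10.0)   # lies far from every training sample

near_label = classify_ad(near_query, train, threshold)
far_label = classify_ad(far_query, train, threshold)
```

In a workflow of this kind, the affinity model would be applied with confidence only to queries labelled In-AD, while Out-AD queries would be flagged as lying in a sparsely sampled region of the compound or target space.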

Keywords

drug-target affinity (DTA) prediction

applicability domain (AD)

chemogenomics

data density

k-nearest neighbor (k-NN)

Version History

Aug 06, 2021 Version 2

Apr 29, 2021 Version 1

Version Notes

Author's final version of the manuscript accepted for the IEEE CIBCB 2021 conference. © 2021 IEEE. Personal use of this material is permitted. Permission from IEEE must be obtained for all other uses, in any current or future media, including reprinting/republishing this material for advertising or promotional purposes, creating new collective works, for resale or redistribution to servers or lists, or reuse of any copyrighted component of this work in other works.

Metrics

Views: 1,683

Downloads: 596

License

The content is available under CC BY-NC-ND 4.0.

Funding

KAKENHI (Grant No. 20H04280) from the Japan Society for the Promotion of Science (JSPS)

ACT-X (Grant No. JPMJAX20A3) from the Japan Science and Technology Agency (JST)

The Platform Project for Supporting Drug Discovery and Life Science Research (Basis for Supporting Innovative Drug Discovery and Life Science Research (BINDS)) (Grant No. JP20am0101112) from the Japan Agency for Medical Research and Development (AMED)

Author’s competing interest statement

The authors declare no competing interests.

Ethics

The author(s) have declared that ethics committee/IRB approval is not relevant to this content.
