Keep looking at the negative side: improved detection of drug-induced liver injury with non-hepatotoxicant data oversampling

Olivier JM BÉQUIGNON; Steven Wink; Steven W. Hiemstra; J. E. Fokkelman-Klip; Wouter den Hollander; Giulia Callegaro; Bob van de Water; Gerard J. P. van Westen

doi:10.26434/chemrxiv-2025-xdc1q

Biological and Medicinal Chemistry

16 April 2025, Version 1

This is not the most recent version. There is a newer version of this content available.

Working Paper

This content is a preprint and has not undergone peer review at the time of posting.

Abstract

Drug-induced liver injury (DILI) presents a critical challenge in drug development, often leading to the withdrawal of promising therapeutic candidates. Traditional predictive models for DILI, typically relying on molecular descriptors and pharmacokinetic properties, are insufficient due to the complex and multifactorial nature of liver toxicity. This complexity stems from overlapping biological stress responses activated by both hepatotoxic and non-hepatotoxic compounds, making it difficult to distinguish between them accurately. Additionally, the scarcity of DILI-positive compounds in available datasets results in significant class imbalance, further limiting the efficacy of conventional predictive models. Addressing these challenges requires novel approaches incorporating molecular and bioactivity data to enhance predictive power. In this study, we developed a custom oversampling strategy tailored to handle DILI's biological complexity and class imbalance. We integrated stress pathway activations, particularly focusing on the oxidative, unfolded protein, DNA damage, heat shock, and cytokine signalling stress responses, with molecular descriptors and bioactivity profiles to improve model performance. The custom oversampling technique demonstrated improved specificity and overall predictive accuracy, mitigating the effects of class imbalance without overfitting. Despite these advances, significant challenges remain in refining predictive models, particularly in identifying the most informative biological markers and optimising experimental protocols for better data acquisition. Our results suggest that while incorporating diverse data types and novel oversampling strategies improves DILI prediction, further efforts are required to create robust, generalisable models capable of reliably predicting hepatotoxicity in the drug development process.
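To illustrate the general idea of oversampling as a remedy for class imbalance, the following is a minimal sketch of plain random oversampling with replacement. It is not the custom strategy developed in this study, whose details are not given in the abstract; the function name `oversample_class`, the `factor` parameter, and the toy data are all hypothetical and chosen only for illustration.

```python
import numpy as np


def oversample_class(X, y, target_label, factor, seed=0):
    """Append randomly resampled rows (with replacement) of one class.

    Generic random oversampling, NOT the authors' custom method.
    `factor` is the desired size of the target class relative to its
    original size (e.g. 2.0 doubles it).
    """
    if factor < 1:
        raise ValueError("factor must be >= 1")
    rng = np.random.default_rng(seed)
    X = np.asarray(X)
    y = np.asarray(y)

    # Indices of the rows belonging to the class to oversample.
    idx = np.flatnonzero(y == target_label)
    if idx.size == 0:
        raise ValueError("no samples with the requested label")

    # Number of extra rows needed to reach the requested class size.
    n_extra = int(round(idx.size * (factor - 1)))
    extra = rng.choice(idx, size=n_extra, replace=True)

    # Original rows come first, resampled duplicates are appended.
    X_res = np.concatenate([X, X[extra]])
    y_res = np.concatenate([y, y[extra]])
    return X_res, y_res


# Toy data (hypothetical): 4 samples labelled 1 and 2 samples labelled 0.
X = np.arange(12, dtype=float).reshape(6, 2)
y = np.array([1, 1, 1, 1, 0, 0])

# Double the size of class 0 so that both classes have 4 samples.
X_res, y_res = oversample_class(X, y, target_label=0, factor=2.0)
```

In practice, resampling of this kind is applied to the training split only, so that duplicated rows never appear in both training and evaluation data.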

Keywords

Drug-induced liver injury (DILI)

hepatotoxicity prediction

high content screening

stress pathway biomarkers

class imbalance

molecular descriptors

bioactivity spectra

predictive modelling

custom oversampling

Supplementary materials

Supplementary File 1: File containing the summarised high-content-screening data, molecular descriptors, and labels this manuscript is based upon.

Supplementary Tables: Supplementary Tables referred to in the manuscript.

Supplementary weblinks

Data and results archive: Zenodo deposition of the data required to obtain the results presented in the manuscript, together with the obtained results.

Python code used to obtain the results from the archived data: Link to the GitHub repository hosting the code that generated the results presented in the manuscript.

Version History

Apr 18, 2025 Version 2

Apr 16, 2025 Version 1

License

The content is available under CC BY 4.0

Funding

IMI eTRANSAFE project (777365)

Horizon2020 projects EU-ToxRisk (681002) and RISK-HUNT3R (964537)

Author’s competing interest statement

The author(s) have declared they have no conflict of interest with regard to this content

Ethics

The author(s) have declared ethics committee/IRB approval is not relevant to this content
