Abstract
This study aims to improve on existing
activity prediction methods by augmenting chemical structure fingerprints with
bioactivity-based fingerprints derived from high-throughput screening (HTS)
data (HTSFPs). The HTSFPs were generated from HTS data obtained from PubChem
and combined with an ECFP4 structural fingerprint. The combined experimental
and structural fingerprint (CESFP) was benchmarked against the individual ECFP4
and HTSFP fingerprints. The CESFP showed better predictive performance
and scaffold-hopping capability than either fingerprint alone. It also identified
compounds that neither the ECFP4 nor the HTSFP retrieved, indicating synergistic effects
between the two fingerprints. A feature importance analysis showed that a small
subset of the HTSFP features contributes most to the overall performance of the
CESFP. Because the structural fingerprint provides supporting information, the combined
approach allows activity prediction even for compounds with only sparse HTSFPs.
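As a rough illustration of the combination step, the sketch below concatenates a structural bit vector with an HTS-derived vector. It is a minimal sketch, not the authors' implementation: the function names, the assay identifiers, the toy 8-bit structural fingerprint, and the choice to encode untested assays as 0.0 are all assumptions made for this example.

```python
from typing import Dict, List, Optional


def build_htsfp(
    activities: Dict[str, Optional[float]],
    assay_ids: List[str],
    missing: float = 0.0,
) -> List[float]:
    """Return one value per assay in a fixed order.

    Assays with no measurement for the compound get `missing`
    (an assumed encoding, not taken from the paper).
    """
    return [
        missing if activities.get(a) is None else float(activities[a])
        for a in assay_ids
    ]


def build_cesfp(ecfp4_bits: List[int], htsfp: List[float]) -> List[float]:
    """Concatenate structural bits and HTS features into one vector."""
    return [float(b) for b in ecfp4_bits] + list(htsfp)


# Hypothetical assay panel and a toy 8-bit structural fingerprint
# (a real ECFP4 is typically folded to 1024 or 2048 bits).
assay_ids = ["assay_A", "assay_B", "assay_C"]
ecfp4_bits = [0, 1, 0, 0, 1, 0, 0, 1]

# A compound tested in only one assay, so its HTSFP is sparse.
sparse_compound = {"assay_B": 1.0}

htsfp = build_htsfp(sparse_compound, assay_ids)
cesfp = build_cesfp(ecfp4_bits, htsfp)
```

Even when most HTS entries are missing, the structural part of the vector is fully populated, which is the intuition behind the supporting effect described above.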
Supplementary materials
PubChem assay list
Supplementary Data