Abstract
Machine learning models for molecular-property prediction typically work with molecular
representations in the form of fingerprints, descriptors, or graphs. In the case of fingerprints
and descriptors, molecular representations usually comprise thousands of features, which
exposes many tabular models to the curse of dimensionality. In this work, we introduce
penalized linear models enforcing sparsity on grouped molecular representations. Loosely
speaking, sparsity penalties aim to select a relatively small number of features to improve
the interpretability and computational convenience of machine learning models.
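
To make the idea of a sparsity penalty on grouped features concrete, the following is a minimal sketch, not the implementation from the accompanying package. It assumes a group-lasso penalty (one common choice of group-sparsity penalty, which may differ from the penalties used in this work), solves it with plain proximal gradient descent in NumPy, and uses synthetic data in place of real fingerprints or descriptors. The function names, the group layout, and the value of `lam` are all hypothetical.

```python
import numpy as np


def group_soft_threshold(w, threshold):
    """Block soft-thresholding: shrink a whole group, or zero it out entirely."""
    norm = np.linalg.norm(w)
    if norm <= threshold:
        return np.zeros_like(w)
    return (1.0 - threshold / norm) * w


def fit_group_lasso(X, y, groups, lam=0.1, n_iter=500):
    """Minimize (1/2n)||y - Xw||^2 + lam * sum_g sqrt(|g|) * ||w_g||_2
    by proximal gradient descent (assumed objective, for illustration only)."""
    n, p = X.shape
    w = np.zeros(p)
    # Step size 1/L, with L the Lipschitz constant of the least-squares gradient.
    step = n / (np.linalg.norm(X, 2) ** 2)
    for _ in range(n_iter):
        grad = X.T @ (X @ w - y) / n
        z = w - step * grad
        # Apply the proximal operator of the penalty group by group.
        for idx in groups:
            z[idx] = group_soft_threshold(z[idx], step * lam * np.sqrt(len(idx)))
        w = z
    return w


# Synthetic stand-in for grouped molecular features: 4 groups of 5 columns,
# of which only the first group carries signal.
rng = np.random.default_rng(0)
n, group_size, n_groups = 200, 5, 4
X = rng.standard_normal((n, group_size * n_groups))
groups = [np.arange(g * group_size, (g + 1) * group_size) for g in range(n_groups)]
w_true = np.zeros(group_size * n_groups)
w_true[groups[0]] = 2.0
y = X @ w_true + 0.1 * rng.standard_normal(n)

w_hat = fit_group_lasso(X, y, groups, lam=0.3)
# A group counts as selected if any of its coefficients is nonzero.
selected = [g for g, idx in enumerate(groups) if np.any(w_hat[idx] != 0.0)]
print("selected groups:", selected)
```

Because the penalty acts on the Euclidean norm of each group rather than on individual coefficients, entire groups are either kept or discarded together, which is what makes the fitted model easy to read in terms of feature families.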
Supplementary weblinks
Title: Python package for "Molecular-Property Prediction with Sparsity"
Description: Implementation details and example scripts.