Abstract
In silico methods are becoming increasingly reliable tools for replicating and extending experimental findings on chemical biodegradability. Quantitative structure-activity relationship (QSAR) models can also yield extractable rules that aid the understanding of biodegradation. Using semi-empirical quantum chemical calculations, a conformer-based augmentation approach, combined with dimensionality reduction methods, was studied as a means of improving model accuracy and applicability. This work highlights molecular features, ranging from graph-based features and 3-dimensional structural descriptors to direct graph-based learning methods, that can be used to distinguish readily biodegradable compounds, and examines the role of unsupervised pre-processing in refining the training set and the choice of features.
Supplementary materials
Supplementary Info: Figures S1-S4, Tables S1-S3, classification data.