Abstract
This study developed and implemented a semi-automatic material exploration scheme to model the solvent solubility of tetraphenylporphyrin derivatives. The scheme involved the following steps: definition of a practical chemical search space; prioritization of molecules in that space using an extended algorithm for submodular function maximization, which requires neither biased variable selection nor pre-existing data; synthesis and automatic measurement; and machine-learning model estimation. With only a small number of evaluations (10 molecules, 0.13% of all targeted molecules), the evaluation order selected by the algorithm covered similar molecules amounting to 32% of all targeted molecules, whereas random sampling and uncertainty sampling covered ~7% and ~4%, respectively. The derived binary classification models predicted ‘good solvents’ with an accuracy > 0.8. Overall, we confirmed the effectiveness of the proposed semi-automatic scheme in early-stage material search projects, with a view to accelerating a wider range of materials research.
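To illustrate the idea of prioritizing molecules by coverage, the following is a minimal sketch of the standard greedy heuristic for maximizing a submodular coverage function. It is not the extended algorithm used in this study: the function names, the similarity neighborhoods, and the toy molecule labels are all hypothetical and chosen only for illustration.

```python
from typing import Dict, List, Set


def greedy_coverage_order(neighbors: Dict[str, Set[str]], budget: int) -> List[str]:
    """Generic greedy selection for a coverage objective (illustrative only).

    neighbors[m] is the (assumed) set of molecules considered similar to m,
    including m itself. At each step the molecule that adds the most
    not-yet-covered molecules is selected.
    """
    covered: Set[str] = set()
    order: List[str] = []
    candidates = set(neighbors)
    for _ in range(budget):
        best, best_gain = None, 0
        for m in sorted(candidates):  # sorted for deterministic tie-breaking
            gain = len(neighbors[m] - covered)  # marginal coverage gain
            if gain > best_gain:
                best, best_gain = m, gain
        if best is None:  # nothing left adds coverage
            break
        order.append(best)
        covered |= neighbors[best]
        candidates.remove(best)
    return order


def coverage_fraction(neighbors: Dict[str, Set[str]], chosen: List[str]) -> float:
    """Fraction of all molecules that are similar to at least one chosen molecule."""
    covered: Set[str] = set()
    for m in chosen:
        covered |= neighbors[m]
    return len(covered) / len(neighbors)


# Hypothetical toy search space of six molecules with assumed similarity sets.
toy_neighbors = {
    "A": {"A", "B", "C"},
    "B": {"A", "B"},
    "C": {"A", "C"},
    "D": {"D", "E"},
    "E": {"D", "E"},
    "F": {"F"},
}

toy_order = greedy_coverage_order(toy_neighbors, budget=2)
toy_coverage = coverage_fraction(toy_neighbors, toy_order)
```

In this toy case, two evaluations cover five of the six molecules. The coverage percentages reported in the abstract are a measure of this kind, although the similarity definition and the objective used in the study may differ from this sketch.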
Supplementary materials
Title
Supplementary Information 1
Description
1. Substitution target molecules
2. Mapping molecules over principal components
3. Spectrum analysis
4. Time-dependent density functional theory (TDDFT) calculation
5. Solvent-solubility predictions
6. Threshold used in SFMMOL
7. Comparison of algorithms for prediction performances of calculated properties
8. Preparation of the top-ranked TPP derivatives 11–15
Title
Supplementary Information 2
Description
1. Molecular groups
2. Spectrum data