Abstract
More sustainable chemical processes require the selection of suitable molecules, which can be supported by computer-aided molecular design (CAMD). CAMD often generates and evaluates molecular structures using genetic algorithms. However, genetic algorithms can converge slowly and may yield suboptimal solutions. In response to these challenges, this work presents a method to fine-tune a genetic algorithm for CAMD. The proposed method builds on the COSMO-CAMD framework, which uses a genetic algorithm to solve optimization-based molecular design problems and COSMO-RS to predict physical properties of molecules. The key idea of the proposed method is to integrate results from a fast large-scale molecular screening into the molecular design framework, thereby enabling a targeted initialization of the genetic algorithm, referred to as a warm-start. The proposed method is applied in two case studies to design solvents for extracting gamma-valerolactone and phenol, respectively, from aqueous solutions. Compared to the benchmark method, the warm-started COSMO-CAMD framework reduces computing time by up to 70%, discovers fourfold more top-performing candidate molecules, and identifies seven tailored molecular fragments, culminating in the discovery of two novel solvents specifically for the phenol case. The optimal solvent is found in all computational runs. Overall, the warm-started COSMO-CAMD framework significantly improves the efficiency, effectiveness, and robustness of molecular design.
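The warm-start idea described above can be illustrated with a minimal sketch. It is not the COSMO-CAMD implementation: the function name `warm_start_population`, its parameters, the convention that a higher score is better, and the example scores are all assumptions made for illustration. The sketch seeds part of an initial genetic-algorithm population with the best candidates from a prior screening and fills the remainder at random from a candidate pool.

```python
import random


def warm_start_population(screened, pool, pop_size, n_seed, rng=None):
    """Build an initial population seeded with top screened candidates.

    screened: list of (candidate, score) pairs from a prior screening,
              assumed unique; higher score is assumed to be better.
    pool:     candidates available for the random fill.
    pop_size: desired population size.
    n_seed:   number of top screened candidates to place in the population.
    """
    rng = rng or random.Random()
    n_seed = min(n_seed, pop_size, len(screened))

    # Rank the screening results and keep the best n_seed candidates.
    ranked = sorted(screened, key=lambda pair: pair[1], reverse=True)
    population = [candidate for candidate, _ in ranked[:n_seed]]

    # Fill the remaining slots with distinct random candidates from the pool.
    remaining = [c for c in pool if c not in population]
    n_fill = min(pop_size - len(population), len(remaining))
    population.extend(rng.sample(remaining, n_fill))
    return population


# Placeholder SMILES strings and scores, not results from the case studies.
screened = [("CCO", 0.2), ("CCCCCCO", 0.9), ("c1ccccc1", 0.7), ("CCOC(C)=O", 0.5)]
pool = ["CCO", "CCCCCCO", "c1ccccc1", "CCOC(C)=O", "CCCCCC", "CC(C)=O"]

population = warm_start_population(
    screened, pool, pop_size=4, n_seed=2, rng=random.Random(0)
)
```

In this sketch the first `n_seed` entries are always the top-ranked screened candidates, while the random fill keeps some diversity in the starting population. How many individuals to seed is a design choice that the sketch leaves open.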
Supplementary materials
Title
Supporting Information (general)
Description
Includes additional information about the proposed method, the setup and results of the two case studies, and the software used in the main article.
Title
SMILES screening
Description
Includes all molecules in the COSMO database used in this work.
Title
SMILES top40 case study 1
Description
It includes the top 40 screened candidate molecules used in case study 1.
Title
SMILES top40 case study 2
Description
It includes the top 40 screened candidate molecules used in case study 2.