Abstract
Genetic algorithms (GAs) are a powerful tool to search large chemical spaces for inverse molecular design. However, GAs have multiple hyperparameters that have not been thoroughly investigated for chemical space searches. In this work, we examine the general effects of a number of hyperparameters, such as population size, elitism rate, selection method, mutation rate, and convergence criteria, on key GA performance metrics. We show that using a self-termination method with a minimum Spearman's rank correlation coefficient of 0.8 between generations maintained for 50 consecutive generations along with a population size of 32, 50% elitism rate, 3- way tournament selection, and a 40% mutation rate provides the best balance of finding the overall champion, maintaining good coverage of elite targets, and improving relative speedup for general use in molecular design GAs.
Supplementary materials
Title
Supporting Information
Description
Supporting figures and tables in PDF form
Actions