Abstract
Discovering new structures in chemical space is a long-standing challenge with important applications in fields such as chemistry, materials science, and drug discovery. Deep generative models have been used in de novo molecule design to embed molecules in a meaningful latent space and then sample new molecules from it. However, the steerability and interpretability of the learned latent space remain much less explored. In this paper, we introduce a new task named molecule manipulation, which aims to align the molecular properties of a generated molecule with its latent activation in order to achieve interactive molecule editing. We then develop a method called Chemical Space Explorer (ChemSpacE), which identifies and traverses interpretable directions in the latent space that align with changes in molecular structure and properties. Specifically, ChemSpacE exploits the properties of the latent space learned by generative models and uses linear models to identify such directions. It is therefore highly efficient in terms of training and inference time, data, and the number of oracle calls. Experiments show that ChemSpacE can efficiently steer the latent spaces of multiple state-of-the-art molecule generative models for interactive molecule discovery.
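
As a rough illustration of identifying a direction with a linear model and traversing it, the following minimal sketch works on toy data. It is not the authors' implementation: the latent vectors are synthetic Gaussian samples rather than encodings from a molecule generative model, `toy_property` is a hypothetical stand-in for a property oracle, and the direction is taken as the normal of a nearest-centroid linear boundary, a deliberately simple substitute for whichever linear model the method actually fits. The function names `fit_direction` and `traverse` are likewise assumptions made for this example.

```python
import math
import random


def fit_direction(latents, labels):
    """Return the unit normal of a nearest-centroid linear boundary.

    The normal is the mean latent vector of positive examples minus the
    mean latent vector of negative examples, scaled to unit length.
    """
    dim = len(latents[0])
    pos = [z for z, y in zip(latents, labels) if y]
    neg = [z for z, y in zip(latents, labels) if not y]
    if not pos or not neg:
        raise ValueError("need at least one example of each class")
    diff = [
        sum(z[i] for z in pos) / len(pos) - sum(z[i] for z in neg) / len(neg)
        for i in range(dim)
    ]
    norm = math.sqrt(sum(d * d for d in diff))
    if norm == 0.0:
        raise ValueError("class means coincide; no direction found")
    return [d / norm for d in diff]


def traverse(z, direction, step_sizes):
    """Move a latent vector along the direction: z + alpha * n per step."""
    return [[zi + a * di for zi, di in zip(z, direction)] for a in step_sizes]


# Toy setup (assumption): 8-dimensional Gaussian latents and a property
# that is exactly linear in the latent vector.
rng = random.Random(0)
DIM = 8
true_w = [1.0, -2.0, 0.5, 0.0, 0.0, 1.5, 0.0, -1.0]  # hypothetical


def toy_property(z):
    """Hypothetical stand-in for a molecular property oracle."""
    return sum(w * zi for w, zi in zip(true_w, z))


latents = [[rng.gauss(0.0, 1.0) for _ in range(DIM)] for _ in range(2000)]
# Binary labels: is the property above the threshold 0?
labels = [toy_property(z) > 0.0 for z in latents]

direction = fit_direction(latents, labels)
start = latents[0]
path = traverse(start, direction, [0.0, 0.5, 1.0, 1.5, 2.0])
path_props = [toy_property(z) for z in path]
```

In a real pipeline, each point on `path` would be passed to the generative model's decoder to obtain a molecule, and the property would be measured on the decoded molecule rather than computed from the latent vector. Because the direction is fit once from labeled latent samples, traversal itself needs no further oracle calls.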