Abstract
Visualizing chemical space is useful in many areas of chemistry, including compound library design, diversity analysis, and the exploration of structure-property relationships. Drug discovery and natural product research are notable areas where such visualization has strong applications. However, even comparatively small subsections of chemical space are so large that navigating them requires approximations. ChemMaps is a visualization methodology that approximates the distribution of compounds in large datasets by selecting satellite compounds that, when principal component analysis is performed on the similarity matrix, yield a mapping similar to that of the whole dataset. Here, we show how the recently proposed extended similarity indices can help identify the regions from which satellites should be sampled and reduce the amount of high-dimensional data needed to describe a library’s chemical space.
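The satellite idea can be illustrated with a minimal sketch. It rests on several assumptions that are not taken from this work: compounds are represented by random binary fingerprints, similarity is measured with the Tanimoto coefficient, and satellites are drawn uniformly at random purely as a placeholder for a guided selection. The function names `tanimoto_to_satellites` and `pca_2d` are hypothetical.

```python
import numpy as np

def tanimoto_to_satellites(fps, sat_idx):
    """Tanimoto similarity of every fingerprint to each satellite (N x k)."""
    fps = fps.astype(np.int64)
    sats = fps[sat_idx]
    common = fps @ sats.T                    # on-bits shared by each pair
    counts = fps.sum(axis=1)                 # on-bits per fingerprint
    union = counts[:, None] + counts[sat_idx][None, :] - common
    # Guard against empty fingerprints (union == 0) by leaving those entries at 0.
    return np.divide(common, union,
                     out=np.zeros(common.shape, dtype=float),
                     where=union > 0)

def pca_2d(matrix):
    """Project the rows of `matrix` onto its first two principal components."""
    centered = matrix - matrix.mean(axis=0)
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    return centered @ vt[:2].T

rng = np.random.default_rng(0)
# Assumption: toy data, 500 compounds with 128-bit random fingerprints.
fingerprints = rng.integers(0, 2, size=(500, 128))
# Assumption: satellites picked at random, 10% of the dataset.
satellites = rng.choice(500, size=50, replace=False)

sim = tanimoto_to_satellites(fingerprints, satellites)  # 500 x 50, not 500 x 500
coords = pca_2d(sim)                                    # 2D map of all compounds
```

The point of the sketch is the shape of `sim`: principal component analysis runs on an N x k matrix rather than the full N x N similarity matrix, which is where the reduction in high-dimensional data comes from.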