Abstract
Machine learning approaches for conceptualizing and designing in silico compounds have attracted significant attention. However, the applicability of these compounds is often challenged by synthetic viability and cost-effectiveness. Researchers have introduced proxy scores, known as synthetic accessibility scores, to quantify the ease of synthesis for virtual molecules. Despite their utility, existing synthetic accessibility tools have notable limitations: they overlook compound purchasability, lack physical interpretability, and often rely on imperfect computer-aided synthesis planning algorithms. We introduce MolPrice, an accurate and fast model for molecular price prediction. Utilizing self-supervised contrastive learning, MolPrice autonomously generates price labels for complex molecules, enabling the model to generalize to molecules beyond the training distribution. Our results show that MolPrice reliably assigns higher prices to complex molecules than to readily purchasable ones, effectively distinguishing different levels of molecular complexity. Furthermore, MolPrice achieves competitive performance on literature benchmarks for synthetic accessibility, matching state-of-the-art methods in comparative evaluations. To demonstrate its practical utility, we conduct a virtual screening case study, illustrating how MolPrice successfully identifies purchasable molecules from a large candidate library. MolPrice bridges the gap between generative molecular design and real-world feasibility by integrating cost-awareness into synthetic accessibility assessment, making it a powerful model for accelerating molecular discovery.
Supplementary materials
Title
Supplementary Information for "MolPrice: Assessing Synthetic Accessibility of Molecules based on Market Value"
Description
The supporting material contains additional methodological details and results, as outlined in the manuscript.