Abstract
Most chemical reactions yield numerous by-products and side-products in addition to the intended major product. While chemists can predict many of the main process impurities, enumerating the possible minor impurities remains a challenge, and systematically predicting and tracking impurities that derive from raw materials or propagate from one synthetic step to the next is harder still. In this study, we developed an AI-assisted approach to predict and track impurities across multi-step reactions, taking as input the main reactants and, optionally, the reagents, solvents, and impurities in these materials. We demonstrate the utility of this tool on a simple case, the synthesis of paracetamol from phenol, and provide a generalized framework that covers most chemical reactions. Our solution can enable (1) faster elucidation of impurities, (2) automated interpretation of data generated from high-throughput reaction screening, and (3) more thorough raw material risk assessments, each of which is a key workflow in commercial process development for small-molecule drug substances.