Abstract
The application of machine learning models in chemistry has made remarkable strides in recent years. Although there is considerable interest in automating common procedures in analytical chemistry using machine learning, very few models have been adopted into everyday use. Among the analytical instruments available to chemists, Nuclear Magnetic Resonance (NMR) spectroscopy is one of the most important, offering insights into molecular structure unobtainable with other methods. However, most processing and analysis of NMR spectra is still performed manually, making the task tedious and time-consuming, especially for large numbers of spectra. We present a transformer-based machine learning model capable of predicting the molecular structure directly from the NMR spectrum. Our model is pretrained on synthetic NMR spectra and achieves a top-1 accuracy of 67.0% when predicting the structure from both the 1H and 13C spectra. Additionally, we train a model which, given a spectrum and a set of likely compounds, selects the compound corresponding to the spectrum. This model achieves a top-1 accuracy of 96.0% when trained on 1H spectra.
Supplementary weblinks
Code: GitHub repository containing the scripts used in the creation of this manuscript.