Abstract
Many machine learning models used in academia and industry to identify organic compounds cannot converse about prompts and results, and they require expertise across a number of steps to obtain answers. The primary purpose of this study was to gain insight into the advantages of current, unmodified, state-of-the-art Large Multimodal Models (LMMs) by testing them on several prompts containing multiple spectra of varying difficulty, in order to evaluate the impact of training data, reasoning, and speed. Identification of an organic compound from a molecular formula and spectra using this readily available, easy-to-use software was found to be reproducible across three similar LMMs. To the best of the author's knowledge, this marks the first time that three GPT variants were each able to correctly identify the organic compound quinoline from a variety of spectroscopic images. The results were obtained using a two-step process consisting of (a) uploading high-resolution spectral images and (b) submitting a text prompt with the images that requested a compound determination. The main findings were that (1) four LMMs provided rational, step-by-step interpretations of the 1H-NMR, 13C-NMR, and three DEPT-NMR spectra from Prompt A; (2) three of these LMMs, led by a GPT-5 preview model, combined these interpretations into the correct chemical structure with Prompt A; and (3) two of these LMMs achieved the top score of 5/5 by also generating sequential explanations that followed the order of the provided spectra, along with most of the correct spectral and molecular formula explanations.
Supplementary materials
Supplementary Information: The supplementary file contains all prompts with spectra and generations for each of the 5 Large Multimodal Models.
High Resolution Image 186767: Full-sized image used in experiments (186767.png)
High Resolution Image 264653: Full-sized image used in experiments (264653.png)
High Resolution Image 274023_manuscript: Full-sized image used in manuscript (274023_manuscript.png)
High Resolution Image 274023: Full-sized image used in experiments (274023.png)
High Resolution Image 461190: Full-sized image used in experiments (461190.png)
High Resolution Image 528506: Full-sized image used in experiments (528506.png)
High Resolution Image 529852: Full-sized image used in experiments (529852.png)
High Resolution Image 687359: Full-sized image used in experiments (687359.png)
High Resolution Image 848904: Full-sized image used in experiments (848904.png)
High Resolution Image 909495: Full-sized image used in experiments (909495.png)
High Resolution Image 929369: Full-sized image used in experiments (929369.png)