Abstract
In doping control, the screening of prohibited substances (PS) in horse biological samples by Gas Chromatography/Mass Spectrometry (GC/MS) and Liquid Chromatography/Mass Spectrometry (LC/MS) generates an enormous number of chromatograms. Reviewing these chromatograms to identify suspicious findings requires extensive manual effort. Recent advances in Artificial Intelligence (AI) allow images to be classified into different categories. This capability could be used for first-line analysis of chromatograms, which are usually displayed as images, by classifying them as "positive" (POS) or "negative" (NEG) with respect to the presence of PS.
This study explores the feasibility of using AI to perform initial chromatogram analysis, with the aim of improving the efficiency and accuracy of data vetting. A predictive model was developed using the image recognition tool in "Alteryx Designer", a data analytics software package, to analyse chromatograms generated from LC/MS analysis of horse urine. The model was trained on over 6000 chromatograms that had been manually classified as "POS" or "NEG". To evaluate its accuracy, around 700 manually classified chromatograms were analysed by the model, and the prediction accuracy was over 90 %.
The model was then applied to two of our in-house screening methods, each covering over 300 drug targets. It identified "SUS"/"POS" and "NEG" chromatograms with high accuracy and produced no false negative classifications. Two major challenges remain in applying the model to first-line analysis in regular testing. The first is analysis time: with the existing Alteryx workflow, analysing one batch of samples from one of our in-house screening methods on a standard office PC takes 3-5 hours. The second is the inflexibility of the data extraction workflow, which only works on analytical data generated by specific instruments and software. This limits its implementation in regular testing, which involves a large variety of instruments and processing software.
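The evaluation described above, in which model predictions are compared with manual classifications, can be illustrated with a minimal Python sketch. This is not the Alteryx workflow used in the study; the function name `evaluate_predictions` and the example labels are hypothetical, and the sketch only assumes that each chromatogram carries one manual label and one predicted label.

```python
from collections import Counter


def evaluate_predictions(manual, predicted, positive="POS"):
    """Compare predicted labels with manual labels for the same chromatograms.

    Returns the overall accuracy plus the counts of false negatives
    (manually positive, predicted otherwise) and false positives
    (manually not positive, predicted positive).
    """
    if len(manual) != len(predicted):
        raise ValueError("manual and predicted must have the same length")
    if not manual:
        raise ValueError("at least one labelled chromatogram is required")

    # Count every (manual label, predicted label) combination.
    pairs = Counter(zip(manual, predicted))

    correct = sum(n for (m, p), n in pairs.items() if m == p)
    false_negatives = sum(
        n for (m, p), n in pairs.items() if m == positive and p != positive
    )
    false_positives = sum(
        n for (m, p), n in pairs.items() if m != positive and p == positive
    )

    return {
        "accuracy": correct / len(manual),
        "false_negatives": false_negatives,
        "false_positives": false_positives,
    }


# Hypothetical labels for four chromatograms (illustration only).
manual_labels = ["POS", "NEG", "NEG", "POS"]
model_labels = ["POS", "NEG", "POS", "POS"]
result = evaluate_predictions(manual_labels, model_labels)
```

In a first-line screening setting, the false negative count is the figure to watch most closely, since a missed positive would never reach manual review, whereas a false positive only adds review effort.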