Abstract
In mass spectrometry (MS)-based metabolomics, compound identification relies on Liquid Chromatography-MS (LC-MS) and Gas Chromatography-MS (GC-MS). The most popular and efficient approach is to compute similarity scores between experimental spectra and reference spectra. Among the various single and composite similarity measures, the Cosine Correlation is widely favored for its simplicity, efficiency, and effectiveness. Recently, the Shannon Entropy Correlation has shown superior performance over several other measures, including the Cosine Correlation, in LC-MS-based metabolomics, particularly in terms of receiver operating characteristic (ROC) curves and false discovery rates. However, previous comparisons did not consider the weight factor transformation, which is critical for achieving higher accuracy with the Cosine Correlation. In this study, we compared the Cosine Correlation and the Shannon Entropy Correlation with the weight factor transformation applied during preprocessing. Additionally, we developed a novel entropy correlation measure, the Tsallis Entropy Correlation, which offers greater versatility than the Shannon Entropy Correlation. Our results indicate that the weight factor transformation is essential for achieving higher accuracy in both LC-MS- and GC-MS-based compound identification. While the Tsallis Entropy Correlation outperforms the Shannon Entropy Correlation, it is also more computationally expensive. The Cosine Correlation, when combined with the weight factor transformation, achieves the highest accuracy and the lowest computational expense, demonstrating its robustness and efficiency in MS-based compound identification.
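For orientation, the measures named above can be sketched in a few lines of Python. This is a minimal illustration, not the implementation used in the study. It assumes that both spectra have already been aligned onto a shared m/z grid, that the weight factor transformation takes the commonly used form of intensity raised to a power `a` times m/z raised to a power `b`, and that the Shannon Entropy Correlation takes the form 1 - (2 S_AB - S_A - S_B) / ln 4 used in earlier LC-MS work. The function names and the default weights `a=0.5`, `b=1.0` are hypothetical placeholders. The Tsallis function shows only the standard Tsallis entropy; the Tsallis Entropy Correlation itself is derived in the Supplementary Information and is not reproduced here.

```python
import math


def weight_transform(mz, intensity, a=0.5, b=1.0):
    """Assumed weight factor transformation: intensity**a * mz**b.

    The defaults a=0.5 and b=1.0 are illustrative, not the study's values.
    """
    return [(i ** a) * (m ** b) for m, i in zip(mz, intensity)]


def cosine_correlation(x, y):
    """Cosine of the angle between two aligned intensity vectors."""
    dot = sum(p * q for p, q in zip(x, y))
    norm_x = math.sqrt(sum(p * p for p in x))
    norm_y = math.sqrt(sum(q * q for q in y))
    if norm_x == 0.0 or norm_y == 0.0:
        return 0.0
    return dot / (norm_x * norm_y)


def shannon_entropy(p):
    """Shannon entropy of intensities after normalizing them to sum to 1."""
    total = sum(p)
    return -sum((v / total) * math.log(v / total) for v in p if v > 0)


def tsallis_entropy(p, q):
    """Standard Tsallis entropy; approaches Shannon entropy as q -> 1."""
    total = sum(p)
    return (1.0 - sum((v / total) ** q for v in p if v > 0)) / (q - 1.0)


def shannon_entropy_correlation(x, y):
    """Assumed form: 1 - (2*S_AB - S_A - S_B) / ln(4), with a 1:1 mixture."""
    sum_x, sum_y = sum(x), sum(y)
    px = [v / sum_x for v in x]
    py = [v / sum_y for v in y]
    mixed = [(u + v) / 2.0 for u, v in zip(px, py)]
    excess = 2.0 * shannon_entropy(mixed) - shannon_entropy(px) - shannon_entropy(py)
    return 1.0 - excess / math.log(4.0)


# Hypothetical aligned spectra on a shared m/z grid (made-up numbers).
mz = [50.0, 75.0, 100.0, 150.0]
query = [10.0, 0.0, 80.0, 10.0]
reference = [12.0, 5.0, 70.0, 13.0]

weighted_query = weight_transform(mz, query)
weighted_reference = weight_transform(mz, reference)
score_cosine = cosine_correlation(weighted_query, weighted_reference)
score_entropy = shannon_entropy_correlation(weighted_query, weighted_reference)
```

Under these assumptions, the transformation is applied to both spectra before either score is computed, which mirrors the preprocessing order described in the abstract.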
Supplementary materials
Title
Supplementary Information
Description
This file includes (1) a derivation of the Tsallis Entropy Correlation and (2) figures and tables supplementary to the main manuscript.