Abstract
Identification is a major challenge in metabolomics because of the large structural diversity of metabolites. Tandem mass spectrometry (MS/MS) is a reference technology for studying the fragmentation of molecules and characterizing their structure. Recent instruments can fragment large numbers of compounds in a single acquisition. Searching for similarities within a collection of MS/MS spectra is a powerful approach to facilitate the identification of new metabolites. We propose an innovative de novo strategy for searching for exact fragmentation patterns within collections of MS/MS spectra. This approach is based on i) a new representation of spectra as graphs of m/z differences, and ii) an efficient frequent-subgraph mining algorithm. We demonstrate, both on a spectral database from standards and on acquisitions in biological matrices, that these new fragmentation patterns capture similarities that are not extracted by existing methods, and that they facilitate the structural interpretation of molecular network components and the elucidation of unknown spectra. The mineMS2 software is publicly available as an R package (https://github.com/odisce/mineMS2).
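To make the idea of representing a spectrum as a graph of m/z differences more concrete, here is a minimal Python sketch. It is not the mineMS2 implementation or its API (mineMS2 is an R package). The function names, the rounding of differences to two decimals, the `min_diff` threshold, and the toy m/z values are all assumptions made for illustration. The second function only counts difference labels shared across spectra, which is a much simpler stand-in for frequent-subgraph mining.

```python
from collections import Counter
from itertools import combinations


def mz_difference_graph(peaks, precursor_mz, decimals=2, min_diff=1.0):
    """Build a toy graph of m/z differences for one MS/MS spectrum.

    Nodes are the precursor and fragment m/z values. Each edge goes from a
    heavier peak to a lighter one and is labelled with their rounded m/z
    difference. Illustrative only: rounding and min_diff are assumptions.
    """
    nodes = sorted(set(peaks) | {precursor_mz}, reverse=True)
    edges = []
    for heavier, lighter in combinations(nodes, 2):
        label = round(heavier - lighter, decimals)
        if label >= min_diff:
            edges.append((heavier, lighter, label))
    return edges


def shared_difference_labels(spectra, min_support=2):
    """Count in how many spectra each difference label occurs.

    This is a simplification: it looks at single edge labels, not at
    connected subgraphs as a frequent-subgraph miner would.
    """
    counts = Counter()
    for precursor_mz, peaks in spectra:
        labels = {label for _, _, label in mz_difference_graph(peaks, precursor_mz)}
        counts.update(labels)
    return {label: n for label, n in counts.items() if n >= min_support}


# Two made-up spectra given as (precursor m/z, fragment m/z values).
toy_spectra = [
    (200.0, [182.0, 156.0]),
    (300.0, [282.0, 250.0]),
]
print(shared_difference_labels(toy_spectra))
```

In this toy input the two spectra share no fragment m/z value, yet both contain a difference of 18.0 between the precursor and a fragment. That is the kind of similarity a representation based on m/z differences can expose.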
Supplementary materials
Supplementary file 1: Notes about the mineMS2 algorithms; parameters used with mineMS2, MS2LDA, and GNPS; patterns explaining the GNPS similarities described in Hautbergue et al. (2017, 2019); and supplementary figures.
Supplementary file 2: Description of the ChemOnt concepts of the LIMS-DB dataset that are better explained by mineMS2 than by MS2LDA.
Supplementary file 3: Description of the ChemOnt concepts of the LIMS-DB dataset that are better explained by MS2LDA than by mineMS2.
Supplementary weblinks
mineMS2 software: Freely available R package, including example datasets and tutorials.