Abstract
Per- and polyfluoroalkyl substances (PFAS) are of high concern, with calls to regulate them as a class. In 2021, the Organisation for Economic Co-operation and Development (OECD) revised the definition of PFAS to include any chemical containing at least one saturated CF2 or CF3 moiety. As a consequence, one of the largest open chemical collections, PubChem, with 115 million compounds, now contains over 7 million PFAS under this revised definition. This number is several orders of magnitude larger than previously established PFAS lists (typically thousands of entries) and poses a considerable challenge to researchers and computational workflows alike. This article describes a dynamic, openly accessible effort to navigate and explore the >7 million PFAS and >21 million fluorinated compounds (as of 17 June 2023) in PubChem by establishing the “PFAS and Fluorinated Compounds in PubChem” Classification Browser (or “PubChem PFAS Tree”). A total of 36,500 nodes support browsing of the content according to several categories, including classification, structural properties, regulatory status, and presence in existing PFAS suspect lists. Additional annotation and associated data can be used to create subsets of interest (and thus manageable suspect lists or databases) for a wide range of environmental, regulatory, exposomics and other applications.
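To make the structural criterion concrete, the following is a minimal sketch of how "at least one saturated CF2 or CF3 moiety" might be tested on a small, hand-built molecular graph. It is not the code behind the PubChem PFAS Tree. The `Atom` class, `make_molecule`, `is_pfas_carbon` and `is_pfas` are hypothetical names introduced only for illustration, and hydrogens are assumed to be listed explicitly. The exclusion of carbons carrying H, Cl, Br or I follows the wording of the OECD definition rather than anything stated in this abstract. A real workflow would more likely use a cheminformatics toolkit and substructure patterns.

```python
from dataclasses import dataclass, field

# Substituents that disqualify a carbon (assumption: taken from the OECD
# wording "without any H/Cl/Br/I atom attached to it").
EXCLUDED = {"H", "Cl", "Br", "I"}


@dataclass
class Atom:
    element: str
    # Each bond is a (neighbour_index, bond_order) pair.
    bonds: list = field(default_factory=list)


def make_molecule(elements, bonds):
    """Build a toy molecular graph from element symbols and (i, j, order) bonds."""
    atoms = [Atom(e) for e in elements]
    for i, j, order in bonds:
        atoms[i].bonds.append((j, order))
        atoms[j].bonds.append((i, order))
    return atoms


def is_pfas_carbon(atoms, idx):
    """True if atom idx is a saturated carbon bearing >= 2 F and no H/Cl/Br/I."""
    atom = atoms[idx]
    if atom.element != "C":
        return False
    # Saturated here means four single bonds (explicit hydrogens assumed).
    if len(atom.bonds) != 4 or any(order != 1 for _, order in atom.bonds):
        return False
    neighbours = [atoms[j].element for j, _ in atom.bonds]
    if any(e in EXCLUDED for e in neighbours):
        return False
    return neighbours.count("F") >= 2


def is_pfas(atoms):
    """True if any carbon in the molecule satisfies the CF2/CF3 criterion."""
    return any(is_pfas_carbon(atoms, i) for i in range(len(atoms)))


# Trifluoroacetic acid, CF3-COOH: contains a saturated CF3 group.
tfa = make_molecule(
    ["C", "F", "F", "F", "C", "O", "O", "H"],
    [(0, 1, 1), (0, 2, 1), (0, 3, 1), (0, 4, 1), (4, 5, 2), (4, 6, 1), (6, 7, 1)],
)
# Difluoromethane, CH2F2: two F atoms, but the carbon also carries H.
difluoromethane = make_molecule(
    ["C", "F", "F", "H", "H"],
    [(0, 1, 1), (0, 2, 1), (0, 3, 1), (0, 4, 1)],
)
# Tetrafluoroethylene, CF2=CF2: both carbons are unsaturated.
tfe = make_molecule(
    ["C", "C", "F", "F", "F", "F"],
    [(0, 1, 2), (0, 2, 1), (0, 3, 1), (1, 4, 1), (1, 5, 1)],
)

for name, mol in [("TFA", tfa), ("CH2F2", difluoromethane), ("TFE", tfe)]:
    print(name, is_pfas(mol))
```

In this sketch only trifluoroacetic acid is flagged. Difluoromethane fails because its carbon bears hydrogen, and tetrafluoroethylene fails because neither carbon is saturated. A criterion this permissive is what brings millions of PubChem entries into scope.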
Supplementary weblinks
Title
PFAS and Fluorinated Compounds in PubChem Tree
Description
Direct link to the resource described in this article
Title
PubChem PFAS Tree Documentation (PDF)
Description
A PDF of the documentation describing the PubChem PFAS Tree
Title
Code supporting the PubChem PFAS Tree
Description
The code supporting the PubChem PFAS Tree is here
Title
PERL code supporting the PubChem PFAS Tree
Description
The subfolder with PERL scripts to construct the PubChem PFAS Tree is here