Abstract
The nanosafety domain has seen significant advancements in data generation and sharing, yet challenges remain in ensuring data interoperability and reuse. This article focuses on developing a semantic interoperability framework for nanosafety data to maximize the FAIRness (Findability, Accessibility, Interoperability, and Reusability) of existing and new datasets. The approach centers on creating the NanoLinks semantic model, which unifies the representation of diverse data modalities, such as cytotoxicity, transcriptomics, and physicochemical data. By leveraging established ontologies and vocabularies such as the BioAssay Ontology (BAO), the NanoParticle Ontology (NPO), the Data Catalog Vocabulary (DCAT), and the PROV Ontology (PROV-O), NanoLinks facilitates the conversion of semi-structured data into Resource Description Framework (RDF) format using the RDF Mapping Language (RML). This transformation produces machine-readable, interoperable datasets. Five datasets from the literature, spanning nanomaterial characteristics and biological assay data, were selected by the NanoSolveIT EU project partners for FAIRification. These datasets were converted into RDF format, hosted on Zenodo under a CC-BY 4.0 license, and integrated into a knowledge graph, NanoLinks-KG, following linked-data best practices. The knowledge graph was validated for consistency and adherence to the semantic model using Shape Expressions (ShEx). The presented applications of this graph showcase the potential of querying interconnected datasets to derive insights and to support integration with external resources such as AOP-Wiki and the NanoCommons knowledge base. One usage example is a cross-dataset comparison of dose-response curves for zinc oxide nanomaterials. The results demonstrate the successful application of semantic modeling and linked-data knowledge graphs to convert and integrate diverse nanosafety datasets, enhancing their interoperability and promoting reuse.
The developed framework advances the state of data sharing in the nanosafety community and demonstrates the potential of semantic technologies in facilitating comprehensive data analysis and novel discoveries in the field.
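To make the conversion step concrete, the following is a minimal sketch of how one semi-structured measurement record could be turned into RDF triples (serialized as N-Triples). It is not the RML mapping used in this work: plain Python stands in for the mapping engine, and the namespace `https://example.org/nanolinks/`, the property names (`hasMaterial`, `label`, `dose`, `viability`), and the sample record are all hypothetical placeholders rather than terms from BAO, NPO, DCAT, or PROV-O.

```python
from urllib.parse import quote

# Hypothetical namespace and property names, for illustration only.
# The actual NanoLinks model reuses terms from established ontologies.
EX = "https://example.org/nanolinks/"
XSD = "http://www.w3.org/2001/XMLSchema#"


def iri(local_name):
    """Wrap a local name in the example namespace as an N-Triples IRI."""
    return f"<{EX}{quote(local_name, safe='/')}>"


def literal(value, datatype=""):
    """Serialize a value as an N-Triples literal.

    Only backslashes and double quotes are escaped here, which is
    enough for this sketch but not for arbitrary text.
    """
    text = str(value).replace("\\", "\\\\").replace('"', '\\"')
    if datatype:
        return f'"{text}"^^<{XSD}{datatype}>'
    return f'"{text}"'


def row_to_triples(row):
    """Map one tabular measurement record to a list of N-Triples lines."""
    measurement = iri(f"measurement/{row['id']}")
    material = iri(f"material/{row['material']}")
    return [
        f"{measurement} {iri('hasMaterial')} {material} .",
        f"{material} {iri('label')} {literal(row['material'])} .",
        f"{measurement} {iri('dose')} {literal(row['dose'], 'double')} .",
        f"{measurement} {iri('viability')} {literal(row['viability'], 'double')} .",
    ]


# Invented example record, not taken from any of the five datasets.
row = {"id": "m1", "material": "ZnO", "dose": 10.0, "viability": 82.5}
triples = row_to_triples(row)
print("\n".join(triples))
```

Because every measurement points to a shared material IRI, records from different source files that mention the same material end up connected in the graph, which is the property that cross-dataset queries rely on.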
Supplementary materials
SPARQL queries used to summarize the content of the RDFied datasets
Results of a federated SPARQL query integrating NanoLinks-KG nanomaterial types with adverse outcomes from AOP-Wiki