
Development of a Semantic database for the MEDIRAD Image and Radiation Dose BioBank (IRDBB)

Members

Former members

  • Marine Brenet - Software Engineer

General purpose

This work addresses the development of a computer system called Image and Radiation Dose BioBank (IRDBB), designed to manage image and dosimetric data in an integrated way. This development exemplifies the strategy promoted by our team for implementing imaging biobanks in the future. This strategy emphasizes adherence to the F.A.I.R. principles, i.e. making sure that the shared data is Findable, Accessible, Interoperable and Reusable. It recommends using semantic web technologies (ontologies) to define shared information precisely, and relying on standards (among which the DICOM Standard).

Our contribution to the IRDBB system is a semantic database implemented as a Resource Description Framework (RDF) graph aligned with an application ontology called OntoMEDIRAD, which specifies the semantics of all information within this database.

The IRDBB system was designed to fulfil the needs of the researchers involved in the MEDIRAD project. Both the IRDBB system and the OntoMEDIRAD ontology were developed with extensibility and reusability in mind, so that they can serve similar projects. The choice of an ontology-based approach ultimately aims to facilitate access to MEDIRAD research data for a wide community of researchers interested in low-dose research, e.g. via federated systems.

Overall IRDBB architecture

The overall architecture of the IRDBB system is shown in Fig. 1.

 Figure 1

The major components are:

  • a component called IRDBB_UI, a web server managing the user interface, developed by the b<>com Technology Institute
  • a component called KHEOPS, developed in Geneva by Osman Ratib's group at ITMI, managing the DICOM data (based on the DCM4CHE software)
  • a component called FHIR repository, also developed by b<>com, managing all non-DICOM files
  • a component called Semantic Translator, providing a set of services to populate and query the semantic database
  • a Stardog triple store, supporting the semantic database
  • a component called SPARKLIS Portal, extending the IRDBB_UI to assist the users in building SPARQL queries
  • a component called Keycloak, also provided by ITMI and b<>com, providing a Single Sign-On mechanism for access control.
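
As an illustration of the split between DICOM and non-DICOM storage, the following minimal Python sketch names the component that would store an incoming file. The function names are hypothetical and this is not the IRDBB code; the only standard fact used is that a DICOM Part 10 file starts with a 128-byte preamble followed by the 4-byte prefix "DICM".

```python
# Hypothetical sketch: the component names come from the list above, but the
# routing logic is an illustration only, not the actual IRDBB implementation.

DICOM_PREAMBLE_LENGTH = 128  # DICOM Part 10 files start with a 128-byte preamble
DICOM_MAGIC = b"DICM"        # followed by the 4-byte prefix "DICM"


def is_dicom_part10(data: bytes) -> bool:
    """Return True if the bytes carry the DICOM Part 10 prefix."""
    start = DICOM_PREAMBLE_LENGTH
    return data[start:start + len(DICOM_MAGIC)] == DICOM_MAGIC


def route_file(data: bytes) -> str:
    """Name the component that would store the file."""
    return "KHEOPS" if is_dicom_part10(data) else "FHIR repository"


if __name__ == "__main__":
    dicom_like = bytes(128) + b"DICM" + bytes(16)
    print(route_file(dicom_like))        # KHEOPS
    print(route_file(b"<descriptor/>"))  # FHIR repository
```

In both cases the Semantic Translator would still be called to describe the file in the semantic database; the sketch only covers where the file itself is stored.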

OntoMEDIRAD ontology

This ontology was developed iteratively between 2017 and 2020. It aims at addressing the needs expressed by the MEDIRAD participants in their answers to the questionnaire sent in October 2017 (User needs and competency questions concerning the IRDBB repository). The ontology is organized as a set of files represented in OWL, the Web Ontology Language (Fig. 2).

 Figure 2

The ontology was designed as an application ontology gathering all entities and relationships involved. The general modelling approach was a realist one, i.e. trying to refer to entities existing in reality. We reused existing ontological resources as much as possible. We therefore adopted a modular organization, in which the root application ontology (called OntoMEDIRAD) imports several extracts of existing ontologies. These extracts (e.g. from the Foundational Model of Anatomy, the Units Ontology (UO) and the Phenotype and Trait Ontology (PATO)) were generated using the OntoFox tool, based on the MIREOT model. The overall integration of these disparate ingredients relied on the common philosophical ground provided by the Basic Formal Ontology (BFO version 2).

Fig. 3 shows an extract of the OntoMEDIRAD ontology.

 Figure 3

The OntoMEDIRAD ontology can be freely downloaded and reused: OntoMEDIRAD Files. The paper to cite is the AMIA 2020 paper (see below).

Semantic database

The Semantic database is an RDF graph containing RDF assertions that document the nature and provenance of all the data shared in the MEDIRAD IRDBB repository, i.e. DICOM or non-DICOM data.

The DICOM data concern:

  • image data such as CT images, that correspond directly to irradiation events
  • image data such as SPECT or PET images, that may either correspond to irradiation events (due to the injection of a radiopharmaceutical) with some diagnostic goal, or correspond to a strategy to estimate the biodistribution of the radiopharmaceutical used in an internal radiotherapy (e.g. iodine-131 treatment of thyroid cancer)
  • structured reports such as CT Radiation Dose Structured Reports
  • other structured reports, e.g. implementing e-CRF documenting the internal radiotherapy treatments in WP3.

The non-DICOM data concern principally:

  • voxelized dose maps calculated by Monte Carlo simulation
  • segmentation of organs and tissues of interest.

The Semantic database is populated by the Semantic Translator; this processing is completed during the import of the data files into the IRDBB system. The Semantic database is supported by the Stardog triple store.
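
To give an idea of what such RDF assertions can look like, here is a minimal Python sketch that turns a few DICOM attributes into N-Triples lines. The namespaces, the class name `ImageDataset` and the function names are placeholders chosen for illustration; they are not the OntoMEDIRAD IRIs, and the real graph is considerably richer.

```python
# Hypothetical namespaces: placeholders, not the OntoMEDIRAD IRIs.
ONTO = "http://example.org/ontology#"
DATA = "http://example.org/data/"
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"


def escape_literal(value: str) -> str:
    """Escape backslashes, quotes and line breaks for an N-Triples literal."""
    return (value.replace("\\", "\\\\")
                 .replace('"', '\\"')
                 .replace("\n", "\\n")
                 .replace("\r", "\\r"))


def dicom_metadata_to_ntriples(meta: dict) -> list:
    """Describe one DICOM series as a list of N-Triples lines."""
    subject = f"<{DATA}series/{meta['SeriesInstanceUID']}>"
    # The class name ImageDataset is an assumption made for this sketch.
    triples = [f"{subject} <{RDF_TYPE}> <{ONTO}ImageDataset> ."]
    for keyword in ("Modality", "StudyInstanceUID"):
        if keyword in meta:
            literal = escape_literal(str(meta[keyword]))
            triples.append(f'{subject} <{ONTO}{keyword}> "{literal}" .')
    return triples


if __name__ == "__main__":
    example = {"SeriesInstanceUID": "1.2.3", "Modality": "CT"}
    for line in dicom_metadata_to_ntriples(example):
        print(line)
```

Plain string formatting is used here to keep the sketch free of dependencies; an RDF library would normally handle serialization and escaping.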

The Semantic database can be queried through the IRDBB_UI web interface. Two ways are proposed:

  • using predefined SPARQL queries
  • using the Sparklis tool, which allows end-users to freely navigate in the RDF graph and build their own SPARQL queries. The Sparklis tool was provided free of charge by Sébastien Ferré (University of Rennes 1).
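
The notion of a predefined SPARQL query can be sketched as a named query template whose parameters are filled in at request time. The Python example below is only an illustration under assumptions: the prefix, the class and property names and the query name `datasets_by_modality` are hypothetical, and the actual predefined queries are those described in the technical report listed below.

```python
from string import Template

# Hypothetical prefix, class and property names; not the OntoMEDIRAD vocabulary.
PREDEFINED_QUERIES = {
    "datasets_by_modality": Template(
        "PREFIX ex: <http://example.org/ontology#>\n"
        "SELECT ?dataset WHERE {\n"
        "  ?dataset a ex:ImageDataset ;\n"
        "           ex:Modality \"$modality\" .\n"
        "}"
    ),
}


def escape_sparql_string(value: str) -> str:
    """Escape characters that would break a double-quoted SPARQL string."""
    return (value.replace("\\", "\\\\")
                 .replace('"', '\\"')
                 .replace("\n", "\\n")
                 .replace("\r", "\\r"))


def build_query(name: str, **params: str) -> str:
    """Fill the named query template with escaped parameter values."""
    template = PREDEFINED_QUERIES[name]
    safe = {key: escape_sparql_string(value) for key, value in params.items()}
    return template.substitute(safe)


if __name__ == "__main__":
    print(build_query("datasets_by_modality", modality="CT"))
```

Escaping the parameter values before substitution keeps a user-supplied value from closing the string literal and altering the query.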

Semantic Translator

This software was designed by Marine Brenet. It is implemented as a set of services, called by IRDBB_UI or KHEOPS. The main services concern the creation of the RDF assertions describing the nature and provenance of the DICOM and non-DICOM data, and the management of predefined SPARQL queries. For DICOM data, the Semantic Translator translates the key DICOM metadata into RDF. For non-DICOM data, the relevant provenance metadata are provided in an XML file (non-DICOM File set descriptor) that is part of the non-DICOM File set to be imported.
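
The role of the non-DICOM File set descriptor can be illustrated with a short Python sketch that reads provenance metadata from an XML file. The element and attribute names below are invented for the example; the actual descriptor schema is not reproduced here.

```python
import xml.etree.ElementTree as ET

# Hypothetical descriptor: element and attribute names are invented for
# illustration and do not reproduce the actual descriptor schema.
EXAMPLE_DESCRIPTOR = """\
<NonDicomFileSet>
  <File name="dose_map.nii" kind="VoxelizedDoseMap">
    <DerivedFrom seriesInstanceUID="1.2.3"/>
  </File>
  <File name="liver_mask.nii" kind="Segmentation">
    <DerivedFrom seriesInstanceUID="1.2.3"/>
  </File>
</NonDicomFileSet>
"""


def read_provenance(xml_text: str) -> list:
    """Return one provenance record (a dict) per file listed in the descriptor."""
    root = ET.fromstring(xml_text)
    records = []
    for file_el in root.findall("File"):
        sources = [el.get("seriesInstanceUID")
                   for el in file_el.findall("DerivedFrom")]
        records.append({
            "name": file_el.get("name"),
            "kind": file_el.get("kind"),
            "derived_from": sources,
        })
    return records


if __name__ == "__main__":
    for record in read_provenance(EXAMPLE_DESCRIPTOR):
        print(record)
```

Each record could then be turned into RDF assertions linking the non-DICOM file to the DICOM series it was derived from.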

Collaborations

Direct collaborators in the development of IRDBB

  • Guillaume Pasquier, b<>com, Rennes
  • Joël Spaltenstein, Institute of Translational Molecular Imaging (ITMI), Genève
  • Nicolas Van Dooren, ITMI, Genève
  • Osman Ratib, ITMI, Genève

Other collaborators

  • John Stratakis, University of Crete
  • John Damilakis, University of Crete
  • Manuel Bardiès, CRCT Inserm, Toulouse
  • Alex Vergara Gil, CRCT Inserm, Toulouse

Former collaborators in the development of IRDBB

  • Cédric Moubri-Tournes, b<>com, Rennes
  • Eric Guiffard, b<>com, Rennes

Technical reports

  1. Documentation of the ontology of the IRDBB semantic repository, Version 1.3, MEDIRAD Technical report, MS8 Milestone, 2020. pdf
  2. Documentation of the Semantic Translator software, Version 1, MEDIRAD Technical report, 2020. pdf
  3. Documentation of predefined SPARQL queries, Version 1, MEDIRAD Technical report, 2020. pdf

Publications

  1. Spaltenstein J, Roduit N, van Dooren N, Pasquier G, Brenet M, Gibaud B, Mildenberger P and Ratib O. A multicentric IT platform for storage and sharing of imaging-based radiation dosimetric data. CARS 2020 Munich (Germany). paper
  2. Gibaud B, Brenet M, Pasquier G, Vergara Gil A, Bardiès M, Stratakis J, Damilakis J, van Dooren N, Spaltenstein J, and Ratib O. A semantic database for integrated management of image and dosimetric data in low radiation dose research in medical imaging. American Medical Informatics Association (AMIA) Conference, November 2020, Chicago (USA), 492-501. paper slides

Funding

This work was supported by the European Commission as part of the MEDIRAD project (Number 755523) in the Horizon 2020 Program (EURATOM NFRP-2016-2017).
