Experience, knowledge and understanding of validating graph data with SHACL

Sam Hardy | Monday, March 12th, 2018

Our team used SHACL to validate RDF graph data while evaluating a Schema Analyser service, first implemented at HMRC:
Discovery:
- Understood requirements, user journeys and user personas; recommended that Alpha concentrate on graph database technologies to map data relationships.
Alpha:
- During the investigation into an RDF triple store, wrote a validation service to validate graph data based on key user journeys, including integrations with source code (Git), deployable artifacts (Artifactory), requirements (Jira) and documentation (Confluence); all called through Jenkins pipelines and hosted on AWS.
- Replicated SQL constraints from a PostgreSQL database using SHACL constraints, including cardinality (maxCount/minCount) and string-based (maxLength/minLength) constraints; providing a mechanism for validating data migrated between the SQL database and the RDF graph.
- Added shape-based constraints, including node and property shapes, as well as complex restrictions built by combining constraints (e.g. 'node and property'), to investigate validating relations between data items.
- Conducted a similar exercise with Cypher (declarative constraints and imperative triggers) on Neo4j, allowing a direct comparison of graph technologies.
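
The kinds of SHACL constraints described above can be sketched as shapes in Turtle. This is a minimal illustrative example only; the `ex:Person` class and the property names are assumptions standing in for the actual HMRC data model, which is not shown in the post:

```turtle
@prefix sh:  <http://www.w3.org/ns/shacl#> .
@prefix ex:  <http://example.org/> .

# Hypothetical shape; class and property names are illustrative assumptions.
ex:PersonShape
    a sh:NodeShape ;
    sh:targetClass ex:Person ;
    # Cardinality constraints, mirroring a NOT NULL, single-value SQL column
    sh:property [
        sh:path ex:nationalInsuranceNumber ;
        sh:minCount 1 ;
        sh:maxCount 1 ;
        # String-based constraints, mirroring a fixed-length VARCHAR column
        sh:minLength 9 ;
        sh:maxLength 9 ;
    ] ;
    # Combining constraints with sh:and ('node and property') to restrict
    # the relation between a person and the node it points to
    sh:property [
        sh:path ex:employer ;
        sh:and (
            [ sh:class ex:Organisation ]
            [ sh:node ex:OrganisationShape ]
        ) ;
    ] .

ex:OrganisationShape
    a sh:NodeShape ;
    sh:property [
        sh:path ex:name ;
        sh:minCount 1 ;
    ] .
```

A SHACL processor validating migrated data against shapes like these would report a violation for, say, a person node missing a national insurance number, giving a like-for-like check against the original SQL constraints.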

Filed under: HMRC, Solutions