Most organizations define data quality requirements for their data assets. Those that don’t probably should. Examples include:
- term names with length or formatting requirements
- rules for identifiers that depend on hierarchy level
- fields with required or unique values.
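As a rough illustration, the three kinds of rules listed above can be sketched as ad hoc checks in plain Python. This is not SHACL, which expresses such rules declaratively. The record fields, the 40-character limit, the capitalization rule, and the identifier patterns are all assumptions made up for this example.

```python
import re

# Hypothetical taxonomy records; the field names are illustrative only.
terms = [
    {"id": "T1", "label": "Semantic Web", "level": 1},
    {"id": "T1.1", "label": "shacl", "level": 2},
    {"id": "T1.1", "label": "", "level": 2},
]

MAX_LABEL_LENGTH = 40  # assumed length limit

# Assumed identifier formats, one per hierarchy level.
ID_PATTERNS = {
    1: re.compile(r"T\d+"),
    2: re.compile(r"T\d+\.\d+"),
}


def check_terms(records):
    """Return a list of (term id, message) pairs, one per rule violation."""
    problems = []
    seen_ids = set()
    for term in records:
        label = term.get("label", "")
        # Required value, then length and formatting requirements.
        if not label:
            problems.append((term["id"], "label is required"))
        elif len(label) > MAX_LABEL_LENGTH or not label[0].isupper():
            problems.append((term["id"], "label breaks length or capitalization rule"))
        # Identifier rule that depends on the hierarchy level.
        pattern = ID_PATTERNS.get(term["level"])
        if pattern is None or not pattern.fullmatch(term["id"]):
            problems.append((term["id"], "id does not match the pattern for its level"))
        # Unique value.
        if term["id"] in seen_ids:
            problems.append((term["id"], "id must be unique"))
        seen_ids.add(term["id"])
    return problems


problems = check_terms(terms)
for term_id, message in problems:
    print(f"{term_id}: {message}")
```

Checks written this way tend to be scattered across scripts and are hard to share between tools, which is part of the case for a standard rule language.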
Within ontologies, rules often become even richer and more comprehensive. Why does it matter to have rules and to ensure that data adheres to them? First, rules encode and enforce the best practices that data creators should follow. Second, those best practices ensure the quality of the applications that use the data, whether search, content classification, or website navigation.
For several years, organizations have used W3C standards: SKOS for taxonomies, RDF/OWL for ontologies. Using standards enables interoperability and re-use of assets and skills. Until now, there wasn’t a standard for defining rules or for checking data conformance. Organizations used proprietary approaches or simply stated their rules on paper.
Enter SHACL (Shapes Constraint Language), the new W3C standard that addresses this problem. SHACL offers rich, flexible notations for expressing practically any rule. This tutorial will:
- introduce SHACL and the motivations behind its creation
- provide detailed examples of its use to ensure data quality
- provide hands-on practice in defining and testing rules
- discuss SHACL’s support within available tools
CVs of the organizers
Irene Polikoff has more than two decades of experience in software development, management, consulting and strategic planning. Since co-founding TopQuadrant in 2001, Irene has been involved in more than a dozen projects in government and commercial sectors. She has written strategy papers, trained customers on the use of Semantic Web standards, developed ontology models, designed solution architectures and defined deployment processes and guidance.
Eric Freese is an industry veteran with extensive experience in IT, publishing, project management, product management, software development, consulting and training. He has worked in a wide range of industries, from technical publications and defense, to legal, to education and publishing. He is an internationally recognized subject matter expert in XML-related strategies for knowledge management and representation, digital publishing, and content management. As a Sr. Semantic Solutions Architect at TopQuadrant, Eric handles solution requirements, architecture analysis, design, semantic model-driven application development and training, ensuring that customer requirements are met when implementing the TopBraid platform.