Lab: SHACL

Revision as of 10:24, 19 February 2023

Topics

  • Validating RDF graphs with SHACL
  • Running pySHACL

Useful materials

SHACL:

pySHACL:

Tasks

Task: Go to the interactive, online SHACL Playground. Cut-and-paste the Turtle triples below into the Data Graph text field.

@prefix ex: <http://example.org/> .
@prefix foaf: <http://xmlns.com/foaf/0.1/> .

ex:Paul_Manafort 
    a ex:PersonUnderInvestigation ;
    ex:hasBusinessPartner ex:Rick_Gates .

ex:Rick_Gates 
    a ex:PersonUnderInvestigation ;
    foaf:name 
        "Rick Gates" ,
        "Richard William Gates III"@en ;
    ex:chargedWith 
        "Foreign Lobbying"@en ,
        ex:MoneyLaundering ,
        ex:TaxEvasion .

The example is based on Exercises 1 and 2. Take some time to look at it in Turtle and also in JSON-LD, using the drop-down menu next to the Data Graph heading.

Task: Write Shapes Graphs in Turtle (recommended) or JSON-LD for each of the constraints below. Keep copies of your Shapes Graphs in a separate text editor and file. You will need them later. Each time you have entered a Shapes Graph into the text field, click Update to validate the contents of the Data Graph.

You can use the following prefixes:

@prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
@prefix xsd: <http://www.w3.org/2001/XMLSchema#> .
@prefix sh: <http://www.w3.org/ns/shacl#> .
@prefix foaf: <http://xmlns.com/foaf/0.1/> .
@prefix ex: <http://example.org/> .

  • Every person under investigation has exactly one name.
  • The object of an ex:chargedWith property must be a URI.
  • The object of an ex:chargedWith property must be an offense.
  • All person names must be language-tagged.

Change the data_graph to remove the detected errors.

Difficult:

  • If one person (under investigation) has another as business partner, the second person must have the first as business partner in return.

Task: Write a Python program using rdflib and pySHACL, which:

  1. parses the Turtle example above into a data_graph,
    • Tip: you can either save it to file, or parse directly from a string using graph.parse(data=turtle_data, format='ttl')
  2. parses the contents of a shape_graph you made in the previous task (for example checking that every person under investigation has exactly one name),
  3. uses pySHACL's validate method to apply the shape_graph constraints to the data_graph, and
  4. prints out the validation result (a boolean value, a results_graph, and a result_text).

Task: Add the Turtle triples below (from exercise 3-5) to your data_graph.

ex:investigation_162 a ex:Indictment ;
    ex:american "Yes" ;
    ex:cp_date "2018-02-23"^^xsd:date ;
    ex:cp_days 282 ;
    ex:indictment_days 166 ;
    ex:investigation ex:russia ;
    ex:investigation_days 659.0 ;
    # ex:investigation_end "None" ;
    ex:investigation_start "2017-05-17" ;
    ex:name ex:Rick_Gates ;
    ex:outcome ex:guilty-plea ;
    ex:overturned false ;
    ex:pardoned false ;
    ex:president "Donald Trump"@en .

Download the whole KG4NEWS graph (kg4news.ttl) that we used in the SPARQL lecture (S03) and parse it into the data graph. Re-run a selection of your shape_graph constraints on the larger graph.

Task: In some cases, the results_graph and result_text will report the same error many times, but for different nodes. Write a SPARQL query to print out each distinct sh:xxxMessage in the results_graph.

Task: Modify the above query so it prints out each sh:xxxMessage in the results_graph once, along with the number of times the message has been repeated in the results.


Task: Install pySHACL into your virtual environment:

pip install pyshacl

If you have more time

Task: Fix kg4news.txt (renamed to .ttl) so that:

  • Every kg:year value has rdf:type xsd:year .
  • xxx