Lab: SHACL: Difference between revisions
Revision as of 18:51, 18 February 2023
Topics
- Validating RDF graphs with SHACL
- Running pySHACL
Useful materials
SHACL:
- Section 7.4 Expectation in RDF in Allemang, Hendler & Gandon's textbook (Semantic Web for the Working Ontologist)
- Chapter 5 SHACL in Validating RDF (available online)
- Interactive, online SHACL Playground
pySHACL:
- pySHACL at PyPI.org. After installation, go straight to "Python Module Use".
Tasks
Task: Go to the interactive, online SHACL Playground. Cut-and-paste the Turtle triples below into the Data Graph text field.
@prefix ex: <http://example.org/> .
@prefix foaf: <http://xmlns.com/foaf/0.1/> .
ex:Paul_Manafort
    a ex:PersonUnderInvestigation ;
    ex:hasBusinessPartner ex:Rick_Gates .
ex:Rick_Gates
    a ex:PersonUnderInvestigation ;
    foaf:name
        "Rick Gates" ,
        "Richard William Gates III"@en ;
    ex:chargedWith
        "Foreign Lobbying"@en ,
        ex:MoneyLaundering ,
        ex:TaxEvasion .
The example is based on Exercises 1 and 2. Take some time to look at it in Turtle and also in JSON-LD, using the drop-down menu next to the Data Graph heading.
Task: Write Shapes Graphs in Turtle (recommended) or JSON-LD for each of the constraints below. Keep copies of your Shapes Graphs in a separate text editor and file; you will need them later. Each time you have entered a Shapes Graph into the text field, click Update to validate the contents of the Data Graph.
You can use the following prefixes:
@prefix ex: <http://example.org/> .
@prefix foaf: <http://xmlns.com/foaf/0.1/> .
@prefix sh: <http://www.w3.org/ns/shacl#> .
- Every person under investigation has exactly one name.
- All person names must be language-tagged.
- The value of an ex:chargedWith property must be a URI.
Difficult:
- If one person (under investigation) has another as business partner, the second person must have the first as business partner in return.
Task: Write a Python program using rdflib and pySHACL, which:
- parses the Turtle example above into a data_graph
- Tip: you can either save it to file, or parse directly from a string using graph.parse(data=turtle_data, format='ttl')
- parses the contents of a shape_graph you made in the previous task (for example checking that every person under investigation has exactly one name),
- uses pySHACL's validate method to apply the shape_graph constraints to the data_graph, and
- prints out the validation result (a boolean value, a results_graph, and a result_text).
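Task: Add the Turtle triples below (from exercises 3-5) to your data_graph. The triples use the xsd: prefix, so declare it (@prefix xsd: <http://www.w3.org/2001/XMLSchema#> .) together with ex: before parsing.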
ex:investigation_162 a ex:Indictment ;
    ex:american "Yes" ;
    ex:cp_date "2018-02-23"^^xsd:date ;
    ex:cp_days 282 ;
    ex:indictment_days 166 ;
    ex:investigation ex:russia ;
    ex:investigation_days 659.0 ;
    # ex:investigation_end "None" ;
    ex:investigation_start "2017-05-17" ;
    ex:name ex:Rick_Gates ;
    ex:outcome ex:guilty-plea ;
    ex:overturned false ;
    ex:pardoned false ;
    ex:president "Donald Trump"@en .
Download the whole KG4NEWS graph (kg4news.ttl) we used in the SPARQL lecture (S03) and parse it into the data graph. Re-run a selection of your shape_graph constraints on the larger graph.
Task: In some cases, the results_graph and result_text will report the same error many times, but for different nodes. Write a SPARQL query to print out each distinct sh:xxxMessage in the results_graph.
Task: Modify the above query so it prints out each sh:xxxMessage in the results_graph once, along with the number of times the message has been repeated in the results.
Task: Install pySHACL into your virtual environment:
pip install pyshacl
If you have more time
Task: Fix kg4news.txt (renamed to .ttl) so that:
- Every kg:year value is a literal with datatype xsd:gYear (XML Schema defines xsd:gYear, not xsd:year).
- xxx