Lab: RDFS: Difference between revisions

From info216
Revision as of 15:15, 18 February 2023

Topics

  • Simple RDFS statements/triples
  • Basic RDFS programming in RDFlib
  • Basic RDFS reasoning with OWL-RL

Useful materials

rdflib classes/interfaces and attributes/functions:

  • RDF (RDF.type)
  • RDFS (RDFS.domain, RDFS.range, RDFS.subClassOf, RDFS.subPropertyOf)

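OWL-RL:

  • OWL-RL at PyPI (https://pypi.org/project/owlrl/)
  • OWL-RL Documentation (https://owl-rl.readthedocs.io/en/latest/)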

OWL-RL classes/interfaces:

  • RDFSClosure, RDFS_Semantics

Tasks

Task: Install OWL-RL into your virtual environment:

pip install owlrl

Task: We will use simple examples from the Mueller investigation RDF graph you made in Exercise 1.

Create a new rdflib graph and add in "plain RDF" like you did in Exercise 1:

  • Rick Gates was charged with money laundering and tax evasion.

Use RDFS to add these rules as triples:

  • When one thing is charged with another thing,
    • the first thing is a person under investigation and
    • the second thing is an offense.

You can add triples using plain g.add((s, p, o)) calls on your rdflib graph, or using INSERT DATA {...} SPARQL updates. If you use SPARQL updates, you can define a namespace dictionary like this:

from rdflib import Namespace
from rdflib.namespace import FOAF, RDF, RDFS

EX = Namespace('http://example.org#')
NS = {
    'ex': EX,
    'rdf': RDF,
    'rdfs': RDFS,
    'foaf': FOAF,
}

You can then give NS as an optional argument to graph.update() - or to graph.query() - like this:

g.update("""
    # when you provide an initNs-argument, you do not have 
    # to define PREFIX-es as part of the update (or query)
    INSERT DATA {
        # your SPARQL update goes here
    }
""", initNs=NS)

Task:

  • Write a SPARQL query that checks the RDF type(s) of Rick Gates in your RDF graph.
  • Write a similar SPARQL query that checks the RDF type(s) of money laundering in your RDF graph.
  • Write a small function that computes the RDFS closure on your graph.
  • Re-run the SPARQL queries to check the types of Rick Gates and of money laundering again.

You can compute the RDFS closure on a graph like this:

import owlrl

# The three False arguments switch off axioms, daxioms and rdfs
engine = owlrl.RDFSClosure.RDFS_Semantics(g, False, False, False)
engine.closure()
engine.flush_stored_triples()

Task: Use RDFS to add this rule as a triple:

  • A person under investigation is a FOAF person.
  • Like earlier, check the RDF types of Rick Gates before and after running RDFS reasoning.

Task: Add in "plain RDF" as in Exercise 1:

  • Paul Manafort was convicted for tax evasion.

Use RDFS to add these rules as triples:

  • When one thing is convicted for another thing,
    • the first thing is also charged with the second thing.

Note: we are dealing with a "timeless" graph here, one that represents facts that have held at some point in time, but not necessarily at the same time.

  • What are the RDF types of Paul Manafort and of tax evasion before and after RDFS reasoning?
  • Do the RDFS domain and range of the convicted for property change?

If you have more time...