Lab: RDF programming with RDFlib
Latest revision as of 13:20, 28 January 2024
Topics
- RDF graph programming with RDFlib
Useful materials
RDFLib:
- Navigating Graphs: https://rdflib.readthedocs.io/en/stable/intro_to_graphs.html
- Serialising and parsing: https://rdflib.readthedocs.io/en/stable/intro_to_parsing.html
Lab Presentations:
- Lab 1 - RDF Presentation: https://docs.google.com/presentation/d/1blXlTTTsL8jqeV5sRLhQuZ-nssNZqH_bjSgjj-kougE/edit?usp=sharing
- Lab 2 - More RDF Presentation: https://docs.google.com/presentation/d/17yuNqn66fhEIHPE65PyFC0H359q1k6CquE6ojDV0NB4/edit?usp=sharing
RDFlib classes/interfaces:
- from rdflib import Graph, Namespace, URIRef, BNode, Literal
- from rdflib.namespace import RDF, FOAF, XSD
- from rdflib.collection import Collection
RDFlib methods:
- Graph: add(), remove(), triples(), serialize(), parse(), bind()
Tasks
Continue with the graph you created in Exercise 1.
Task: Continue to extend your graph:
- Michael Cohen was Donald Trump's attorney.
- He pleaded guilty to lying to Congress.
- Michael Flynn was adviser to Donald Trump.
- He pleaded guilty to lying to the FBI.
- He negotiated a plea agreement.
If you want, you can try to use properties and types from standard vocabularies like FOAF (friend-of-a-friend) and DC (Dublin Core), but this is something we will look at in later exercises.
Task: According to this FRONTLINE article, the lies told by Gates, Cohen and Flynn differed from one another and are described at different levels of detail.
- How can you represent "different instances of lying" as triples?
- How can you modify your knowledge graph to account for this?
Task: It is possible to solve the task above without blank (or anonymous) nodes, but to do so you need to create a URI for each "instance of lying". This is a situation where blank nodes may be more suitable. Change your graph so that it represents instances of lying as blank nodes.
Task: Save (serialize) your graph to a Turtle file. Add a few triples to the Turtle file with more information about Donald Trump. For example, you can add that Donald Trump is married to Melania and has several children. You can also use blank nodes to represent two of Trump's addresses when he was president:
- The White House, 1600 Pennsylvania Ave., NW Washington, DC 20500, United States, phone: 1-202-456-1414
- Mar-a-Lago Club, 1100 S Ocean Blvd, Palm Beach, FL 33480, United States
Visualise the result if you want. Read (parse) the Turtle file back into a Python program, and check that the new triples are there.
If you have more time...
Task: Write a method (function) that starts with Donald Trump and prints out the graph depth-first to show how the other graph nodes are connected to him. An excerpt of the output could be:
ex:Donald_Trump
    <== ex:campaignManager ex:Paul_Manafort
        ==> ex:convictedFor ex:BankAndTaxFraud
    ...
    <== ex:attorneyFor ex:Michael_Cohen
        ==> ex:pleadedGuilty ex:LyingToCongress
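Here, the <== and ==> arrows show the direction in which a property is followed: ==> means the node on the line above is the subject of the triple, and <== means the property is followed in reverse, so the node on the line above is the object. The arrows are produced with print() statements in Python, not from inside rdflib.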
Note: Because you must follow triples in both the subject-to-object and the object-to-subject direction, you must keep a list of already visited nodes and never return to a previously visited one.
Note: If you want a neat solution, it may be best to combine two graph traversals: first traverse the model breadth-first to create a new tree-shaped model, and then traverse the tree-shaped model depth-first to print it out with indentation. (The point of the first breadth-first step is to find the shortest path to each node.)