Lab: SPARQL Programming: Difference between revisions
Revision as of 20:03, 30 January 2023
Topics
SPARQL programming in Python:
- with rdflib: to manage an rdflib Graph internally in a program
- with SPARQLWrapper and Blazegraph: to manage an RDF graph stored externally in Blazegraph (on your own local machine or on the shared online server)
Motivation: Last week we entered SPARQL queries and updates manually from the web interface. But in the majority of cases we want to program the management of triples in our graphs, for example to handle automatic or scheduled updates.
Tasks
SPARQL programming in Python with rdflib
Getting ready: No additional installation is needed. You are already running Python and rdflib.
Parse the file russia_investigation_kg.ttl into an rdflib Graph.
Task: Write the following queries and updates with Python and rdflib:
- Print out a list of all the predicates used in your graph.
- Print out a sorted list of all the presidents represented in your graph.
- Create a dictionary (Python dict) with all the represented presidents as keys. For each key, the value is a list of the names of people indicted under that president.
- Use an ASK query to investigate whether Donald Trump has pardoned more than 5 people.
- Use a DESCRIBE query to create a new graph with information about Donald Trump. Print out the graph in Turtle format.
SPARQL programming in Python with SPARQLWrapper and Blazegraph
Getting ready: Make sure you have access to a running Blazegraph as in Exercise 3: SPARQL. You can either run Blazegraph locally on your own machine (best) or online on a shared server at UiB (also ok).
Continue with the RDF graph you created in exercises 1-2 (perhaps creating a new namespace in Blazegraph first).
Task: Program the following queries and updates with SPARQLWrapper and Blazegraph.
- Use a DESCRIBE query to create an rdflib Graph about Oliver Stone. Print the graph out in Turtle format.
With Blazegraph
First, pip install SPARQLWrapper. If you are using Conda: open an Anaconda prompt, activate your environment with "conda activate [name of env]", and then run "pip install sparqlwrapper" in the same prompt. Remember to select this Conda environment in your IDE. The key step is importing SPARQLWrapper so that we can connect to Blazegraph's SPARQL endpoint.
For help with writing queries and updates, scroll down on this page: https://github.com/RDFLib/sparqlwrapper. There are also some examples on our example page.
Remember, before you can program with Blazegraph you have to make sure it is running, as we did in Lab 4. Make sure that the URL you use with SPARQLWrapper has the same address and port as the one Blazegraph reports when it starts. You will then be able to program queries and updates.
# How to establish a connection to the Blazegraph endpoint, followed by a quick SELECT example.
from SPARQLWrapper import SPARQLWrapper, JSON

namespace = "kb"
sparql = SPARQLWrapper("http://localhost:9999/blazegraph/namespace/" + namespace + "/sparql")

sparql.setQuery("""
    PREFIX ex: <http://example.org/>
    SELECT * WHERE {
        ex:Cade ex:interest ?interest.
    }
""")
sparql.setReturnFormat(JSON)
results = sparql.query().convert()

# Print the value bound to ?interest in each result row.
for result in results["results"]["bindings"]:
    print(result["interest"]["value"])
The different types of queries require different return formats:
- SELECT and ASK: a SPARQL Results Document in XML, JSON, or CSV/TSV format.
- DESCRIBE and CONSTRUCT: an RDF graph serialized, for example, in the TURTLE or RDF/XML syntax, or an equivalent RDF graph serialization.
Remember to make sure that you can see the changes that take place after your inserts.
Without Blazegraph
If you have not been able to run Blazegraph on your own computer yet, you can use the UiB Blazegraph service: i2s.uib.no:8888/bigdata/#splash. Remember to create your own namespace in the web interface, as mentioned above.
Alternatively, you can program SPARQL queries directly with rdflib.
For help, look at the link below:
Useful Readings
SPARQL Queries you can use for tasks
# SPARQL Queries
prefix ex: <http://example.org/>
# SELECT Every triple
SELECT * WHERE {?s ?p ?o}
# Select the interests of Cade
SELECT ?interest WHERE {ex:Cade ex:interest ?interest}
# SELECT only people who are older than 26
SELECT ?person ?age WHERE {?person ex:age ?age. FILTER(?age > 26)}
# SELECT The City and country of Cade
SELECT ?country ?city WHERE {ex:Cade ex:address ?address. ?address ex:country ?country. ?address ex:city ?city.}
# SELECT Everyone who graduated with a Bachelor Degree.
SELECT ?person ?level WHERE {?person ex:degree ?degree. ?degree ex:degreeLevel ?level. FILTER(?level="Bachelor")}
# DELETE Photography
PREFIX ex: <http://example.org/>
DELETE DATA
{
ex:Cade ex:interest ex:Photography.
}
# INSERT Sergio
PREFIX ex: <http://example.org/>
INSERT DATA
{
ex:Sergio ex:address ex:SergioAddress.
ex:SergioAddress ex:city ex:Valencia.
ex:SergioAddress ex:street "4 Carrer del Serpis".
ex:SergioAddress ex:postalCode "46021".
ex:SergioAddress ex:country ex:Spain.
ex:Sergio ex:degree ex:SergioDegree.
ex:SergioDegree ex:degreeLevel "Master".
ex:SergioDegree ex:degreeField ex:Computer_Science.
ex:SergioDegree ex:degreeYear "2008".
ex:SergioDegree ex:degreeSource ex:University_of_Valencia.
ex:Sergio ex:expertise ex:Big_Data.
ex:Sergio ex:expertise ex:Semantic_Technologies.
ex:Sergio ex:expertise ex:Machine_Learning.
}
# DELETE/INSERT University
prefix ex: <http://example.org/>
DELETE {
?s ?p ex:University_of_Valencia.
}
INSERT {?s ?p ex:Universidad_de_Valencia.}
WHERE {
?s ?p ex:University_of_Valencia.}
# Construct
prefix ex: <http://example.org/>
CONSTRUCT {?city ex:cityOf ?country}
WHERE {?address ex:city ?city. ?address ex:country ?country}
Triples that you can base your queries on (Turtle format):
@prefix ex: <http://example.org/> .
@prefix foaf: <http://xmlns.com/foaf/0.1/> .
@prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .
@prefix xml: <http://www.w3.org/XML/1998/namespace> .
@prefix xsd: <http://www.w3.org/2001/XMLSchema#> .
ex:Cade a foaf:Person ;
ex:address [ a ex:Address ;
ex:city ex:Berkeley ;
ex:country ex:USA ;
ex:postalCode "94709"^^xsd:string ;
ex:state ex:California ;
ex:street "1516_Henry_Street"^^xsd:string ] ;
ex:age 27 ;
ex:characteristic ex:Kind ;
ex:degree [ ex:degreeField ex:Biology ;
ex:degreeLevel "Bachelor"^^xsd:string ;
ex:degreeSource ex:University_of_California ;
ex:year "2011-01-01"^^xsd:gYear ] ;
ex:interest ex:Bird,
ex:Ecology,
ex:Environmentalism,
ex:Photography,
ex:Travelling ;
ex:married ex:Mary ;
ex:meeting ex:Meeting1 ;
ex:visit ex:Canada,
ex:France,
ex:Germany ;
foaf:knows ex:Emma ;
foaf:name "Cade_Tracey"^^xsd:string .
ex:Mary a ex:Student,
foaf:Person ;
ex:age 26 ;
ex:characteristic ex:Kind ;
ex:interest ex:Biology,
ex:Chocolate,
ex:Hiking .
ex:Emma a foaf:Person ;
ex:address [ a ex:Address ;
ex:city ex:Valencia ;
ex:country ex:Spain ;
ex:postalCode "46020"^^xsd:string ;
ex:street "Carrer_de_la Guardia_Civil_20"^^xsd:string ] ;
ex:age 26 ;
ex:degree [ ex:degreeField ex:Chemistry ;
ex:degreeLevel "Master"^^xsd:string ;
ex:degreeSource ex:University_of_Valencia ;
ex:year "2015-01-01"^^xsd:gYear ] ;
ex:expertise ex:Air_Pollution,
ex:Toxic_Waste,
ex:Waste_Management ;
ex:interest ex:Bike_Riding,
ex:Music,
ex:Travelling ;
ex:meeting ex:Meeting1 ;
ex:visit ( ex:Portugal ex:Italy ex:France ex:Germany ex:Denmark ex:Sweden ) ;
foaf:name "Emma_Dominguez"^^xsd:string .
ex:Meeting1 a ex:Meeting ;
ex:date "August, 2014"^^xsd:string ;
ex:involved ex:Cade,
ex:Emma ;
ex:location ex:Paris .
ex:Paris a ex:City ;
ex:capitalOf ex:France ;
ex:locatedIn ex:France .
ex:France ex:capital ex:Paris .