Lab: SPARQL Programming: Difference between revisions

From info216
Revision as of 20:03, 30 January 2023

Topics

SPARQL programming in Python:

  • with rdflib: to manage an rdflib Graph internally in a program
  • with SPARQLWrapper and Blazegraph: to manage an RDF graph stored externally in Blazegraph (on your own local machine or on the shared online server)

Motivation: Last week we entered SPARQL queries and updates manually from the web interface. But in the majority of cases we want to program the management of triples in our graphs, for example to handle automatic or scheduled updates.

Tasks

SPARQL programming in Python with rdflib

Getting ready: No additional installation is needed. You are already running Python and rdflib.

Parse the file russia_investigation_kg.ttl into an rdflib Graph.

Task: Write the following queries and updates with Python and rdflib:

  • Print out a list of all the predicates used in your graph.
  • Print out a sorted list of all the presidents represented in your graph.
  • Create a dictionary (Python dict) with all the represented presidents as keys. For each key, the value is a list of names of people indicted under that president.
  • Use an ASK query to investigate whether Donald Trump has pardoned more than 5 people.
  • Use a DESCRIBE query to create a new graph with information about Donald Trump. Print out the graph in Turtle format.

SPARQL programming in Python with SPARQLWrapper and Blazegraph

Getting ready: Make sure you have access to a running Blazegraph as in Exercise 3: SPARQL. You can either run Blazegraph locally on your own machine (best) or online on a shared server at UiB (also ok).

Continue with the RDF graph you created in exercises 1-2 (you may want to create a new namespace in Blazegraph first).

Task: Program the following queries and updates with SPARQLWrapper and Blazegraph.

  • Use a DESCRIBE query to create an rdflib Graph about Oliver Stone. Print the graph out in Turtle format.

With Blazegraph

First, install SPARQLWrapper with pip ("pip install sparqlwrapper"). If you are using Conda: open an Anaconda prompt, activate your environment with "conda activate [name of env]", and then run "pip install sparqlwrapper" in the same prompt. Remember to select this Conda environment in your IDE. Most importantly, your program must import SPARQLWrapper in order to connect to the SPARQL endpoint of Blazegraph.

For help with writing queries and updates, scroll down to the examples on this page: https://github.com/RDFLib/sparqlwrapper. There are also some examples on our example page.

Remember, before you can program with Blazegraph you have to make sure it is running, as we did in Lab 4. Make sure that the URL you use with SPARQLWrapper has the same address and port as the one Blazegraph reports when it starts. You will then be able to program queries and updates.

# How to establish connection to Blazegraph endpoint. Also a quick select example.

from SPARQLWrapper import SPARQLWrapper, JSON

namespace = "kb"
sparql = SPARQLWrapper("http://localhost:9999/blazegraph/namespace/"+ namespace + "/sparql")

sparql.setQuery("""
    PREFIX ex: <http://example.org/>
    SELECT * WHERE {
    ex:Cade ex:interest ?interest.
    }
""")
sparql.setReturnFormat(JSON)

results = sparql.query().convert()

for result in results["results"]["bindings"]:
    print(result["interest"]["value"])
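The nested dictionary returned for a SELECT query can be awkward to work with directly. The sketch below flattens it into one plain dict per result row. The helper name bindings_to_rows is made up for this example, and the sample document is written by hand in the SPARQL JSON results layout rather than fetched from Blazegraph.

```python
def bindings_to_rows(results: dict) -> list:
    """Flatten a SPARQL JSON results document into a list of {variable: value} dicts."""
    rows = []
    for binding in results["results"]["bindings"]:
        # Each cell is a dict with "type" and "value" keys (plus optional
        # "datatype" or "xml:lang"); variables left unbound are simply absent.
        rows.append({var: cell["value"] for var, cell in binding.items()})
    return rows


# Hand-written sample with the same shape as the result of the query above.
sample = {
    "head": {"vars": ["interest"]},
    "results": {
        "bindings": [
            {"interest": {"type": "uri", "value": "http://example.org/Bird"}},
            {"interest": {"type": "uri", "value": "http://example.org/Ecology"}},
        ]
    },
}

for row in bindings_to_rows(sample):
    print(row["interest"])
```

Note that every value arrives as a string, so numbers such as ages need an explicit int() conversion.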

The different types of queries require different return formats:

  • SELECT and ASK: a SPARQL Results Document in XML, JSON, or CSV/TSV format.
  • DESCRIBE and CONSTRUCT: an RDF graph serialized, for example, in the TURTLE or RDF/XML syntax, or an equivalent RDF graph serialization.
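If a program handles several kinds of queries, it can pick the return format from the query text. The following is a rough sketch of that idea using only the standard library. The names query_form and suggested_format are invented here, the regular expressions cover only simple queries (full-line comments and PREFIX/BASE declarations before the query keyword), and the format labels are plain strings: in a real program you would pass SPARQLWrapper's own JSON or TURTLE constants to setReturnFormat.

```python
import re

# One reasonable choice of format per query form, following the list above.
FORMAT_BY_FORM = {
    "SELECT": "json",
    "ASK": "json",
    "DESCRIBE": "turtle",
    "CONSTRUCT": "turtle",
}

# Leading whitespace followed by any number of PREFIX/BASE declarations.
_PROLOGUE = re.compile(
    r"\s*(?:(?:PREFIX\s+[^\s:]*:\s*<[^>]*>|BASE\s*<[^>]*>)\s*)*",
    re.IGNORECASE,
)


def query_form(query: str) -> str:
    """Return SELECT, ASK, DESCRIBE or CONSTRUCT for a simple SPARQL query string."""
    body = re.sub(r"(?m)^\s*#.*$", "", query)  # drop full-line comments
    body = body[_PROLOGUE.match(body).end():]  # skip the prologue
    match = re.match(r"(SELECT|ASK|DESCRIBE|CONSTRUCT)\b", body, re.IGNORECASE)
    if match is None:
        raise ValueError("not a SPARQL query form (perhaps an update?)")
    return match.group(1).upper()


def suggested_format(query: str) -> str:
    """Suggest a return format label for the given query."""
    return FORMAT_BY_FORM[query_form(query)]


print(suggested_format("ASK { ?s ?p ?o }"))
```

Updates such as INSERT DATA and DELETE DATA return no result document, which is why the sketch rejects them instead of suggesting a format.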

Remember to make sure that you can see the changes that take place after your inserts.
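A hedged sketch of how an update followed by a check could look with SPARQLWrapper. It assumes that the Blazegraph namespace endpoint accepts SPARQL updates sent with POST and that ASK results come back as JSON with a "boolean" field. The function names (delete_data_update, run_update, ask) are made up for this example, and the two functions that contact the server are defined but not called here.

```python
def delete_data_update(subject: str, predicate: str, obj: str) -> str:
    """Build a DELETE DATA update for one triple written with ex: prefixed names."""
    return (
        "PREFIX ex: <http://example.org/>\n"
        f"DELETE DATA {{ {subject} {predicate} {obj} . }}"
    )


def run_update(endpoint: str, update: str) -> None:
    """Send a SPARQL update to the endpoint."""
    # Imported here so the string helper above works without the package installed.
    from SPARQLWrapper import POST, SPARQLWrapper

    sparql = SPARQLWrapper(endpoint)
    sparql.setMethod(POST)  # updates are sent with POST rather than GET
    sparql.setQuery(update)
    sparql.query()


def ask(endpoint: str, query: str) -> bool:
    """Run an ASK query and return its boolean answer."""
    from SPARQLWrapper import JSON, SPARQLWrapper

    sparql = SPARQLWrapper(endpoint)
    sparql.setQuery(query)
    sparql.setReturnFormat(JSON)
    return sparql.query().convert()["boolean"]


update = delete_data_update("ex:Cade", "ex:interest", "ex:Photography")
print(update)

# With Blazegraph running, the check could look like this:
# endpoint = "http://localhost:9999/blazegraph/namespace/kb/sparql"
# run_update(endpoint, update)
# still_there = ask(endpoint, "PREFIX ex: <http://example.org/> "
#                             "ASK { ex:Cade ex:interest ex:Photography }")
```

Running the ASK query before and after the update is a quick way to confirm that the change actually reached the graph.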


Without Blazegraph

If you have not been able to run Blazegraph on your own computer yet, you can use the UiB Blazegraph service: i2s.uib.no:8888/bigdata/#splash. Remember to create your own namespace in the web interface first.
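Switching between the local and the shared server only changes the endpoint URL. The small helper below is a sketch of that idea. The name sparql_endpoint and the my_namespace value are made up, and the "/bigdata/namespace/.../sparql" path for the UiB server is an assumption based on the web-interface address above, so check it against the address shown in the web interface.

```python
def sparql_endpoint(namespace: str, host: str = "localhost",
                    port: int = 9999, context: str = "blazegraph") -> str:
    """Build the SPARQL endpoint URL for one Blazegraph namespace."""
    return f"http://{host}:{port}/{context}/namespace/{namespace}/sparql"


# Local installation, default namespace "kb" (same URL as in the example above).
local_url = sparql_endpoint("kb")

# Shared UiB server with your own namespace (path is an assumption, see above).
shared_url = sparql_endpoint("my_namespace", host="i2s.uib.no",
                             port=8888, context="bigdata")

print(local_url)
print(shared_url)
```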

Alternatively, you can program SPARQL queries directly with rdflib.

For help, look at the link below:

Querying with Sparql


Useful Readings

SPARQL Queries you can use for tasks

# SPARQL Queries

prefix ex: <http://example.org/>

# SELECT Every triple
SELECT * WHERE {?s ?p ?o}

# Select the interests of Cade
SELECT ?interest WHERE {ex:Cade ex:interest ?interest}

# SELECT only people who are older than 26
SELECT ?person ?age WHERE {?person ex:age ?age. FILTER(?age > 26)}

# SELECT The City and country of Cade
SELECT ?country ?city WHERE {ex:Cade ex:address ?address. ?address ex:country ?country. ?address ex:city ?city.}

# SELECT Everyone who graduated with a Bachelor Degree.
SELECT ?person ?level WHERE {?person ex:degree ?degree. ?degree ex:degreeLevel ?level. FILTER(?level="Bachelor")}

# DELETE Photography
PREFIX ex: <http://example.org/>
DELETE DATA

{
 ex:Cade ex:interest ex:Photography.
}


# INSERT Sergio

PREFIX ex: <http://example.org/>

INSERT DATA
{
 ex:Sergio ex:address ex:SergioAddress.
 ex:SergioAddress ex:city ex:Valencia.
 ex:SergioAddress ex:street "4 Carrer del Serpis".
 ex:SergioAddress ex:postalCode "46021".
 ex:SergioAddress ex:country ex:Spain.
 ex:Sergio ex:degree ex:SergioDegree.
 ex:SergioDegree ex:degreeLevel "Master".
 ex:SergioDegree ex:degreeField ex:Computer_Science.
 ex:SergioDegree ex:degreeYear "2008".
 ex:SergioDegree ex:degreeSource ex:University_of_Valencia.
 ex:Sergio ex:expertise ex:Big_Data.
 ex:Sergio ex:expertise ex:Semantic_Technologies.
 ex:Sergio ex:expertise ex:Machine_Learning.
}


# DELETE/INSERT University

prefix ex: <http://example.org/>

DELETE {
  ?s ?p ex:University_of_Valencia.
}
INSERT {?s ?p ex:Universidad_de_Valencia.}

WHERE {
?s ?p ex:University_of_Valencia.}


# Construct

prefix ex: <http://example.org/>

CONSTRUCT {?city ex:cityOf ?country}
WHERE {?address ex:city ?city. ?address ex:country ?country}


Triples that you can base your queries on: (turtle format)

@prefix ex: <http://example.org/> .
@prefix foaf: <http://xmlns.com/foaf/0.1/> .
@prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .
@prefix xml: <http://www.w3.org/XML/1998/namespace> .
@prefix xsd: <http://www.w3.org/2001/XMLSchema#> .

ex:Cade a foaf:Person ;
    ex:address [ a ex:Address ;
            ex:city ex:Berkeley ;
            ex:country ex:USA ;
            ex:postalCode "94709"^^xsd:string ;
            ex:state ex:California ;
            ex:street "1516_Henry_Street"^^xsd:string ] ;
    ex:age 27 ;
    ex:characteristic ex:Kind ;
    ex:degree [ ex:degreeField ex:Biology ;
            ex:degreeLevel "Bachelor"^^xsd:string ;
            ex:degreeSource ex:University_of_California ;
            ex:year "2011-01-01"^^xsd:gYear ] ;
    ex:interest ex:Bird,
        ex:Ecology,
        ex:Environmentalism,
        ex:Photography,
        ex:Travelling ;
    ex:married ex:Mary ;
    ex:meeting ex:Meeting1 ;
    ex:visit ex:Canada,
        ex:France,
        ex:Germany ;
    foaf:knows ex:Emma ;
    foaf:name "Cade_Tracey"^^xsd:string .

ex:Mary a ex:Student,
        foaf:Person ;
    ex:age 26 ;
    ex:characteristic ex:Kind ;
    ex:interest ex:Biology,
        ex:Chocolate,
        ex:Hiking .

ex:Emma a foaf:Person ;
    ex:address [ a ex:Address ;
            ex:city ex:Valencia ;
            ex:country ex:Spain ;
            ex:postalCode "46020"^^xsd:string ;
            ex:street "Carrer_de_la Guardia_Civil_20"^^xsd:string ] ;
    ex:age 26 ;
    ex:degree [ ex:degreeField ex:Chemistry ;
            ex:degreeLevel "Master"^^xsd:string ;
            ex:degreeSource ex:University_of_Valencia ;
            ex:year "2015-01-01"^^xsd:gYear ] ;
    ex:expertise ex:Air_Pollution,
        ex:Toxic_Waste,
        ex:Waste_Management ;
    ex:interest ex:Bike_Riding,
        ex:Music,
        ex:Travelling ;
    ex:meeting ex:Meeting1 ;
    ex:visit ( ex:Portugal ex:Italy ex:France ex:Germany ex:Denmark ex:Sweden ) ;
    foaf:name "Emma_Dominguez"^^xsd:string .

ex:Meeting1 a ex:Meeting ;
    ex:date "August, 2014"^^xsd:string ;
    ex:involved ex:Cade,
        ex:Emma ;
    ex:location ex:Paris .

ex:Paris a ex:City ;
    ex:capitalOf ex:France ;
    ex:locatedIn ex:France .

ex:France ex:capital ex:Paris .