Lab: SPARQL
Latest revision as of 12:22, 30 January 2025
Topics
- Setting up GraphDB
- SPARQL queries and updates
Useful materials
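GraphDB documentation:
- Getting Started with GraphDB: https://graphdb.ontotext.com/documentation/10.8/
Introduction to SPARQL:
- Getting Started with SPARQL: https://graphdb.ontotext.com/documentation/10.8/sparql.html
SPARQL reference:
- SPARQL Query Documentation: https://www.w3.org/TR/sparql11-query/
- SPARQL Expressions and Functions: https://en.wikibooks.org/wiki/SPARQL/Expressions_and_Functions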
Tasks
We recommend that you download and install the free desktop version of Ontotext's GraphDB to run the SPARQL exercises.
If you prefer not to use proprietary software, you can still do most of the exercises with Blazegraph, which you can download from https://blazegraph.com/ (requires Java). Blazegraph is a powerful open-source tool, but GraphDB offers more functionality and is what the lab leader will prepare for this semester.
Installing and running GraphDB
Follow the instructions in Getting Started with GraphDB to download and install GraphDB.
From the Desktop Installation page (https://graphdb.ontotext.com/documentation/10.8/graphdb-desktop-installation.html) you can click on "GraphDB download page" and then on "Download GraphDB" to register and request a download of GraphDB Free.
When GraphDB has been installed and started, it should open in a web browser window at http://localhost:7200/.
Setting up a repository
Follow the instructions in Getting Started with GraphDB to create a new GraphDB Repository called, for example, info216_lab2_NN, where NN are your initials. Choose No inference for now. Otherwise, the default parameters are fine.
Connect to the new repository and pin it as your default repository.
Load data
Download the Turtle file russia_investigation_kg.txt and save it with the correct extension, as russia_investigation_kg.ttl (not .txt). (You can also experiment with the Turtle file you saved after exercise 1.) Load the Russia investigation data through the GraphDB Workbench as described in the QuickStart guide.
You can use http://example.org/ as Base IRI.
Graph visualisation
Go to Explore -> Visual graph and create an Easy graph around the resource http://example.org#investigation_0. Double-click on nodes to expand them. Are there any more investigations related to Richard Nixon?
SPARQL tasks
Go to the SPARQL Query & Update tab.
Task: Using the data in russia_investigation_kg.ttl, write the following SPARQL SELECT queries. (The Russian investigation KG page explains the graph a bit more.)
- List all triples in your graph.
- List the first 100 triples in your graph.
- Count the number of triples in your graph.
- Count the number of indictments in your graph.
- List everyone who pleaded guilty, along with the name of the investigation.
- List everyone who was convicted but had their conviction overturned, along with the president who overturned it.
- For each investigation, list the number of indictments made.
- For each investigation with multiple indictments, list the number of indictments made.
- For each investigation with multiple indictments, list the number of indictments made, sorted with the most indictments first.
- For each president, list the numbers of convictions and of pardons made after conviction.
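The first three tasks need no knowledge of the graph's vocabulary, so their general shape can be sketched as plain query strings. The helper below is only an illustration of the GROUP BY / HAVING / ORDER BY pattern that the later tasks call for: `count_per_subject` and the predicate IRI in the usage line are hypothetical, and the actual property names (and their direction) must be looked up in the Russia investigation KG itself.

```python
# Queries that work on any graph, whatever vocabulary it uses.
ALL_TRIPLES = "SELECT ?s ?p ?o WHERE { ?s ?p ?o }"
FIRST_100 = ALL_TRIPLES + " LIMIT 100"
COUNT_TRIPLES = "SELECT (COUNT(*) AS ?n) WHERE { ?s ?p ?o }"


def count_per_subject(predicate_iri, min_count=1):
    """Build a query that counts, for each subject, the objects reached
    through predicate_iri, keeps subjects with at least min_count objects,
    and sorts the largest counts first.

    predicate_iri is a placeholder: replace it with a real property from
    your graph (this sketch does not know the KG's vocabulary).
    """
    return (
        "SELECT ?subject (COUNT(?item) AS ?n)\n"
        f"WHERE {{ ?subject <{predicate_iri}> ?item }}\n"
        "GROUP BY ?subject\n"
        f"HAVING (COUNT(?item) >= {int(min_count)})\n"
        "ORDER BY DESC(?n)"
    )


if __name__ == "__main__":
    # Hypothetical predicate, for illustration only.
    print(count_per_subject("http://example.org/hypotheticalPredicate", min_count=2))
```

The printed query can be pasted into the SPARQL Query & Update tab once the placeholder IRI has been replaced with a property that exists in your repository.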
If you have more time
Task: Try to run some of the queries from a Python program (this will be the topic of later labs). You have two options:
Using rdflib: Read the Turtle file into an rdflib Graph and use the query() method.
    from rdflib import Graph

    g = Graph()
    g.parse(..., format='ttl')  # path to your Turtle file
    r = g.query(...your_query_string...)
The hard part is picking the results out of the object r...
Using SPARQLWrapper: You can use SPARQLWrapper (another Python API) to connect to your running GraphDB endpoint. See the Python example page for how to do this.
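To see what such a client does under the hood, here is a minimal sketch that talks to the endpoint with only the Python standard library instead of SPARQLWrapper. It follows the SPARQL 1.1 Protocol (a POST with a form-encoded `query` parameter) and reads the standard SPARQL JSON results format. The endpoint URL is an assumption: it combines the http://localhost:7200/ address with the example repository name used earlier, so check the actual repository URL in your Workbench. The function names are invented for this sketch.

```python
import json
import urllib.parse
import urllib.request

# Assumed endpoint: adjust the repository name to the one you created.
ENDPOINT = "http://localhost:7200/repositories/info216_lab2_NN"


def build_request(endpoint, query):
    """Build a SPARQL Protocol POST request that asks for JSON results."""
    data = urllib.parse.urlencode({"query": query}).encode("utf-8")
    return urllib.request.Request(
        endpoint,
        data=data,
        headers={
            "Accept": "application/sparql-results+json",
            "Content-Type": "application/x-www-form-urlencoded",
        },
        method="POST",
    )


def bindings_to_rows(results):
    """Flatten SPARQL JSON results into a list of {variable: value} dicts."""
    return [
        {var: cell["value"] for var, cell in binding.items()}
        for binding in results["results"]["bindings"]
    ]


def run_select(endpoint, query):
    """Send the query to a running endpoint and return the flattened rows."""
    with urllib.request.urlopen(build_request(endpoint, query)) as response:
        return bindings_to_rows(json.load(response))


if __name__ == "__main__":
    # Requires GraphDB to be running with the repository above.
    for row in run_select(ENDPOINT, "SELECT ?s ?p ?o WHERE { ?s ?p ?o } LIMIT 5"):
        print(row)
```

SPARQLWrapper wraps the same steps in a friendlier interface, so this version is mainly useful for understanding the protocol or when you cannot install extra packages.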
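Task: If you want to explore more, try out the Wikidata Query Service (WDQS):
- Wikidata Query Service: https://query.wikidata.org/
WDQS tutorials:
- Wikidata SPARQL tutorial: https://www.wikidata.org/wiki/Wikidata:SPARQL_tutorial
- Interactive WDQS tutorial: https://wdqs-tutorial.toolforge.org/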