Lab: Ontop: Difference between revisions
(5 intermediate revisions by the same user not shown) | |||
Line 16: | Line 16: | ||
===Installing and running SQLite3=== | ===Installing and running SQLite3=== | ||
'''Task:''' Download and install a relational database system on your computer. | '''Task:''' Download and install a relational database system on your computer. If you are not experienced in databases, we recommend the very light-weight and easy-to-install [https://sqlite.org/download.html SQLite3]. | ||
(Unfortunately, SQLite3 is not fully integrated with Ontop. A more powerful alternative is H2. At the end of this exercise you will find guidelines for doing the exercise with H2 instead.) | |||
'''Task:''' Create a folder on your local computer for this exercise. Go to the new folder and open a Console or Terminal window. Start SQLite3 with the name of a new database, for example 'MyTestDB.sqlite'. | '''Task:''' Create a folder on your local computer for this exercise. Go to the new folder and open a Console or Terminal window. Start SQLite3 with the name of a new database, for example 'MyTestDB.sqlite'. | ||
Line 191: | Line 193: | ||
'''Task:''' Use ''FILTER'' and the ''LANG'' function to only list president descriptions with language ''"en"''. | '''Task:''' Use ''FILTER'' and the ''LANG'' function to only list president descriptions with language ''"en"''. | ||
''...from here we could have gone on to add the Ontop repository too as a local repository, but Ontop does not support SQLite3 fully, unfortunately. Using H2 instead of SQLite3 is an option, we outline below.'' | ''...from here we could have gone on to add the Ontop repository too as a local repository, but Ontop does not support SQLite3 fully, unfortunately. Using H2 instead of SQLite3 is an option, as we outline below.'' | ||
''Tip:'' Here are some useful PREFIXES: | ''Tip:'' Here are some useful PREFIXES: | ||
Line 211: | Line 213: | ||
'''Task:''' [https://www.h2database.com/html/download.html Download] and [https://www.h2database.com/html/installation.html install H2]. | '''Task:''' [https://www.h2database.com/html/download.html Download] and [https://www.h2database.com/html/installation.html install H2]. | ||
Again, | Again, Ontop needs an additional ''.jar''-file - a JDBC driver - for connecting to your H2 database. You will find this file in the folder where you installed H2. It is called something like ''h2-2.3.232.jar''. Place a copy of it in ''lib/''-subfolder of your GraphDB installation folder (same place where you put the ''sqlite-jdbc-3.49.0.0.jar''-file). Restart Ontop. | ||
'''Task:''' Start H2. Normally it opens a browser window on port ''8082'', but there is a console (command line) option too. | '''Task:''' Start H2. Normally it opens a browser window on port ''8082'', but there is a console (command line) option too. | ||
* Default driver class: ''org.h2.Driver'' - this one is fine | * Default driver class: ''org.h2.Driver'' - this one is fine | ||
* Default JDBC URL: ''jdbc:h2:~/test'' - you can replace the ~ with a full path and perhaps give the DB a better name too | * Default JDBC URL: ''jdbc:h2:~/test'' - if you want, you can replace the ''~'' with a full path and perhaps give the DB a better name than ''test'' too | ||
You do not need a password or to change the user name. | You do not need a password or to change the user name. | ||
'''Task:''' Click connect and define your database. You can cut-and-paste the same SQL commands as before to create the ''Investigations'' and ''InvestigatedPeople'' databases. But | '''Task:''' Click connect and define your database. You can cut-and-paste the same SQL commands as before to create the ''Investigations'' and ''InvestigatedPeople'' databases. But be aware of a few critical differences: | ||
* H2 uses ''schemas'' to organise tables. So first you need to create one for your two tables, for example: | * H2 uses ''schemas'' to organise tables. So first you need to create one for your two tables, for example: | ||
CREATE SCHEMA MyTestSchema ; | CREATE SCHEMA MyTestSchema ; | ||
* After that, you must always | * After that, you must always prefix your table names with the schema: ''MyTestSchema.Investigations'' and ''MyTestSchema.InvestigatedPeople'' - both in your SQL commands and in your OBDA mappings. | ||
* If you do not write schema, table, and column names inside "quotation marks", H2 will convert them to UPPER CASE. | * If you do not write schema, table, and column names inside "quotation marks", H2 will automatically convert them to UPPER CASE. | ||
'''Task:''' Create a new virtual repository like you did before, but using the driver class and JDBC URL for H2 instead. You need to enter a H2 user name too. This virtual repository is more stable than the one that wraps SQLite3. | '''Task:''' Create a new virtual repository like you did before, but using the driver class and JDBC URL for H2 instead. You need to enter a H2 user name too. This virtual repository is more stable than the one that wraps SQLite3. | ||
You may need to disconnect from H2 before you connect | You may need to disconnect from H2 in the browser window before you connect to it via GraphDB and Ontop. | ||
'''Task:''' Add the new virtual repository to the FedX federated repository you created before. You can now write a SPARQL query that combines triples from all the three repositories it contains, e.g., | '''Task:''' Add the new virtual repository to the FedX federated repository you created before. You can now write a SPARQL query that combines triples from all the three repositories it contains, e.g., | ||
* For all investigations, list its name, who was president, that president's Wikidata URL, and a brief description in English. | * For all investigations, list its name, who was president, that president's Wikidata URL, and a brief description of the president in English. |
Latest revision as of 18:09, 25 February 2025
Topics
- Creating SQLite3 databases
- Creating OBDA or R2RML mappings
- Setting up Ontop in GraphDB
- Querying SQLite3 databases through a virtual Ontop KG
Useful materials
- The SQLite project page.
- Accessing relational databases with data virtualization (GraphDB's Ontop documentation)
- The ontop-examples repository on GitHub contains many OBDA mapping examples
- Section 2 in SPARQL 1.1 Federated Query
Tasks
We recommend you run this exercise through the Ontop plugin to the free desktop version of OntoText's GraphDB tool. You can also use one of the open-source versions of Ontop.
Installing and running SQLite3
Task: Download and install a relational database system on your computer. If you are not experienced in databases, we recommend the very light-weight and easy-to-install SQLite3.
(Unfortunately, SQLite3 is not fully integrated with Ontop. A more powerful alternative is H2. At the end of this exercise you will find guidelines for doing the exercise with H2 instead.)
Task: Create a folder on your local computer for this exercise. Go to the new folder and open a Console or Terminal window. Start SQLite3 with the name of a new database, for example 'MyTestDB.sqlite'.
Task: Based on the example files Fig6.3-ReportClaimantTable-original.sql in the Examples folder in mitt.uib.no, create the following two tables and fill them with test data:
Table InvestigatedPeople:
InvestigationID | American | CPDate | CPDays | IndictmentDays | Investigation | Name | Outcome | Overturned | Pardoned |
investigation_114 | true | None | None | None | bush-clinton-passport | None | None | false | false |
investigation_100 | true | 1993-10-26 | 1335 | 789 | pierce | Deborah Gore Dean | conviction | false | false |
investigation_113 | true | None | None | None | sealed-investigation-hwbush-2 | None | None | false | false |
investigation_10 | true | 1975-01-01 | 592 | 286 | watergate | Robert C. Mardian | conviction | true | false |
investigation_11 | true | None | None | -9 | watergate | Maurice H. Stans | indictment | false | false |
investigation_115 | true | 1994-03-22 | -136 | -316 | whitewater | David L. Hale | guilty-plea | false | false |
Table Investigations:
Investigation | InvestigationDays | InvestigationStart | InvestigationEnd | President |
bush-clinton-passport | 1081 | 1992-12-14 | 1995-11-30 | Bill Clinton |
pierce | 3162 | 1990-03-01 | 1998-10-27 | George H.W. Bush |
sealed-investigation-hwbush-2 | 453 | 1991-04-19 | 1992-07-15 | George H.W. Bush |
watergate | 1492 | 1973-05-19 | 1977-06-19 | Richard Nixon |
whitewater | 2770 | 1994-08-05 | 2002-03-06 | Bill Clinton |
Tip: You can either enter SQL statements interactively one by one, or you can save them to a file and use SQLite3's .read command. Use the .exit command when you are done.
Tip: Although SQLite3 does not require it, Ontop will later expect that you have declared all PRIMARY KEYS as NOT NULL (for example: "InvestigationID text NOT NULL PRIMARY KEY,").
(If you want more data, the full CSV file is available on GitHub.)
OBDA and R2RML mappings
Task: Draw an example graph on paper that represents the investigation_100 and pierce rows of the two tables.
Task: Based on the example files Fig6.4-ReportClaimant-mapping.obda and/or Fig6.4-ReportClaimant-mapping.r2rml in the Examples folder, create a mapping from the two tables into an RDF graph.
GraphDB Ontop configuration
Task: You already have Ontop as part of your GraphDB installation, but you need an additional .jar-file - a JDBC driver - for connecting to your database of choice. For SQLite3, you can download the driver from Xenial on GitHub. Place the .jar-file in the lib/-subfolder of your GraphDB installation folder(*). Restart Ontop.
(*) On Linux, the path is <ONTOP_INSTALL_DIR>/lib/app/lib/sqlite-jdbc-3.49.0.0.jar .
Task:
- Create a new Ontop repository.
- The Driver class should be
org.sqlite.JDBC
. The JDBC URL should bejdbc:sqlite:PATH
, where PATH is the location of your SQLite3 database, for examplejdbc:sqlite:C:/Users/MyUser/OntopLab/MyTestDB.sqlite
. - Load the OBDA or R2RML mapping file.
- Click Create and activate the new repository.
Running SPARQL queries
Task: Write SPARQL queries that answer the following questions:
- List all investigations in your graph.
- Count the number of investigations in your graph.
- List all investigated people in your graph.
- List all investigated people in your graph along with the investigations they were involved in.
- You can also re-do the following queries from before (you may need to load more data):
- List everyone who pleaded guilty, along with the name of the investigation.
- List everyone who were convicted, but who had their conviction overturned by which president.
- For each investigation, list the number of indictments made.
- For each investigation with multiple indictments, list the number of indictments made.
- For each investigation with multiple indictments, list the number of indictments made, sorted with the most indictments first.
If you have more time
Task: Create a new No inference (plain RDF) GraphDB repository and import triples from the file Presidents.ttl in the Examples folder in mitt.uib.no.
Activate the new repository. Its triples connect the names of recent US presidents to their Wikidata URIs. Run a SPARQL SELECT query to list them all.
Task: Create a new Federated (FedX Virtual SPARQL) repository.
- As a local repository, add the new Presidents repository.
- As a remote repository, add https://query.wikidata.org/sparql as a Generic SPARQL endpoint.
Activate the new repository.
Task: Re-run the SPARQL SELECT query that lists the names of recent US presidents along with their Wikidata URIs.
Task: Extend the SPARQL SELECT query to also list the schema:description of each US president.
Task: Use FILTER and the LANG function to only list president descriptions with language "en".
...from here we could have gone on to add the Ontop repository too as a local repository, but Ontop does not support SQLite3 fully, unfortunately. Using H2 instead of SQLite3 is an option, as we outline below.
Tip: Here are some useful PREFIXES:
PREFIX foaf: <http://xmlns.com/foaf/0.1/> PREFIX owl: <http://www.w3.org/2002/07/owl#> PREFIX schema: <http://schema.org/>
If you really have a lot of time
It is possible to create a Federated (FedX Virtual SPARQL) repository that federates
- a local repository (like the Presidents repository),
- a remote repository (for example https://query.wikidata.org/sparql as a Generic SPARQL endpoint), and
- a virtual repository (like the Investigations and InvestigatedPeople repository) as long as it runs on top of H2 instead of SQLite3.
Task: Install a recent Java on your computer if you do not have already, for example the open version from Adoptium.
java -version
must show version 11 or later, and Java cannot be blocked by your firewall.
Task: Download and install H2.
Again, Ontop needs an additional .jar-file - a JDBC driver - for connecting to your H2 database. You will find this file in the folder where you installed H2. It is called something like h2-2.3.232.jar. Place a copy of it in lib/-subfolder of your GraphDB installation folder (same place where you put the sqlite-jdbc-3.49.0.0.jar-file). Restart Ontop.
Task: Start H2. Normally it opens a browser window on port 8082, but there is a console (command line) option too.
- Default driver class: org.h2.Driver - this one is fine
- Default JDBC URL: jdbc:h2:~/test - if you want, you can replace the ~ with a full path and perhaps give the DB a better name than test too
You do not need a password or to change the user name.
Task: Click connect and define your database. You can cut-and-paste the same SQL commands as before to create the Investigations and InvestigatedPeople databases. But be aware of a few critical differences:
- H2 uses schemas to organise tables. So first you need to create one for your two tables, for example:
CREATE SCHEMA MyTestSchema ;
- After that, you must always prefix your table names with the schema: MyTestSchema.Investigations and MyTestSchema.InvestigatedPeople - both in your SQL commands and in your OBDA mappings.
- If you do not write schema, table, and column names inside "quotation marks", H2 will automatically convert them to UPPER CASE.
Task: Create a new virtual repository like you did before, but using the driver class and JDBC URL for H2 instead. You need to enter a H2 user name too. This virtual repository is more stable than the one that wraps SQLite3.
You may need to disconnect from H2 in the browser window before you connect to it via GraphDB and Ontop.
Task: Add the new virtual repository to the FedX federated repository you created before. You can now write a SPARQL query that combines triples from all the three repositories it contains, e.g.,
- For all investigations, list its name, who was president, that president's Wikidata URL, and a brief description of the president in English.