Lab: Getting started with VSCode, Python and RDFlib

From info216

Revision as of 18:06, 16 January 2023

Topics

  1. Prepare for programming knowledge graphs with rdflib in Python.
  2. Get started with basic RDF programming.

Useful materials

VSCode:

RDFlib:

RDFlib classes/interfaces and methods:

  • Graph (add and perhaps remove methods)
  • URIRef
  • Literal
  • Namespace
  • perhaps also: RDF (the RDF.type field) and BNode

Tasks

Install Python, Pip, VSCode, and RDFlib

1) You need Python version >= 3.7 on your computer. Run python --version in a command/terminal window to check. You can download Python and Pip here. To make sure you have the most recent Pip, run:

python -m pip install --upgrade pip

2) You need to have an Integrated Development Environment (IDE) that supports Python. If you are unsure, you can download the free and open source Visual Studio Code (VSCode) here.

3) Create a folder for INFO216. Start VSCode, create a workspace from the File menu (File --> Save Workspace As), and save it in your folder. Then click the document icon (Explorer) on the left side of VSCode, click Open Folder, and open your INFO216 folder. Create a new file with a .py extension.

4) VSCode will ask you to install the Python extension; install it. If you were not asked, click the four cubes (extension manager) on the left side of VSCode, search for Microsoft's Python extension, and install it.

5) If you don't have a terminal open, go to the top menu, click Terminal, and then New Terminal. Check which folder the terminal is currently in: the bottom line starting with PS shows the current location. If you opened the folder earlier, you should already be in your INFO216 folder. If the path after PS is not your INFO216 folder, you need to navigate to it. You can move between folders with the cd command. For instance, if you are at PS C:\Users\YourName> and your INFO216 folder is on your desktop, you could type cd .\Desktop\INFO216\.

6) Once you are in the right folder, type the following commands in your terminal window:

(Windows)

py -3 -m venv .venv
.venv\Scripts\activate

(Mac / Linux)

python3 -m venv .venv
source .venv/bin/activate

7) This should have created a virtual environment called .venv in your folder. In the bottom right corner you will receive a notification asking you to select the new environment for the workspace folder; select yes. You should now see a green (.venv) in front of PS in the terminal window. Your virtual environment should now be selected automatically when you open your workspace. However, sometimes you might need to open a new terminal for the green (.venv) to appear.

8) In the terminal, type the following to install RDFlib:

pip install rdflib

9) Close and reopen VSCode. When you run your program for the first time, you will be asked to install the ipykernel package; click install.

You can now import rdflib into your .py file, or import specific classes/interfaces such as from rdflib import Namespace, Graph.
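If you want to double-check the setup before starting on the tasks, a small script along these lines may help. It is only a sketch: the function name environment_report is made up for this example, and the venv check assumes the environment was created with python -m venv as in step 6.

```python
import importlib.util
import sys


def environment_report():
    """Collect a few facts about the interpreter that runs this script.

    Hypothetical helper for this lab, not part of rdflib or VSCode.
    """
    return {
        # The lab asks for Python 3.7 or newer.
        "python_ok": sys.version_info >= (3, 7),
        # Inside a venv, sys.prefix points at the environment folder
        # and therefore differs from sys.base_prefix.
        "in_venv": sys.prefix != sys.base_prefix,
        # find_spec returns None when the package cannot be found.
        "rdflib_installed": importlib.util.find_spec("rdflib") is not None,
    }


report = environment_report()
for name, value in report.items():
    print(f"{name}: {value}")
```

If in_venv or rdflib_installed prints False, the most likely cause is that VSCode is running the script with a different interpreter than the one in .venv.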

Programming tasks

Task: Represent the sentences below as triples. Note that some sentences can result in several triples.

  • The Mueller Investigation was led by Robert Mueller.
    • It involved Paul Manafort, Rick Gates, George Papadopoulos, Michael Flynn, Michael Cohen, and Roger Stone.
  • Paul Manafort was business partner of Rick Gates.
    • He was campaign chairman for Donald Trump.
    • He was charged with money laundering, tax evasion, and foreign lobbying.
    • He was convicted for bank and tax fraud.
    • He pleaded guilty to conspiracy.
    • He was sentenced to prison.
    • He negotiated a plea agreement.
  • Rick Gates was charged with money laundering, tax evasion and foreign lobbying.
    • He pleaded guilty to conspiracy and lying to the FBI.

Task: Write a program that creates an RDF graph and adds the triples you just created.

For the URIs, you can just use an example path like http://example.org/. So if you want to represent Donald Trump, the URI could be http://example.org/Donald_Trump, and you can create the resource like this:

from rdflib import URIRef

donaldTrump = URIRef('http://example.org/Donald_Trump')

You can even use a Namespace so you don't have to write the full URI every time:

from rdflib import Namespace

ex = Namespace('http://example.org/')
donaldTrump = ex.Donald_Trump

Task: Use the serialize method of rdflib.Graph to write out the model in different formats (on screen or to file):

  • Turtle (format='ttl')
  • N-Triples (format='nt')
  • JSON-LD (format='json-ld')
  • RDF/XML (format='xml')

Which one is easiest to read? What are the pros and cons of the different formats? We will look more at some of them later in the course!

Task: Use the simple online RDF grapher to visualise your model. :isSemantic offers a more advanced RDF visualiser that you can also test if you want.

Task: Loop through the triples in the model to print out all triples that have pleading guilty as predicate. If you have been inconsistent about some predicate or other term, you can first write loops that correct wrong terms everywhere in the model. (Tip: to correct a term in a model, you typically have to first remove the old triple and then add a new one.)

If you have more time...

Task: If you have more time you can continue extending your graph:

  • Michael Cohen was Donald Trump's attorney.
    • He pleaded guilty to lying to Congress.
  • Michael Flynn was adviser to Donald Trump.
    • He pleaded guilty to lying to the FBI.
    • He negotiated a plea agreement.

Task: According to this FRONTLINE article, the lies told by Gates, Cohen and Flynn were different and are described at different levels of detail. How can you modify your knowledge graph to account for this?

Task: Write a method (function) that submits your model to https://www.ldf.fi/service/rdf-grapher for rendering and saves the returned image to file.
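A possible shape for such a function is sketched below using only the standard library. Important assumption: the form field names rdf, from and to are taken from memory of the service's API description and should be checked against the RDF Grapher page before relying on them. The function names are made up for this example.

```python
from urllib.parse import urlencode
from urllib.request import Request, urlopen

GRAPHER_URL = "https://www.ldf.fi/service/rdf-grapher"


def build_grapher_request(turtle_text, image_format="png"):
    """Build a POST request for the grapher service.

    ASSUMPTION: the parameter names 'rdf', 'from' and 'to' are not
    confirmed by this lab text; verify them on the service page.
    """
    params = {"rdf": turtle_text, "from": "ttl", "to": image_format}
    # Passing data makes urllib send the request as a POST.
    return Request(GRAPHER_URL, data=urlencode(params).encode("utf-8"))


def save_graph_image(graph, path, image_format="png"):
    """Serialize an rdflib Graph as Turtle, submit it, and save the image."""
    request = build_grapher_request(graph.serialize(format="ttl"), image_format)
    with urlopen(request, timeout=30) as response:
        image_bytes = response.read()
    with open(path, "wb") as image_file:
        image_file.write(image_bytes)
    return path
```

With a graph g from the earlier tasks, a call such as save_graph_image(g, "graph.png") would then write the returned image next to your script, provided the service is reachable and accepts these parameters.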