Solution examples 2023

From info216
 
Latest revision as of 12:34, 6 May 2024

Task 2: RDF and SHACL

Questions 51-54:
RDF: Add triples TBD


For questions 55-60:

*** SHACL examples - includes answers to the exam questions

@prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> . 
@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> . 
@prefix xsd: <http://www.w3.org/2001/XMLSchema#> . 
@prefix owl: <http://www.w3.org/2002/07/owl#> . 
@prefix dc: <http://purl.org/dc/elements/1.1/> .
@prefix foaf: <http://xmlns.com/foaf/0.1/> .
@prefix sh: <http://www.w3.org/ns/shacl#> .
@prefix : <http://info216.uib.no/movies/> .

Questions 55 and 56:

:DirectorShape a sh:NodeShape ;
    sh:targetClass :Director ;
    # A Director must have exactly one foaf:name of type xsd:string.
    sh:property [
        sh:path foaf:name ;
        sh:minCount 1 ;      # question 55
        sh:maxCount 1 ;      # question 55
        sh:datatype xsd:string   # question 56
    ] ;

Question 57:

    # A Director must be the director of at least one Movie.
    sh:property [
        sh:path :director_of ;
        sh:minCount 1 ;
        sh:class :Movie
    ] .
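
To make the cardinality constraints in questions 55 and 57 concrete, here is a minimal sketch in plain Python of what a check for "exactly one foaf:name" amounts to. It is not how pyshacl works internally. The triples, the node `:Anonymous_Director` and the helper `count_violations` are all made up for illustration.

```python
# Triples as plain (subject, predicate, object) tuples; the data is hypothetical.
TRIPLES = {
    (":Quentin_Tarantino", "rdf:type", ":Director"),
    (":Quentin_Tarantino", "foaf:name", "Quentin_Tarantino"),
    (":Anonymous_Director", "rdf:type", ":Director"),  # has no foaf:name
}


def count_violations(triples, target_class, path, min_count, max_count):
    """Return the focus nodes whose number of `path` values is out of range."""
    # Focus nodes: every instance of the target class (like sh:targetClass).
    focus_nodes = {s for s, p, o in triples if p == "rdf:type" and o == target_class}
    bad = []
    for node in sorted(focus_nodes):
        n_values = sum(1 for s, p, o in triples if s == node and p == path)
        # Like sh:minCount / sh:maxCount on a property shape.
        if not (min_count <= n_values <= max_count):
            bad.append(node)
    return bad


violations = count_violations(TRIPLES, ":Director", "foaf:name", 1, 1)
print(violations)
```

Only the director without a name is reported; the one with exactly one name conforms.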

Question 58:

:ActorShape a sh:NodeShape ;
    sh:targetClass :Actor ;
    # If an actor is an actor in a resource, that resource must be a movie.
    sh:property [
        sh:path :actor_in ;
        sh:class :Movie
    ] .

Question 59:

:ActorShape a sh:NodeShape ;
    sh:targetClass :Actor ;
    # If an actor plays a role that is a role in some resource, that resource must be a movie.
    sh:property [
        sh:path ( :plays_role :role_in ) ;
        sh:class :Movie
    ] .

Question 60:

:MovieShape a sh:NodeShape ;
    sh:targetClass :Movie ;
    # A movie must be directed by at least one director or acted in by at least one actor.
    sh:or (
        [ sh:property [
            sh:path [ sh:inversePath :actor_in ] ;
            sh:minCount 1 ;
        ] ]
        [ sh:property [
            sh:path [ sh:inversePath :director_of ] ;
            sh:minCount 1 ;
        ] ] 
    ) .


Task 3: RDFS rules and OWL expressions

RDFS rules

Question 61: A resource that is a director_of something is a director.

:director_of rdfs:domain :Director .

Question 62: A resource that something else is a director_of is a movie.

:director_of rdfs:range :Movie .

Question 63: The year of something has type xsd:year.

:year rdfs:range xsd:year .

Question 64: An actor is a Person.

:Actor rdfs:subClassOf foaf:Person .

Question 65: A director is a person.

:Director rdfs:subClassOf foaf:Person .
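
As an illustration of what the rules in questions 61, 62 and 65 entail, the following sketch applies the RDFS domain, range and subClassOf rules by hand to triples stored as plain tuples. It is a simplified stand-in for a real reasoner, and the function name `rdfs_closure` is hypothetical.

```python
def rdfs_closure(triples):
    """Apply the rdfs:domain, rdfs:range and rdfs:subClassOf rules to a fixpoint."""
    inferred = set(triples)
    while True:
        new = set()
        for s, p, o in inferred:
            for s2, p2, o2 in inferred:
                # (p rdfs:domain C) and (s p o)  =>  (s rdf:type C)
                if p2 == "rdfs:domain" and s2 == p:
                    new.add((s, "rdf:type", o2))
                # (p rdfs:range C) and (s p o)  =>  (o rdf:type C)
                if p2 == "rdfs:range" and s2 == p:
                    new.add((o, "rdf:type", o2))
                # (C rdfs:subClassOf D) and (s rdf:type C)  =>  (s rdf:type D)
                if p == "rdf:type" and p2 == "rdfs:subClassOf" and s2 == o:
                    new.add((s, "rdf:type", o2))
        if new <= inferred:
            return inferred
        inferred |= new


graph = {
    (":director_of", "rdfs:domain", ":Director"),
    (":director_of", "rdfs:range", ":Movie"),
    (":Director", "rdfs:subClassOf", "foaf:Person"),
    (":Quentin_Tarantino", ":director_of", ":Pulp_Fiction"),
}
closure = rdfs_closure(graph)
for triple in sorted(closure - graph):
    print(triple)
```

From the single data triple, the rules derive that the subject is a `:Director` and a `foaf:Person`, and that the object is a `:Movie`.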


OWL axioms

Question 66: Nothing can be both a person and a movie.

:Person owl:disjointWith :Movie .

Question 67: Nothing can be more than one of a person, a role, or a movie.

[] a owl:AllDisjointClasses ;
    owl:members ( :Person :Role :Movie ) .
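
A single axiom over a list of classes is shorthand for pairwise disjointness. The sketch below only generates the equivalent pairwise `owl:disjointWith` statements as strings, to show what question 67 asks for compared with question 66; it does no reasoning.

```python
from itertools import combinations

classes = [":Person", ":Role", ":Movie"]

# One owl:disjointWith axiom for every unordered pair of classes.
pairwise_axioms = [
    f"{a} owl:disjointWith {b} ." for a, b in combinations(classes, 2)
]
for axiom in pairwise_axioms:
    print(axiom)
```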

Question 68: Something that plays in at least one Movie is an Actor.

[
    a owl:Restriction ;
    owl:onProperty :plays_in ;
    owl:someValuesFrom :Movie
] rdfs:subClassOf :Actor .

Question 69: A LeadActor is an Actor that plays at least one LeadRole.

:LeadActor rdfs:subClassOf :Actor, [
    a owl:Restriction ;
    owl:onProperty :plays_role ;
    owl:someValuesFrom :LeadRole
] .


Task 4: SPARQL queries

Question 70: Count the number of movies that are represented in the graph.

SELECT (COUNT(?movie) AS ?count) WHERE {
    ?movie rdf:type :Movie
}

Question 71: List the titles and years of all movies.

SELECT ?title ?year WHERE {
    ?movie rdf:type :Movie ;
        dc:title ?title ;
        :year ?year
}

Question 72: List the titles and years of all movies since 2000.

SELECT ?title ?year WHERE {
    ?movie rdf:type :Movie ;
        dc:title ?title ;
        :year ?year
    FILTER (xsd:integer(STR(?year)) >= 2000)
}

or, on engines that can compare xsd:year literals directly:

SELECT ?title ?year WHERE {
    ?movie rdf:type :Movie ;
        dc:title ?title ;
        :year ?year
    FILTER (?year >= "2000"^^xsd:year)
}

Question 73: List the titles and years of all movies sorted first by year, then by name.

SELECT ?title ?year WHERE {
    ?movie rdf:type :Movie ;
        dc:title ?title ;
        :year ?year
}
ORDER BY ?year ?title

Question 74: Count the number of movies for each year with more than one movie.

SELECT ?year (COUNT(?movie) AS ?count) WHERE {
    ?movie rdf:type :Movie ;
        :year ?year
}
GROUP BY ?year
HAVING (COUNT(?movie) > 1)
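
The grouping logic of question 74 can be mirrored in plain Python, which may help when checking a query result by hand. This is only a sketch: the rows are hypothetical, apart from the Pulp_Fiction / 1994 pair that appears in the programming examples.

```python
from collections import Counter

# Hypothetical (title, year) rows standing in for the ?movie / ?year bindings.
movies = [
    ("Pulp_Fiction", 1994),
    ("Example_Movie_A", 1994),
    ("Example_Movie_B", 2001),
]

# GROUP BY ?year with COUNT(?movie): one counter entry per year.
movies_per_year = Counter(year for _title, year in movies)

# HAVING: keep only the groups with more than one movie.
busy_years = {year: n for year, n in movies_per_year.items() if n > 1}
print(busy_years)
```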

Question 75: List the names of all persons that are both directors and actors.

SELECT DISTINCT ?name WHERE {
    ?person :plays_in / rdf:type :Movie ;
        :director_of / rdf:type :Movie ;
        foaf:name ?name
}

Question 76: List the actor name and movie title for all lead roles.

SELECT ?name ?title WHERE {
    ?role rdf:type :LeadRole ;
        ^:plays_role / foaf:name ?name ;
        :role_in / dc:title ?title
}

Question 77: List all distinct pairs of actor names that have played lead roles in the same movies.

SELECT DISTINCT ?name1 ?name2 WHERE {
    ?movie rdf:type :Movie ;
        ^:role_in ?role1, ?role2 .
    ?role1 rdf:type :LeadRole ;
        ^:plays_role / foaf:name ?name1 .
    ?role2 rdf:type :LeadRole ;
        ^:plays_role / foaf:name ?name2 .
    FILTER (STR(?name1) < STR(?name2))
}
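
The string comparison in the FILTER does two jobs: it drops pairs of an actor with themselves, and it keeps only one ordering of each pair. A small plain-Python sketch of the same idea, with hypothetical movie and actor names:

```python
# Hypothetical data: movie -> names of the actors with a lead role in it.
lead_actors = {
    "Example_Movie_A": ["Actor_X", "Actor_Y", "Actor_Z"],
    "Example_Movie_B": ["Actor_Y", "Actor_X"],
}

pairs = set()  # a set removes repeats, since the question asks for distinct pairs
for names in lead_actors.values():
    for name1 in names:
        for name2 in names:
            # Same role as FILTER (STR(?name1) < STR(?name2)) in the query.
            if name1 < name2:
                pairs.add((name1, name2))

print(sorted(pairs))
```

Actor_X and Actor_Y share two movies but appear only once in the result, and no pair appears in both orders.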

Question 78-79:
SPARQL Update TBD.


Task 5: Programming

Question 80:

from rdflib import Namespace, Graph, Literal, RDF, DC, FOAF, XSD

BASE_URI = 'http://example.org/'
MOVIE = Namespace(BASE_URI)

g = Graph()
g.bind('', MOVIE)
g.bind('dc', DC)
g.bind('foaf', FOAF)


Question 81:

def add_movie_triples(g, row):
    movie = row.to_dict()
    # example dict:
    # {'Movie': 'Pulp_Fiction', 'Director': 'Quentin_Tarantino', 'Year': 1994}
    movie_name = movie['Movie']
    director_name = movie['Director']
    movie_year = movie['Year']
    # update g with a set of triples that represent the movie and its director
    g.add((MOVIE[director_name], RDF.type, MOVIE.Director))
    g.add((MOVIE[director_name], FOAF.name, Literal(director_name)))
    g.add((MOVIE[director_name], MOVIE.director_of, MOVIE[movie_name]))
    g.add((MOVIE[movie_name], RDF.type, MOVIE.Movie))
    g.add((MOVIE[movie_name], DC.title, Literal(movie_name)))
    g.add((MOVIE[movie_name], MOVIE.year, Literal(movie_year, datatype=XSD.year)))
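
To see which triples one CSV row produces without installing pandas or rdflib, the mapping can be sketched with the standard library alone. This is an illustrative stand-in, not the expected exam answer: the helper `movie_triples` and the in-memory `CSV_TEXT` are hypothetical, the column names are taken from the example dict above, and triples are plain tuples of strings rather than rdflib terms.

```python
import csv
import io

BASE_URI = 'http://example.org/'

# In-memory stand-in for the CSV file; the single row reuses the example dict.
CSV_TEXT = "Movie,Director,Year\nPulp_Fiction,Quentin_Tarantino,1994\n"


def movie_triples(row):
    """Return the six triples for one CSV row as tuples of plain strings."""
    movie = BASE_URI + row['Movie']
    director = BASE_URI + row['Director']
    return [
        (director, 'rdf:type', BASE_URI + 'Director'),
        (director, 'foaf:name', row['Director']),
        (director, BASE_URI + 'director_of', movie),
        (movie, 'rdf:type', BASE_URI + 'Movie'),
        (movie, 'dc:title', row['Movie']),
        (movie, BASE_URI + 'year', row['Year']),
    ]


triples = []
for row in csv.DictReader(io.StringIO(CSV_TEXT)):
    triples.extend(movie_triples(row))

for triple in triples:
    print(triple)
```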


Question 82:

from pyshacl import validate

SHACL_FILE = './movie-shacl.ttl'
# contains the rule for "A Movie must have exactly one dc:title."


# ...


sg = Graph()
sg.parse(SHACL_FILE, format='ttl')
# validate() returns (conforms, results graph, results text)
conforms, results_graph, results_text = validate(
    g,
    shacl_graph=sg,
    # ont_graph=og,
    inference='rdfs',
)
print(results_text)


Question 83:
TBD.

Question 84:

from owlrl import DeductiveClosure, OWLRL_Semantics

ONTOLOGY_FILE = './movie-ontology.ttl'

# ...

g.parse(ONTOLOGY_FILE)
DeductiveClosure(OWLRL_Semantics).expand(g)

print(g.serialize(format='ttl'))


Full program

For reference, not expected exam answer:

from owlrl import DeductiveClosure, OWLRL_Semantics
import pandas as pd
from pyshacl import validate
from rdflib import Namespace, Graph, Literal, RDF, DC, FOAF, XSD


ONTOLOGY_FILE = './movie-ontology.ttl'
SHACL_FILE = './movie-shacl.ttl'
DIRECTOR_FILE = './movie-director-year.csv'
LEAD_ROLE_FILE = './movie-actor-lead-role.csv'
OTHER_ROLE_FILE = './movie-actor-other-role.csv'

BASE_URI = 'http://example.org/'
MOVIE = Namespace(BASE_URI)


def add_movie_triples(g, row):
    movie = row.to_dict()
    # example dict:
    # {'Movie': 'Pulp_Fiction', 'Director': 'Quentin_Tarantino', 'Year': 1994}
    movie_name = movie['Movie']
    director_name = movie['Director']
    movie_year = movie['Year']
    # update g with a set of triples that represent the movie and its director
    g.add((MOVIE[director_name], RDF.type, MOVIE.Director))
    g.add((MOVIE[director_name], FOAF.name, Literal(director_name)))
    g.add((MOVIE[director_name], MOVIE.director_of, MOVIE[movie_name]))
    g.add((MOVIE[movie_name], RDF.type, MOVIE.Movie))
    g.add((MOVIE[movie_name], DC.title, Literal(movie_name)))
    g.add((MOVIE[movie_name], MOVIE.year, Literal(movie_year, datatype=XSD.year)))


def add_lead_role_triples(g, row):
    movie = row.to_dict()
    # expected keys in the dict: 'Movie', 'Actor', 'LeadRole'
    movie_name = movie['Movie']
    actor_name = movie['Actor']
    role_name = movie_name+'-role-'+movie['LeadRole']
    # update g with a set of triples that represent the actor, the lead role and the movie
    g.add((MOVIE[actor_name], RDF.type, MOVIE.Actor))
    g.add((MOVIE[actor_name], FOAF.name, Literal(actor_name)))
    g.add((MOVIE[actor_name], MOVIE.actor_in, MOVIE[movie_name]))
    g.add((MOVIE[actor_name], MOVIE.plays_role, MOVIE[role_name]))
    g.add((MOVIE[role_name], RDF.type, MOVIE.LeadRole))
    g.add((MOVIE[role_name], FOAF.name, Literal(movie['LeadRole'])))
    g.add((MOVIE[role_name], MOVIE.role_in, MOVIE[movie_name]))
    g.add((MOVIE[movie_name], RDF.type, MOVIE.Movie))


def add_other_role_triples(g, row):
    movie = row.to_dict()
    # expected keys in the dict: 'Movie', 'Actor', 'Role'
    movie_name = movie['Movie']
    actor_name = movie['Actor']
    role_name = movie_name+'-role-'+movie['Role']
    # update g with a set of triples that represent the actor, the role and the movie
    g.add((MOVIE[actor_name], RDF.type, MOVIE.Actor))
    g.add((MOVIE[actor_name], FOAF.name, Literal(actor_name)))
    g.add((MOVIE[actor_name], MOVIE.actor_in, MOVIE[movie_name]))
    g.add((MOVIE[actor_name], MOVIE.plays_role, MOVIE[role_name]))
    g.add((MOVIE[role_name], RDF.type, MOVIE.Role))
    g.add((MOVIE[role_name], FOAF.name, Literal(movie['Role'])))
    g.add((MOVIE[role_name], MOVIE.role_in, MOVIE[movie_name]))
    g.add((MOVIE[movie_name], RDF.type, MOVIE.Movie))


def load_movie_triples(g, fn):
    df = pd.read_csv(fn)
    df.apply(lambda row: add_movie_triples(g, row), axis=1)


def load_lead_role_triples(g, fn):
    df = pd.read_csv(fn)
    df.apply(lambda row: add_lead_role_triples(g, row), axis=1)


def load_other_role_triples(g, fn):
    df = pd.read_csv(fn)
    df.apply(lambda row: add_other_role_triples(g, row), axis=1)


g = Graph()
g.bind('', MOVIE)
g.bind('dc', DC)
g.bind('foaf', FOAF)

load_movie_triples(g, DIRECTOR_FILE)
load_lead_role_triples(g, LEAD_ROLE_FILE)
load_other_role_triples(g, OTHER_ROLE_FILE)
print(g.serialize(format='ttl'))


sg = Graph()
sg.parse(SHACL_FILE, format='ttl')
# validate() returns (conforms, results graph, results text)
conforms, results_graph, results_text = validate(
    g,
    shacl_graph=sg,
    # ont_graph=og,
    inference='rdfs',
)
print(results_text)


g.parse(ONTOLOGY_FILE)
DeductiveClosure(OWLRL_Semantics).expand(g)

print(g.serialize(format='ttl'))