CSE 414 Midterm Exam Autumn 2019
Hello, dear friend, you can consult us at any time if you have any questions, add WeChat: daixieit
CSE 414 Midterm Exam
Autumn 2019
October 30, 2019
● Please read all instructions (including these) carefully.
● Write your name and UW student number below.
● This is a closed-book exam. You are allowed one page of note sheets that you can write on both sides.
● No electronic devices are allowed, including cell phones used merely as watches. Silence your cell phones and place them in your bag.
● Solutions will be graded on correctness and clarity. Each problem has a relatively simple and straightforward solution. Partial solutions will be graded for partial credit.
● Please write your answers in the boxed space provided on the exam, and clearly mark your solutions. You may use the blank sides as scratch paper. Do not use any additional scratch paper.
● Relax. You are here to learn. Good luck!
Relational algebra operators:
Union U Difference — Selection σ Projection π Join ⨝
Rename ρ Duplicate elimination δ Grouping and aggregation ɣ Sorting τ
Database constraints:
Primary key, Foreign key, Unique, Not null
By writing your name below, you certify that you have not received any unpermitted aid for this exam, and that you will not disclose the contents of the exam to anyone in the class who has not taken it.
Problem 1: True/False (10 points total)
Select either True or False for each of the following questions.
a) SQL and Datalog are declarative query languages.
b) The value of a Primary Key for a relation uniquely identifies a tuple within that relation.
c) In the relational model, the field of a tuple can itself be a relation.
d) When using the aggregate COUNT(attr), tuples with NULL values in attr field will be added to the total count.
e) In the relational model, the schema defines both the name and type of each attribute.
f) In a single table’s schema, a Primary Key cannot also be a Foreign Key.
g) The SQL we have learned this quarter cannot express recursive queries.
h) Expressing recursive queries in Datalog requires negation.
i) The following Datalog rule is safe:
Q(a, b) :- R(a, _), ! S(b, a)
j) The following Datalog rule is safe:
Q(z) :- R(a, b), S(b, c), T(c, z), ! R(a, z)
Problem 2: SQL (32 points total)
We will work with the following schema for a database of movies in this exam.
ACTOR (pid, fname, lname, gender)
MOVIE (mid, name, year, revenue)
DIRECTORS (did, fname, lname)
CASTS (pid, mid, role)
MOVIE_DIRECTORS (did, mid)
A tuple in the Casts relation represents that a person from the Actor table with pid, starred (was cast) in a Movie with mid. Movie_Directors represents a person from Directors with did, who directed a Movie with mid.
ACTOR.pid, MOVIE.mid, DIRECTOR.did = primary keys of the corresponding tables CASTS.pid = foreign key to ACTOR.pid
MOVIE_DIRECTORS.did = foreign key to DIRECTORS.did CASTS.mid, MOVIE_DIRECTORS.mid= foreign keys to MOVIE.mid
You can assume none of the tables contain NULL values. You may use subqueries for these questions. When writing your queries you can assume that the system is case-insensitive.
a) (8 points) Write a query that returns all the roles played by actors named “Kevin Bacon” (first name Kevin, last name Bacon.) Your query should return the role field. Do not include duplicates roles.
SELECT DISTINCT c.role FROM Actor a, Casts c WHERE a.pid = c.pid AND a.fname = ‘Kevin’ AND a.lname = ‘Bacon’; |
Schema repeated here for your reference:
ACTOR (pid, fname, lname, gender)
MOVIE (mid, name, year, revenue)
DIRECTORS (did, fname, lname)
CASTS (pid, mid, role)
MOVIE_DIRECTORS (did, mid)
b) (8 points) Write a query that returns the first name and last name of all actors who starred in both the movies ‘In the Mood for Love’ and ‘Chungking Express’ . Only return actors who were in both movies, not just one of the movies. Duplicate names are allowed.
SELECT a.fname, a.lname FROM Actor a, Casts c1, Casts c2, Movie m1, Movie m2 WHERE a.pid = c1.pid AND c1.mid = m1.mid AND a.pid = c2.pid AND c2.mid = m2.mid AND m1.name = ‘In the Mood for Love’ AND m2.name = ‘Chungking Express’ ; |
c) (8 points) Find the total number of movies made in each year, and return the results sorted by the number of movies, in descending order. Return the year numbers and count of movies made in that year.
SELECT m.year, COUNT(*) FROM Movie m GROUP BY m.year ORDER BY COUNT(*) DESC ; |
Schema repeated here for your reference:
ACTOR (pid, fname, lname, gender)
MOVIE (mid, name, year, revenue)
DIRECTORS (did, fname, lname)
CASTS (pid, mid, role)
MOVIE_DIRECTORS (did, mid)
d) (8 points) Write a query that returns all directors who only directed movies alone (i.e. who never directed a movie together with another director.) For any directors with no movies at all, include them in the output. Return the did field for each director.
Solution #1: SELECT d1.did FROM Directors d1 WHERE NOT EXISTS (SELECT * FROM Movie_Directors md1, Directors d2, Movie_Directors md2 WHERE d1.did = md1.did AND d2.did = md2.did AND md1.mid = md2.mid AND d1.did != d2.did); Solution #2: SELECT d.did FROM Director d WHERE d.did NOT IN ( SELECT DISTINCT d1.did FROM Director d1, Director d2, MOVIE_DIRECTORS md1, MOVIE_DIRECTORS md2 WHERE d1.did = md1.did AND d2.did = md2.did AND d1.did != d2.did AND md1.mid = md2.mid); Solution #3: SELECT DISTINCT d.did FROM Directors d LEFT OUTER JOIN Movie_Directors md ON d.did = md.did WHERE md.mid IN (SELECT md2.mid FROM Movie_Directors md2 GROUP BY md2.mid HAVING COUNT(*) = 1); Solution #4: WITH Multi_Directed_Movies AS ( SELECT mid FROM Movie_Directors GROUP BY mid HAVING COUNT(*) > 1) SELECT did FROM Directors EXCEPT SELECT did FROM Movie_Directors md, Multi_Directed_Movies mdm WHERE md.mid = mdm.mid Solution #5: withOther(d1) :- Directors(d1,_,_), Movie_Directors(d1,m), Movie_Directors(d2,m), d1 != d2. answer(d) :- Directors(d,_,_), !withOther(d). |
Problem 3: Relational Algebra (12 points total)
We will use the same schema as problem 2, repeated here for your reference:
ACTOR (pid, fname, lname, gender)
MOVIE (mid, name, year, revenue)
DIRECTORS (did, fname, lname)
CASTS (pid, mid, role)
MOVIE_DIRECTORS (did, mid)
Note: you may use aliases in the query plan to avoid renaming, e.g.
⋈x.pid=y.pid
/ \
WorksOn x Project y a) (12 points) The following query lists all directors who directed at least 200 movies, in descending order of the number of movies they directed
SELECT d.fname, d.lname, COUNT(*) AS c
FROM Directors d, Movie_Directors md
WHERE d.did = md.did
GROUP BY d.did, d.fname, d.lname
HAVING COUNT(*) >= 200
ORDER BY c DESC;
Write a Relational Algebra expression in the form of a logical query plan (you may draw a tree)
that is equivalent to the above SQL query.
|
p. 7
Problem 4: Entity Relationship Diagrams (9 points total)
For this problem consider a database of musical artists, albums, and songs. Artists can be individuals like Adele, or bands like The Beatles. We have three entities and relationships between them:
● A Song is created by a single Artist
● Songs are published on albums
● An Album is released by a single Artist
● Is_On is a many-to-many relationship. Albums can have multiple songs, and songs can be on multiple albums (The Essential Beatles, The Beatles’ Greatest Hits, etc.) An Album must have at least one song.
● Creates is a many-to-exactly-one relationship from Song to Artist.
● Similarly, Releases is a many-to-exactly-one relationship from Album to Artist.
p. 8
(a) Assume we are trying to minimize the number of tables we use to represent the above entities.
Fill in the create table statements for the Song and Album relations, making sure to preserve the constraints in the diagram above and trying to minimize the number of tables we will need in total. The underlined attributes above are the primary keys of the respective entity sets. The types of arid, sid, and alid are INT, and all other attributes are VARCHAR(100).
(8 points)
CREATE TABLE Song ( sid INT PRIMARY KEY, name VARCHAR(100), arid INT NOT NULL, FOREIGN KEY arid REFERENCES Artist ); CREATE TABLE Album ( alid INT PRIMARY KEY, name VARCHAR(100), label VARCHAR(100), arid INT NOT NULL, FOREIGN KEY arid REFERENCES Artist
); |
(b) Do we need a table for the “Is_On” relationship? Write a sentence explaining why/why not. (1 point)
Yes, we need a table for Is_On to store all the tuples in a many-to-many relationship. |
Problem 5: Functional Dependencies (12 points total)
Assume we have a relation R and its attributes in parentheses, and we know that it satisfies the following functional dependencies:
R ( A, B, C, D, E )
A - > BC
D - > E
B - > E
CD - > A
For the functional dependencies below, you will decide if we can derive them from the information above. Circle YES if we can guarantee that R satisfies the functional dependency given the knowledge we have. If not, circle NO. (2 points each)
a)
A - > A
YES NO
b)
A - > AD
YES NO
c)
d)
e)
f)
AD - > A
2023-06-01