Hello, dear friend, you can consult us at any time if you have any questions, add WeChat: daixieit

Tuesday 27 April 2021

Database Theory and Application M

COMPSCI 5076

Part A: Relational Modelling & Normalisation [20 Marks]

Question 1.

Consider the relation R(A, B, C, D, E) with the following Functional Dependencies (FDs):

         FD1: {A, B, C, D} >E

         FD2: E > {B, C, D}

(a)     Identify the super key(s) of the relation R and find the candidate key. Explain your

(b)    Decompose the relation R into a set ofrelations in BCNF and identify their Primary and Foreign Keys. Explain your answer.

Question 2.

Consider the relation Employee(NumberPlate, GUID, Surname), where it stores information about employees of the University of Glasgow. The NumberPlate is a unique identifier of a car, while GUID is a unique identifier of an employee. The pair  {GUID, NumberPlate} is unique for each tuple in the relation with the following additional constraint: for an employee X (with a unique GUID) and a car Y (with a unique number plate) it holds true that: the employee  X  is  associated  with  only  one  car Y,  and  that  car Y  belongs  only to the corresponding employee X.

E.g., if the employee X with GUID ‘1234’ has the car Y with number plate ‘SC69BBZ’, then, there is neither another employee with the same car (‘SC69BBZ’) nor another car belonging to the employee X (‘1234’).

The SQL CREATE statement for the relation Employee is defined by a data analyst as:

CREATE TABLE EMPLOYEE (

Surname

GUID

VARCHAR(30) NOT NULL,

CHAR(10)    NOT NULL UNIQUE,

NUMBERPLATE CHAR(8)     NOT NULL UNIQUE,

);

(a) Based on the above-mentioned constraint, list up to three super keys, define a Primary Key (PK) for the relation Employee, and update the CREATE TABLE statement to accommodate the PK constraint. Explain your answer.

(b)     Identify the Functional Dependencies  (FDs)  in the relation.  Is the relation in BCNF? Explain your answer.

(c)     Assume that 90% ofthe employees do not have a car. In this case, identify potential challenges that can be raised due to the schema of the relation Employee and propose an alternative relational schema that can cope with these challenges. In the alternative relational schema, identify the Primary Keys, Foreign Keys (if exist), the Unique attributes (if exist) and the FDs. Do the FDs from Question 2(b) hold true in your proposed alternative relational schema? Explain your answer.

Part B: SQL [20 Marks]

Question 3. Consider the following relational schema:

Manufacturer(ID, Name)

Car(NumberPlate, Price, Year, MID)

where, a manufacturer has a unique identifier (ID) and a name, while, a car has a unique number plate, a price (in British pound sterling - £), a year of production and is associated with a manufacturer. The MID attribute in the Car relation is a foreign key referencing to the ID primary key attribute in the Manufacturer relation. A manufacturer can produce more than one car. The primary keys are underlined.

(a)     Which is the expected result of each ofthe following queries? Explain your answer.

(i)                                                                                                    SELECT * FROM CAR

WHERE  EXISTS (SELECT 0 FROM CAR);                                 (ii)                                                                                          SELECT * FROM CAR

WHERE  EXISTS (SELECT * FROM MANUFACTURER WHERE NULL IS NOT NULL); (iii)                                                                                           SELECT * FROM CAR

WHERE NOT EXISTS (SELECT * FROM CAR WHERE 1 <> NULL);

(b)    For each manufacturer, show the price of its most  expensive car(s)  using the

GROUP BY clause in your SQL query. It is possible that more than one car has the same price.

(c)     For each manufacturer, show the price of its most expensive car(s) without using the GROUP BY clause in your SQL query. It is possible that more than one car has the same price.

(d)    How many manufacturers have produced cars since 2010 (year of production)?   

Note: do not include manufacturers which have produced some cars before 2010. All the cars of a manufacturer should have been produced after 2010 (including 2010).

(e)     From each manufacturer, which has produced more than 100 cars, how many of these cars have a price greater than £50,000.

Part C: Relational Algebra & Heuristic Optimization [20 Marks] Question 4.

Consider the relations:

EMPLOYEE(ID, SURNAME, LINE_ID)

CAR(EID, MAKE)

For each employee, we store their unique ID and the ID of their line manager (if exists) LINE_ID. If an employee has a line manager then the corresponding LINE_ID is not NULL. A line manager does not have a line manager, i.e., the corresponding LINE_ID is NULL. The LINE_ID is a foreign key referencing to the ID primary key in the relation Employee. In the relation Car, we store the car(s) make of an employee, if exist(s). Specifically, for each car, we store its make (e.g., BMW, Jaguar, Aston Martin, Toyota) and the ID of its owner/employee (EID). The EID is a foreign key referencing to the ID in the relation Employee. The primary

keys are underlined.

Assume the following query:

SELECT E.SURNAME, L.SURNAME

FROM   EMPLOYEE E, EMPLOYEE L, CAR C

WHERE  E.LINE_ID = L.ID AND C.EID = L.ID

We know that: (i) there are 20 line managers, each one being the line manager of 10 employees; (ii) there are 120 employees, who have 1 car each, i.e., the CAR relation has 120 tuples, each one corresponds to different employee.

(a)     Provide the canonical tree of the query and estimate the expected number of tuples retrieved.

(b)     Optimize the query using heuristic optimization by providing the optimal tree. Show and explain the heuristic rules you used.

(c)     Provide the optimal relational algebra expression from the Question 4(b).

(d)    Estimate the expected number of tuples retrieved at each operator in your optimal tree from the Question 4(b). Compare this number with the Question 4(a).