Hello, dear friend, you can consult us at any time if you have any questions, add WeChat: daixieit

Computer Science 378: Introduction to Database Management

Assignment 3

Due: After Unit 8

Total: 100 marks

Weight: 15% of your final grade

Answer all questions in this assignment. Submit your completed work to your Academic Expert for marking and feedback using this Assignment Drop Box.

1.   (12 marks) Answer the following questions (250 words max/question).

a.    What are the typical integrity controls performed in both data integrity and referential integrity?

b.   Using an example for each situation, illustrate the three common situations that suggest relations should be denormalized.

c.   What are the advantages and disadvantages of horizontal and vertical partitioning?

2.   (9 marks) Answer the following questions (250 words max/question).

a.    What factors should be considered when choosing a file organization?

b.   What is the purpose of clustering data in a file?

c.    Compare hashed file organization versus indexed file organization. List two advantages of indexed over hashed and two advantages of hashed over indexed.

3.   (12 marks) Consider the following database project with these tables:

Employee (emp-no, name, department, salary),

Project (proj_no, name, location, budget, manager_emp_no)

ProjAssigned (emp-no, proj-no, worked-hours)

a.    Write a SELECT SQL query to list the numbers and names of all employees with a salary greater than 66000 who are assigned to projects, and list the project name and location they are assigned  to as well as the corresponding hours worked for each project. Your list should be sorted by

employee name.

b.   Define indexes on selected attributes to speed up your query. Justify your selections.

c.    Write SQL queries to create the indexes you defined above.

4.   (9 marks) Suggest the most appropriate security measures for each situation described below.

a.    The Western Union bank uses an electronic funds transfer (EFT) system to transmit sensitive financial data between its branches all over the world.

b.   A fighter jet simulation company has set up an off-site computer-based training centre for the

F-35. The company wishes to restrict access to the site to authorized employees. Since each

employee's use of the centre is occasional, it does not wish to provide the employees with

permanent keys to access the centre. (Assume the company stores employee information as well as assigned keys on a database.)

c.   A golf course uses a simple password system to protect its database. They created a new website to allow both its members and its employees to use the new web-based system to access and

update information. They find it needs a more comprehensive security system to grant different privileges (such as read-only versus create or update) to different users.

5.   (9 marks) Suggest an appropriate recovery technique a database administrator could use to resolve each of the following situations.

a.    A disk drive fails during regular operations while an employee is entering data about newly registered students.

b.   The registration office at a university entered an incorrect amount for a student tuition payment. The error was discovered by the financial services department several weeks later.

c.    The database administrator of a financial institution performed a full database backup but forgot to activate the journalizing facility. After, data entry clerks at the financial institution entered

transactions for two hours before the database became corrupt. It is discovered that the journalizing facility of the database has not been activated since the backup was made.

6.   (12 marks) Answer the following questions (250 words max/question).

a.    Contrast data warehouse versus data mart.

b.   Identify the major differences between the following forms of data integration: data federation and data propagation.

c.    In a diagram, present the main components of the Hadoop architecture and show their interconnections.

d.   Contrast the following types of data analytics: descriptive analytics, predictive data analytics, and prescriptive analytics.

7.   (13 marks) Consider the following dimensions, dimension attributes, and dimension sizes for Farm Coop-Insurance:

o Member(member_ID, Name, Address). On average, there are two members for each policy and item covered by the policy.

o InsuredItem(Item_ID, Description, Coverage_Type). There is an average often covered items per policy.

o CoopOffice(Office_ID, Address, Manager_name). Each policy is registered and managed by only one office of the cooperative.

o Policy(Policy_ID, Type). The company has approximately one million policies at the present time. Approximately five percent of these policies experience some change each month.

o Period(Date_Key, Fiscal_Period). The length of the fiscal period is one month. The decision system reports are supposed to be based on five years of data.

o Claim(Claim_ID, Claim_Description, Claim_Type).

The facts to be recorded for each combination of these dimensions are Policy_Premium, Deductible, and Monthly_Claim_Total.

a.    Design a star schema for this problem. (5 marks)

b.   Using the assumptions stated above, estimate the number of rows in the fact table. (4 marks)

c.    Estimate the total size of the fact table (in bytes), assuming an average of 5 bytes per field. (4 marks)

8.   (12 marks) For each of the following applications, state which type of NoSQL DBMS would fit the needs of the situation. Justify your answer.

a.    A database that needs to support a relatively complex hierarchical internal records structure that may vary for each record

b.   A database where the data is particularly well suited to be organized as a network of associations between the data items

c.   A database for which the DMS needs to provide quick access to each of the records by key value, but in addition to that, must also allow easy access to the components of each record

9.   (12 marks) For each of the following data analytics situations, state which type of analytics would address the needs of the organization. Justify your answer.

a.    National Bank of Canada, which is a midsize bank in Canada, would like to provide decision support analytics for its agents in the lending department.

b.   President’s Choice, which is a large retail chain in North America, wants to understand what factors contributed to the overall sales decline it experienced in a specific month of last year.

c.   Ubisoft, which is a video game company, wants to provide all its employees with an up-to-date graphical representation of the number of downloads of its game products and in-app revenue   associated with them.