Hello, dear friend, you can consult us at any time if you have any questions, add WeChat: daixieit

COMS W4111: Introduction to Databases

Spring 2024, Sections 002/V02

Homework 1

Introduction to Core Concepts, ER Modeling, Relational Algebra, SQL

Introduction

This notebook contains Homework 1. Both Programming and Nonprogramming tracks should complete this homework.

Submission Instructions

You will submit PDF and ZIP files for this assignment. Gradescope will have two separate assignments for these.

For the PDF:

The most reliable way to save as PDF is to go to your browser's menu bar and click  File -> Print . Switch the orientation to landscape mode, and hit save.

MAKE SURE ALL YOUR WORK (CODE AND SCREENSHOTS) IS VISIBLE ON THE PDF. YOU WILL NOT GET CREDIT IF ANYTHING IS CUT OFF. Reach out for troubleshooting.

MAKE SURE YOU DON'T SUBMIT A SINGLE PAGE PDF. Your PDF should have multiple pages.

For the ZIP:

Zip a folder containing this notebook and any screenshots.

You may delete any unnecessary files, such as caches.

Add Student Information

In  [  ]:

# Print your name, uni, and track below

name = "Donald Ferguson"

uni = "dff9"

track = "Programming Track"

print(name)

print(uni)

print(track)

Setup

SQL Magic

The  sql extension was installed in HW0. Double check that if this cell doesn't work.

In  [  ]:

%load_ext sql

You may need to change the password below.

In  [  ]:

%sql mysql+pymysql://root:dbuserdbuser@localhost

In  [  ]:

%sql SELECT * FROM db_book.student WHERE ID = 12345

Python Libraries

In  [  ]:

from IPython.display import Image

import pandas

Written Questions

Chapter 1 from the recommended textbook Database System Concepts, Seventh Edition (https://codex.cs.yale.edu/avi/db-book/) covers general information   and concepts about databases and database management systems. Lecturing on the general and background information is not a good use of precious class time. To be more efficient with class time, the chapter 1 information is a reading assignment.

Answering the written questions in HW 1, Part 1 does not require purchasing the textbook and reading the chapter. The chapter 1 slides

(https://codex.cs.yale.edu/avi/db-book/slides-dir/index.html) provided by the textbook authors provide the necessary information. In some cases, students may

also have to search the web or other sources to “read” the necessary information.

When answering the written questions, do not “bloviate”. The quantity of words does not correlate with the quality of the answer. We will deduct points if you are not succinct. The answers to the questions require less than five sentences or bullet points.

If you can't explain something in a few words, try fewer.”

You may use external resources, but you should cite your sources.

W1

What is a database management system and how do relational databases organizedata?

W2

Columbia University uses several applications that use databases to run the university. Examples are SSOL and CourseWorks. An alternate approach could be letting students, faculty, administrators, etc. use shared Google Sheets to create, retrieve, update, and delete information. What are some problems with the

shared spreadsheet approach and what functions do DMBS implement to solve the problems?

W3

Explain the differences between SQL, MySQL Server and DataGrip.

W4

Crow’s Foot Notation has four endings for relationship lines. Briefly explain the meaning of each ending.

W5

What is a primary key and why is it important?

W6

The relational algebra is closed under the operators. Explain what this means and give an example.

W7

Some of the Columbia University databases/applications represent the year/semester attribute of a section in the form "2023_2". The first four characters are the academic year, and the last character is the semester (1, 2, or 3). The data type for this attribute might be CHAR(6). Using this example, explain theconcepts of domain and atomic domain. How is domain different from type?

W8

Briefly explain the difference between a database schema and database instance.

W9

Briefly explain the concepts of data definition language and data manipulation language.

W10

What is physical data independence?

Entity-Relationship Modeling

Overview

The ability to understand a general description of a requested data model and to transform into a more precise, specified logical model is one of the most important skills for using databases. SW and data engineers build applications and data models for end-users. The end-users, product managers and business managers are not SW or data modeling experts. They will express their intent in imprecise, text and words.

The users and business stakeholder often can understand and interact using a conceptual model but details like keys, foreign keys, ... are outside their scope.

In this problem, you will:

Understand a short written description of a requested data model.

Produce a conceptual data model diagram using Lucidchart.

Produce a logical data model diagram using Lucidchart.

You can sign up for a free Lucidchart account. (https://www.lucidchart.com/pages/landing) The free account provides the capabilities you will need for this course.

To draw the diagrams, you need to add the entity relationship shapes. Lecture 2 demonstrated how to add the shapes.

Adding Entity Relationship Shapes

We provide a simple Lucidchart document (https://lucid.app/lucidchart/828777b1-7b2d-4828-bedb-37b6d456c33e/edit?invitationId=inv_a142899a-7e60-44e9- b18e-335d7c9767fc) from Lecture 2 that helps you get started. You need a Lucidchart account to access the document and diagrams.

Data Model Description

The data model represents banks, customers. employees and accouts. The model has the following entity types/sets:

1. Customer

2. Employee of the banking company

3. Branch, which is a location of one of the banks offices

4. Savings Account

5. Checking Account

6. Loan

7. Portfolio

Customer has the following properties:

customerID

lastName

firstName

email

dateOfBirth

Employee has the following properties:

employeeID

lastName

firstName

jobTitle

Branch has the following properties:

branchID

zipCode

Savings Account has the following properties:

accountID

balance

interestRate

Checking Account has the following properties:

accountID

balance

Loan has the following properties.

loanID

balance

interestRate

Portfolio has the following properties:

portfolioID

createdDate

The data model has the following relationships:

Customer Branch connects a customer and a branch. A Customer is connected to exactly one Branch. A Branch may have 0, 1 or many customers.

Employee Branch connects an employee and a branch. An Employee is connected to exactly one Branch. A Branch may have 0, 1 or many associated employees.

Savings Account Branch , Checking Account Branch_, and__Loan Branch all have the same pattern.

An account/loan has exactly one branch.

A Branch many have 0, 1 or many accounts/loans.

Savings Customer , Checking Customer , Loan Customer_, and__Portfolio Customer follow the same pattern.

The account/loan has exactly one customer.

The customer may have 0 or 1 of each type of account.

A Portfolio is related to exactly one Customer_, exactly one _Savings Account_, exactly one _Checking Account_, and exactly one _Loan.

Portfolio Advisor relates a Portfolio and Employee. An Employee may be the advisor for 0, 1 or many Portfolios. A Portfolio may have at most one Employee advisor.