Hello, dear friend, you can consult us at any time if you have any questions, add WeChat: daixieit

Homework 3 | Advanced SQL

Objectives: To practice advanced SQL. To get familiar with commercial database management systems (SQL Server) and using a database management system in the cloud (SQL Azure).

Resources

●   SQL Server on Windows Azure through SQL Azure

○   SQL Server Management Studio has been installed on the CSE lab andVDI machinesif you would like to use that instead of Azure's web interface.

Before You Begin

Introduction

This homework is a continuation of HW1 and HW2, but with the following differences

●   The queries are more challenging

●   You will use a commercial database system (i.e., no more SQLite)

○   SQLite simply cannot execute these queries in any reasonable amount of time; hence, we will use SQL Server, which has one of the most advanced query      optimizers

●   You will use the Microsoft Azure cloud

Problem Set

These instructions are a subset of HW2’s.  Please review them carefully:

●   The predicates in your queries should correspond to the exact English descriptions in the question.

○    For example, if a question asks you to find flights by Alaska Airlines Inc.”, your   query should check for that specific string instead of its matching cid (eg, do not use cid=’AS’).

○    For example, if a question asks you to find flights on Tuesday”, your query        should check for that specific string instead of its corresponding did (eg, do not use did=2).

●   Return the output columns exactly as indicated.  Do not change the output column names, the output order, or return more/fewer columns.

●   Unless otherwise noted, include canceled flights in your output.

●   When asked to output specific durations, report them in minutes instead of doing the minute-to-hour conversion.

1.   (15 points)  (Output relation cardinality: 334 rows)

For each origin city, find the destination city (or cities) with the longest direct flight (by   direct flight, we mean a flight with no intermediate stops). Judge the longest flight using duration, not distance. (15 points)

Name the output columns  origin_city, dest_city, and time (the flight duration). Do not include duplicates of (origin city, destination city) pairs. Order the result by         origin_city and then dest_city (ascending, i.e. alphabetically).

2.   (15 points)  (Output relation cardinality: 109 rows)

Find all origin cities that only serve flights shorter than 3 hours. You should not include canceled flights in your determination.

Name the output column city and sort them in ascending order alphabetically. List each city only once in the result.

3.   (15 points)  (Output relation cardinality: 327 rows)

For each origin city, find the percentage of departed flights whose duration is shorter than 3 hours; canceled flights do not count as having departed.  Be careful to handle cities which do not have any flights shorter than 3 hours; you should return 0 as the  result for these cities, not NULL (which is shown as a blank cell in Azure).

Name the output columns origin_city and percentage. Order by percentage value, then city, ascending. Report percentages as percentages, not decimals (e.g., report 75.2534 rather than 0.752534). Do not round the percentages.

4.   (15 points)  (Output relation cardinality: 256 rows)

List all cities that can be reached from Seattle using exactly one stop.  In other words, the flight itinerary should use an intermediate city, but cannot be reached through a    direct flight.  Do not include Seattle as one of these destinations (even though you could get back with two flights).

Name the output column city. Order the output ascending by city.

5.   (15 points)  (Output relation cardinality: 3 or 4 rows, depending on what you consider to be the set of all cities)

List all cities that can be reached from Seattle, but which require two intermediate stops or more.  Warning: this query might take a while to execute; we will learn about how to speed this up in lecture. You can assume all cities to be the collection of all                   origin_city or all dest_city.

Name the output column city. Order the output ascending by city.

6.   (7 points)  (Output relation cardinality: 4 rows)

List the names of carriers that operate flights from Seattle to San Francisco, CA.  Return each carrier's name only once, and use a nested query to answer this question. (7         points)

Name the output column carrier. Order the output ascending by carrier.

7.   (8 points)  (Output relation cardinality: 4 rows)

Express the same query as above, but do so without using a subquery.

As before, name the output column carrier. Order the output ascending by carrier.

8.   (10 points)

The DBMS that we use in this assignment, SQLServer, is running somewhere in one of Microsoft's data centers. Comment on your experience using a cloud-hosted DBMS.     Would you recommend it to a fellow student for tinkering/experimentation?  What about recommending it to future co-workers for their project?  If your answer differed, why?

Save your answer in a file called hw3-d.txt in the submission directory.

Submission Instructions

You should submit the questions on Gradescope under HW3.  You can resubmit however many times until the due date, which will save your progress without submitting all answers.  If you     don’t have access to Gradescope or are having issues with submitting, contact course staff as   soon as possible.

For each question in the problem set, write a single SQL query or paragraph.  Each answer should be in a separate file, named hw3-q#.sql (eg, hw3-q1.sql, hw3-q2.sql, etc) or    hw3-d.txt for the reflection question.  Each .sql file should include:

●   A comment indicating the number of rows your query returns

●   A comment indicating how long the query took

●   A comment containing the first 20 rows of the result

○    If the result has fewer than 20 rows, output all of them

We want you to succeed in this class, so we strongly recommend that you verify your filenames before your final submission; our autograders hardcode the expected         filenames, and your submission will receive no points if they do not match.