Hello, dear friend, you can consult us at any time if you have any questions, add WeChat: daixieit

ECON215 Analysis of Big Data: programming, data management, and visualisation

Lab 6

Semester 1

Lab 6 focuses on SQL, with exercises on JOIN and GROUP BY separately in Task 1 and then together in Task 2. Task 2 also includes some practice with the HAVING clause, which filters results that have been aggregated using GROUP BY.

Task 1: SQL exercises on JOIN and GROUP BY separately

(i) : Using the Customers and Orders tables, and without using aliases, return a table of orders with their corresponding customer names.

(ii) : Using the Customers and Orders tables, and without using aliases, return a table of orders with their corresponding customer names, ordering the results by the order ID.

(iii) : Using the Customers and Orders tables, return a table of orders with their corresponding customer names, but this time with the customer names being in the first column, and with the results ordered by customer name.

Moreover, the query should use the aliases ”C” and ”O” for the Customers and Orders tables, respectively.

(iv) : Using the Customers and Orders tables, can you find out the name of the customer who/that placed the first ever order? (NOTE: due to the presence of NULL entries, where two customers have not yet placed an order, a ”SELECT TOP 1” query will not work - instead, just try to produce a query where the answer is easily seen in the results table).

(v) : Using the Orders and OrderDetails tables, write a query to return the following columns: OrderID, OrderDetailID, CustomerID, ProductID, Quantity, and UnitPrice.

Moreover, the aliases ”o” and ”od” should be used for the Orders and OrderDetails tables, and the results should be ordered by OrderID.

Note that each OrderID includes multiple products, each with its own ProductID, which is why the OrderDetails table needs to have its own primary key ”OrderDetailID”.

(vi) : Using the Employees and Orders tables, write a query to return the following two columns: EmployeeID and OrderID.

Moreover, the results should be ordered by EmployeeID, and the aliases ”e” and ”o” should be used for the Employees and Orders tables, respectively.

(vii) : Using the OrderStatus and Orders tables, return the records for the order status and order ID. In particular, there should be the following two columns: StatusName (from the Or-derStatus table) and OrderID.

Moreover, the results should be ordered by the order status (StatusName), and the aliases ”os” and ”o” should be used for the OrderStatus and Orders tables, respectively.

(viii) : Using the Orders table, return a table of results for the total number of orders placed by each customer, displaying CustomerID and the corresponding number of orders.

(ix) : By building on the query in (viii), can you return the customer ID and total orders for the customer who made the most orders?

Task 2: SQL exercises on JOIN and GROUP BY together, and sometimes also with HAVING

(i) : Using the Customers and Orders tables, can you return a table with two columns: Cus-tomerName, and the total number of orders made by that customer? Why might the results be a little misleading?

(ii) : Using the Orders and OrderDetails tables, return results for the total quantity of products ordered for each customer (counting across all the different products they ordered), displaying this for each customer ID.

There should be the following two columns: CustomerID, and the sum of quantities ordered.

Moreover, the aliases ”o” and ”od” should be used for the Orders and OrderDetails tables.

(iii)* : Building on the query in (ii), how could you ensure that only customers who have placed orders are included in the result?

(iv) : Using the Employees and Orders table, return the total number of orders placed by each (Northwind Traders) employee. There should be the following two columns: EmployeeID and the number of orders.

Moreover, the aliases ”e” and ”o” should be used for the Employees and Orders tables.

(iv) : Using the OrderStatus and Orders tables, return a table with the following two columns: StatusName (from the OrderStatus table) and the number of orders with that status.

Moreover, the aliases ”o” and ”os” should be used for the Orders and OrderStatus tables.

(v) : Building on the query in (iv), can you add a line so that only the number of orders with status ’New’ are returned?

(vi) : Building on the query in (v), can you return the number of orders with status either ’Paid’ or ’Invoiced’?

(vii) : Using the Products and OrderDetails tables, return two columns: ProductName and the total sales value (UnitPrice * Quantity for each product).

Moreover, the aliases ”p” and ”od” should be used for the Products and OrderDetails tables, respectively.

Hint: SUM(od.UnitPrice * od.Quantity)

(viii)* : Building on the query in (v), how could you ensure that you only include products that have been ordered?