Hello, dear friend, you can consult us at any time if you have any questions, add WeChat: daixieit

INFS5710

Homework #3

· The total mark for this assignment is 100.

· This homework assignment accounts for 3% of the total marks for this course.

· This homework is due in your lecture class in Week 5 (either 27 March or 28 March as appropriate).

· Please refer to the document “Homework Submission Requirements” to prepare your submission.

The following three questions refer to four data sets regarding the company Orion Star: employee_master, customer, orders, and products. All of them can be found in your OrionDB folder.

Question 1 (35%)

Summer is coming! Orion Star would like to target at teenager customers. Before the company launches a marketing campaign, your manager asks you to provide the list of products whose customers are teenagers.

Write a query to list the products and for each product find the average age of the customers (when they made the orders) that falls between 13 and 19 years old. Your solution should look like the following table. Sort your data based on customer’s age in increasing order. Submit the first and last 10 rows if it has more than 20 rows.

Product Name

Customer_Avg_Age

 

xx.xx

 

xx.xx

 

xx.xx

 

xx.xx

Question 2 (35%)

Supplier Eclipse Inc is a partner of Orion Star. To encourage employees to sell Eclipse products, Orion Star is paying a commission to each employee who has sold Eclipse products. The commission is 5% of the total profit that an employee has helped the company to make from selling Eclipse products.

Write a query to list the employees and their total commissions. Your result should look like the following table. Sort your data based on Employee_ID in increasing order. Submit the first and last 10 rows if it has more than 20 rows.

Employee_ID

Commission

 

$xxxx.xx

 

$xxxx.xx

 

$xxxx.xx

 

$xxxx.xx

Question 3 (30%)

Supplier Eclipse Inc is recalling their products. Eclipse asks your assistance to provide the list of customers who bought their products delivered in 2008 and whose first three digits of product ID are 220.

Write a query to list the information of the customers who should be notified for the recall. Your result should look like the following table. Sort your data based on Customer_ID in increasing order. Submit the first and last 10 rows if it has more than 20 rows.

Customer ID

Product ID

Customer Name

Customer Address

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 


Homework #3 Solutions

Question 1 (35%)

 proc sql;

select p.Product_Name, mean((o.Order_Date-c.Birth_date)/365.25) as age 'Customer_Avg_Age' format=5.2

from orion.orders as o

inner join orion.customer as c

on o.Customer_ID=c.Customer_ID inner join

orion.products as p

on p.Product_ID = o.Product_ID group by p.Product_Name

having 13 <= calculated age <= 19

order by calculated age;

quit;

First 10 rows

Last 10 rows

Many of you put the condition 13 <= calculated age <= 19 in the WHERE clause and had a wrong answer. What’s the difference? FROM and WHERE tell SAS where to search data. If you put the above condition in WHERE, you are looking for the average age of the customers whose ages are between 13 and 19. This is wrong because you want SAS to find the average age of all customers.

What will happen if you put this condition in the HAVING clause. Note that HAVING comes with GROUP BY, and the GROUP BY clause is meaningful only when you use some summary function in SELECT. Image there is a big table that SAS was working on, in each row you have one product name, one customer who bought the product, and the customer’s age. Because many customers bought the same product, you use the GROUP BY clause <product name> to tell SAS for each product name, perform the summary function, which is avg(.) of customers’ ages in this case. At the end, the rows corresponding to the same product name were merged together. The resultant table has all products and their corresponding average customer ages, some may be between 13 and 19, some may not.


The HAVING clause asks SAS to display which portion of the resultant table. In this case we are only interested in the products whose average customer ages are between 13 and 19. If you still have questions, let me know.

Question 2 (35%)

 proc sql;

select e.Employee_ID, sum(o.Profit)*0.05 'Commision' format=dollar7.2

from orion.orders as o

inner join orion.products as p

on p.Product_ID = o.Product_ID inner join orion.employee_master as e

on e.Employee_ID=o.Employee_ID where p.Supplier_Name='Eclipse Inc'

group by e.Employee_ID order by e.Employee_ID; quit;

The result table (less than 20 rows) is given below.

Question 3 (30%)

 proc sql;

select c.Customer_ID, p.Product_ID, c.Customer_Name, c.Customer_Address

from orion.orders as o

inner join orion.products as p

on p.Product_ID=o.Product_ID inner join

orion.customer as c

on o.Customer_ID=c.Customer_ID

where p.Supplier_Name='Eclipse Inc' and '01Jan2008'd <= o.Delivery_Date <= '31Dec2008'd and 220 <= p.Product_ID/1000000000 < 221

order by c.Customer_ID;

quit;