BAN210 – Final Assessment
The Final Assessment is available on Blackboard from December 3rd, 2021. The deadline to submit the Final Assessment is December 13th, 2021 at 12:00pm.
Please note that a late penalty of 10% per day will be applied should the deadline not be met.
I will be available to answer questions via email and Microsoft Teams from December 3rd, 2021 to December 13th, 2021 at 12:00pm.
Should you have technical issues, please contact the ITS service desk at [email protected], or log in to our Technician’s Microsoft Teams site (available 08:00 AM – 05:00 PM EST).
The Final Assessment is worth 20% of your final grade.
Academic Integrity
The Final Assessment must be completed individually.
By submitting the Final Assessment, I affirm that I will not give or receive any unauthorized help and that all work provided will be my own. I agree to abide by Seneca’s Academic Integrity Policy and I understand that any violation of academic integrity will be subject to the penalties outlined in the policy.
Instructions
Your task is to design and implement the components necessary to get the job done.
When needed, refer to the workshops for relevant instructions and ideas, or research methods in the SAS Enterprise Miner book or online.
Unlike the workshops, your task is NOT simply to paste results from Enterprise Miner. You must also EXPLAIN what the results indicate, why they are important or nontrivial, and how they can be used as a PA-based insight.
Please refer to the attached rubric for a clear understanding of the manner in which the Final Assessment will be graded.
Unlike the workshops, the instructions in the Final Assessment are minimal (this is an assessment!). This is NOT group work.
Deliverables
1) Start an MS Word document for your report. Add the following declaration to your file:
“I, ------------ (mention your name), declare that the attached assignment is my own work in accordance with the Seneca Academic Policy. I have not copied any part of this assignment, manually or electronically, from any other source including web sites, unless specified as references. I have not distributed my work to other students.”
2) Download the csv files from Blackboard.
3) Start a new SAS Enterprise Miner project named Final_yourname.
4) Your report should be no more than 10 pages in length and include the following:
a. The goal and summary of your work on the data set (transactions or classification or both!)
b. Your choice of the best performing models.
c. The findings (patterns, graphs, model evaluation, etc.) and an explanation of the business decision based on your data analysis.
d. After implementing, assessing, and selecting the finalized models, you may evaluate the models against test sets in the last stage of this project. In your documentation, show your understanding of the predictive methods, their strengths, and their limitations.
5) Prepare a narrated PowerPoint presentation and explain your findings in no more than 12 slides, with narration not exceeding 10 minutes. Please note the following:
a. The outlines and conclusions must be clear.
b. Present your findings using graphs and short-note format.
c. The narration should lead the audience to the conclusion.
6) SAS Enterprise Miner files (zipped). DO NOT submit the full SAS project. The zipped file should include outputs such as logs, graphs, and text files.
Datasets
In this assessment, you will be implementing, testing, and documenting Predictive Analytics models for one of three data sets:
- Dataset 1: A dataset of diamonds
o diamonds:
No: Number of the diamonds
Cut: quality of the cut
Color
Clarity: measurement of how clear the diamond is (I1 (worst), SI2, SI1, VS2, VS1, VVS2, VVS1, IF (best))
Depth: total depth percentage
Table: https://www.brilliance.com/education/diamonds/depth-table
Price
Dimensions: x,y,z
o New diamond:
No: Number of the diamonds
Cut: quality of the cut
Cut_ord
Color
Clarity
Clarity_ord
- Project Overview
A jewelry company wants to put in a bid to purchase a large set of diamonds (new-diamonds.csv) but is unsure how much it should bid. In this project, use the results from a predictive model to recommend how much the jewelry company should bid for the diamonds.
- Dataset 2: A dataset of Diabetes
o Test_diabetes:
Pregnancies
Glucose
Blood Pressure
Skin Thickness
Insulin
BMI
DiabetesPedigreeFunction
Age
Outcome
- Project Overview
The purpose of this dataset is to predict whether or not a person has diabetes. Validate the predictions with appropriate metrics. The target column is ‘Outcome’. Try to find out which factors could increase the risk of diabetes.
- Dataset 3: A Sample Superstore
o Sample-Superstore:
Order ID, Order Date, Ship Date, Ship Mode
Customer ID, Customer Name, Segment
Country, City, State, Postal Code, Region
Product ID, Category, Sub-Category, Product Name
Sale
Discount
Quantity
Profit
- Project Overview
The purpose of this sample dataset is to evaluate the annual profit of the superstore. Derive a meaningful review segmentation. Perform extensive data analysis to deliver insights on how the company can increase its profits while minimizing its losses. Try to find the weak areas where the company can work to make more profit.
Exploring data
Exploratory Data Analysis refers to the critical process of performing initial investigations on data to discover patterns, spot anomalies, test hypotheses, and check assumptions with the help of summary statistics and graphical representations. It is good practice to understand the data first and to gather as many insights as possible from it. To show the data:
- Summarize the evidence and identify interesting patterns while eliminating ideas that likely won’t pan out.
- Identify relationships or association between variables that are particularly interesting or unexpected.
- Check for problems with the collected data, such as outliers, missing data, or data-capture errors.
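The data-quality checks in the last point can also be sketched outside Enterprise Miner. The following is a minimal Python illustration, not a required deliverable: it counts missing values and flags outliers with the common 1.5 × IQR rule. The function name `summarize_column` and the sample glucose readings are hypothetical and are not taken from the provided csv files.

```python
import statistics

def summarize_column(values):
    """Count missing values and flag IQR-based outliers in one numeric column."""
    present = [v for v in values if v is not None]
    missing = len(values) - len(present)
    # Quartiles of the non-missing values (Python's default "exclusive" method).
    q1, _, q3 = statistics.quantiles(present, n=4)
    iqr = q3 - q1
    low, high = q1 - 1.5 * iqr, q3 + 1.5 * iqr
    outliers = [v for v in present if v < low or v > high]
    return {
        "n": len(values),
        "missing": missing,
        "mean": statistics.mean(present),
        "outliers": outliers,
    }

# Hypothetical readings for illustration; None marks a missing value.
glucose = [85, 90, 92, 88, None, 95, 91, 300, 89, 93]
summary = summarize_column(glucose)
print(summary["missing"], summary["outliers"])
```

A flagged value is only a candidate problem: whether it is a data-capture error or a genuine extreme case still needs to be judged and explained in the report.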
Descriptive Analysis
The following methods are used by retailers to increase sales by better understanding customer purchasing patterns. If it applies to your dataset, use at least one of the methods below.
o Market Basket Analysis
- Resource:
https://documentation.sas.com/?docsetId=emref&docsetTarget=n1igmnwpnqnzcgn1wjfi8qu5603p.htm&docsetVersion=15.1&locale=en
o Association Rules
- Resource:
- https://documentation.sas.com/?docsetId=emref&docsetTarget=n16x97j506upgin1l90wrfc1rg0l.htm&docsetVersion=15.1&locale=en
o Clustering
- Resource:
- https://documentation.sas.com/?docsetId=emref&docsetTarget=n1vjatb74dundbn12d2ecb09juak.htm&docsetVersion=15.1&locale=en
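To make the output of Market Basket Analysis and Association Rules easier to interpret, the sketch below computes the three standard measures for a single rule: support, confidence, and lift. It is a plain Python illustration under stated assumptions, not the Enterprise Miner implementation. The function `rule_metrics` and the five baskets are hypothetical; the item names only resemble Superstore sub-categories.

```python
def rule_metrics(transactions, antecedent, consequent):
    """Return (support, confidence, lift) for the rule antecedent -> consequent."""
    antecedent, consequent = set(antecedent), set(consequent)
    n = len(transactions)
    n_a = sum(1 for t in transactions if antecedent <= t)
    n_c = sum(1 for t in transactions if consequent <= t)
    n_both = sum(1 for t in transactions if (antecedent | consequent) <= t)
    support = n_both / n                         # share of baskets with both sides
    confidence = n_both / n_a if n_a else 0.0    # P(consequent | antecedent)
    lift = confidence / (n_c / n) if n_c else 0.0  # confidence relative to baseline
    return support, confidence, lift

# Hypothetical baskets, one set of items per order.
baskets = [
    {"Paper", "Binders"},
    {"Paper", "Binders", "Phones"},
    {"Paper", "Chairs"},
    {"Binders", "Phones"},
    {"Paper", "Binders", "Chairs"},
]
support, confidence, lift = rule_metrics(baskets, {"Paper"}, {"Binders"})
print(round(support, 2), round(confidence, 2), round(lift, 2))
```

A lift above 1 suggests the two sides occur together more often than they would if they were independent, while a lift below 1 suggests the opposite. This is the kind of interpretation the report should spell out for any rule it presents.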
Predictive models
- Partition the data into 80% training and 20% validation.
- Explain how using a validation set helps to avoid overfitting/underfitting.
- If necessary, apply feature engineering.
- Build at least two models using methods such as decision tree, logistic regression, linear regression, neural network, k-NN, and so on.
- Assess, analyze, and compare the performance of your models.
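The role of the validation set can be illustrated with a small Python sketch. It is only an illustration of the idea, not a substitute for the Data Partition and Model Comparison nodes in Enterprise Miner. The data are synthetic (an assumption made for the example), and the helper names `partition`, `knn_predict`, and `accuracy` are hypothetical. The sketch splits the data 80/20 and compares a very flexible model (k-NN with k = 1) against a smoother one (k = 15) on both partitions.

```python
import random

def partition(rows, train_fraction=0.8, seed=42):
    """Shuffle a copy of rows and split it into training and validation parts."""
    shuffled = rows[:]
    random.Random(seed).shuffle(shuffled)
    cut = round(len(shuffled) * train_fraction)
    return shuffled[:cut], shuffled[cut:]

def knn_predict(train, x, k):
    """Predict the majority label among the k training rows closest to x."""
    nearest = sorted(train, key=lambda row: abs(row[0] - x))[:k]
    votes = sum(label for _, label in nearest)
    return 1 if votes * 2 > k else 0

def accuracy(train, rows, k):
    """Share of rows whose label is predicted correctly from the training rows."""
    hits = sum(1 for x, y in rows if knn_predict(train, x, k) == y)
    return hits / len(rows)

# Synthetic data: the label is 1 when x > 50, with about 15% of labels flipped as noise.
rng = random.Random(0)
data = []
for _ in range(200):
    x = rng.uniform(0, 100)
    y = 1 if x > 50 else 0
    if rng.random() < 0.15:
        y = 1 - y
    data.append((x, y))

train, valid = partition(data)
for k in (1, 15):
    print(k, accuracy(train, train, k), accuracy(train, valid, k))
```

With k = 1 every training row is its own nearest neighbour, so training accuracy is perfect by construction and says nothing about how well the model generalizes. A large gap between training and validation accuracy is a sign of overfitting, while low accuracy on both partitions points to underfitting. Comparing models on the validation partition is what makes the choice between them fair.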
2021-12-14