
BU.450.740 Retail Analytics Homework 5

Published: 2023-02-27



Instructions

•   The assignment is due at the beginning of session 6.

•   Submission by email or hardcopy is not accepted. Please upload your answers on Canvas.

•   The format can be either MS Word or PDF. If you have trouble converting HTML to Word or PDF, please copy and paste scanned images into a Word or PDF document.

•   For R coding questions, please include your reasoning and explanations for each question. Please do not submit only the R code and its output. As in homework 4, please use rmarkdown or a similar package to produce the report for item 1.

•   You can collaborate and discuss with your colleagues within or outside your assigned group. However, you will be submitting your own write-up.

•   Late submissions will receive a 20% deduction per day.

Item 1: Analyzing consumers’ purchase decisions using tree-based methods in R

In this exercise, you will use the data from session 5’s in-class exercise: “dec_data.csv”.

Questions: 20 points in total

1.   Generate training and testing data [2 points]

a.   Please construct a training data set and a testing data set by using 80% and 20% of the master data set (“dec_data.csv”), respectively. To ensure the replicability of the exercise, please set the random seed to 1.
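As an illustration of the split described above, here is a minimal sketch in base R. It uses a small made-up data frame rather than “dec_data.csv”, and the column “Price” and the object names (master, train_idx, train, test) are hypothetical. Sampling row indices is only one of several reasonable ways to do an 80/20 split.

```r
# Toy stand-in for the master data. With the real file one would start from
# master <- read.csv("dec_data.csv") instead. "Price" is a hypothetical column.
master <- data.frame(
  Buy   = factor(rep(c("No", "Yes"), times = 50)),
  Price = seq(1, 10, length.out = 100)
)

set.seed(1)                                  # fix the seed so the split is reproducible
n         <- nrow(master)
train_idx <- sample(seq_len(n), size = floor(0.8 * n))  # 80% of the row indices

train <- master[train_idx, ]                 # 80% of the rows
test  <- master[-train_idx, ]                # the remaining 20%

print(c(train = nrow(train), test = nrow(test)))
```

Because the seed is set immediately before sample(), rerunning the block gives the same partition each time.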

2.   Fit a classification tree to predict “Buy” [3 points each]

a.   Fit the classification tree to the training data using “rpart” as we did in session 5’s in-class exercise. Again, to ensure the replicability of the exercise, please set the random seed to 1.

b.   Plot the fitted tree using “prp” and discuss the findings from the tree.

c.   Construct the confusion matrix using the fitted outcome and the testing data. Calculate the accuracy, precision, and recall.
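For reference, the three metrics can be read off a 2x2 confusion matrix as in the sketch below. The label vectors are invented for illustration, and treating “Yes” as the positive class is an assumption. With real output, predicted would come from applying the fitted model to the testing data.

```r
# Hypothetical actual and predicted labels for ten test observations.
lv        <- c("No", "Yes")
actual    <- factor(c("Yes", "Yes", "Yes", "Yes", "No", "No", "No", "No", "No", "No"), levels = lv)
predicted <- factor(c("Yes", "Yes", "Yes", "No", "Yes", "No", "No", "No", "No", "No"), levels = lv)

# Rows are predictions, columns are the true labels.
cm <- table(Predicted = predicted, Actual = actual)
print(cm)

tp <- cm["Yes", "Yes"]   # predicted Yes, actually Yes
fp <- cm["Yes", "No"]    # predicted Yes, actually No
fn <- cm["No",  "Yes"]   # predicted No,  actually Yes
tn <- cm["No",  "No"]    # predicted No,  actually No

accuracy  <- (tp + tn) / sum(cm)   # share of all predictions that are correct
precision <- tp / (tp + fp)        # share of predicted Yes that are truly Yes
recall    <- tp / (tp + fn)        # share of true Yes that were predicted Yes

print(c(accuracy = accuracy, precision = precision, recall = recall))
# accuracy = 0.80, precision = 0.75, recall = 0.75
```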

3.   Fit a random forest to predict “Buy” [3 points each]

a.   Fit a random forest to the training data using “randomForest” as we did in session 5’s in-class exercise. To ensure the replicability of the exercise, please set the random seed to 1.

b.   Construct the confusion matrix using the fitted outcome and the testing data. Calculate the accuracy, precision, and recall.

c.   Would you be able to improve the performance by choosing different parameters for “mtry” (the number of variables randomly sampled as candidates at each split) and “ntree” (the number of trees grown)? Explore and discuss what you find.
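One possible way to organize such an exploration is a small grid over the two parameters, sketched below. It assumes the randomForest package is installed, uses the built-in iris data as a stand-in for the course data, and compares settings by out-of-bag (OOB) error. The grid values are arbitrary, and comparing on the testing data would be an equally reasonable choice.

```r
library(randomForest)

# Candidate settings; iris has 4 predictors, so mtry can be at most 4.
grid <- expand.grid(mtry = c(1, 2, 4), ntree = c(100, 500))
grid$oob_error <- NA_real_

for (i in seq_len(nrow(grid))) {
  set.seed(1)  # same seed for every fit so runs are comparable
  fit <- randomForest(Species ~ ., data = iris,
                      mtry = grid$mtry[i], ntree = grid$ntree[i])
  # err.rate has one row per tree; the last row is the final OOB error.
  grid$oob_error[i] <- fit$err.rate[grid$ntree[i], "OOB"]
}

# Settings sorted from lowest to highest OOB error.
print(grid[order(grid$oob_error), ])
```

Differences between settings can be small, so it is worth checking whether a gain persists under a different seed before drawing conclusions.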

Item 2: Planning a Merchandising Strategy in ArcGIS.

In item 2, you will explore demographics and competition in Chicago. To answer the questions below, you will replicate what we covered in session 5. While reviewing session 5, you may find the steps covered in Chapter 4, pp. 94-106, of the reference book (Miller 2007, “GIS Tutorial for Marketing”) helpful.

Questions: 20 points in total

1.   Briefly compare the distribution of families, income levels, and home values in the neighborhoods of the two Meiers stores. [5 points]

2.   Analyze the location of these two Meiers Home Furnishings stores in terms of:

a.   Shopping centers

b.   Competing furniture retailers

c.   Major roads

Note: When answering this question, please create three separate maps:

1. A map showing the influence from nearby shopping centers [Partially done in session 5; see earlier slide]

2. A map showing the influence from competing furniture stores

3. A map showing the impact from the type of roads

[10 points]

3.   What are your overall recommendations for the poorly performing “Pulaski” store? Discuss briefly. [5 points]