COMM5007 Coding for Business (T1 2023)

Code-based Solution / Capstone Project

1. Overview

Marketing in the Fast-Moving Consumer Goods (FMCG) sector is crucial to the success of companies operating in this field. FMCG products are those that sell quickly and at relatively low cost, such as food, beverages, toiletries, and cleaning products. Companies in this sector must market efficiently to remain competitive in an ever-changing market. Supermarkets like Coles and Woolworths need to understand trends in the FMCG sector and consistently offer products that appeal to customers, whether to satisfy their customers' changing needs and preferences, to keep up with increasing competition in the market, or to capitalise on new opportunities.

Your company, WSNU, has had a data partnership for the past few years with a large brand that provides transactional and customer data (see purchaseBehaviour.csv and transactions.csv). You need to present a data-supported strategic recommendation to your client so that the management team can make a decision. The client is particularly interested in customer segments and their chip purchasing behaviour.

2. Key Dates

Assignment Due Milestone 1: Week 7, Monday, 27th March 2023, 3:00 pm (Sydney Time). Submit both the written report and Python code via Moodle.

Assignment Due Milestone 2: Week 11, Monday, 24th April 2023, 3:00 pm (Sydney Time). Submit the written report, Python code and video pitch via Moodle.

3. Milestone 1: Data Preparation and Visualisation (20 marks)

3.1 Data Preparation (7 marks)

This task requires you to analyse your client's transaction data, perform data cleansing, and create visualisations. The steps to be followed are:

1.  Inspect the transaction data for null or missing values and handle them using one of the methods covered in class (1 mark).

2.  Identify and correct the erroneous column in the transaction data by converting it to the appropriate data type (1 mark).

3.  Verify the correctness of all the products by identifying and categorising items across all tables in the transaction data (1 mark).

4.  Check for outliers in the transaction data using the describe() function and handle them appropriately, explaining to your team why they exist (1 mark).

5.  Prepare the data for plotting: This may involve converting columns into the desired format, creating new columns, and transforming the data into a suitable format for plotting. Your manager wants:

•   A new column called "Package_SIZE", for example 380g (1 mark).

•   A "Brand_Name" column that displays the first word of the "PROD_NAME" column. For example, "Smiths Chip Thinly S/Cream&Onion 175g" will be displayed as "Smiths" (1 mark).

•   To combine similar brand names, such as "RED" and "RRD", which are both Red Rock Deli chips, into one brand name (1 mark).

Remark: Each sub-task in Section 3.1 is worth 1 mark. A sub-task that works as expected earns 1 mark; otherwise it earns 0.
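The preparation steps above can be illustrated with a minimal pandas sketch on toy data. Only PROD_NAME, Package_SIZE and Brand_Name are names taken from this brief; the TOT_SALES column name, the sample rows, and the choice of "RRD" as the combined brand label are assumptions made for illustration, and the real files may need different handling.

```python
import pandas as pd

# Toy stand-in for the transaction table. PROD_NAME comes from the brief;
# TOT_SALES and every value below are made up for illustration.
transactions = pd.DataFrame({
    "PROD_NAME": [
        "Smiths Chip Thinly S/Cream&Onion 175g",
        "RRD Sweet Chilli 150g",
        "RED Rock Deli Sea Salt 165G",
        "Doritos Corn Chips Nacho Cheese 380g",
    ],
    "TOT_SALES": [3.0, 2.7, None, 650.0],
})

# Step 1: count missing values per column before deciding how to handle them.
missing_counts = transactions.isna().sum()

# Step 4: summary statistics; a max far above the upper quartile hints at an outlier.
summary = transactions.describe()

# Step 5: package size = a run of digits immediately followed by "g" or "G",
# normalised to lower case (e.g. "380g").
transactions["Package_SIZE"] = (
    transactions["PROD_NAME"].str.extract(r"(\d+[gG])", expand=False).str.lower()
)

# Step 5: brand name = first whitespace-separated word of the product name.
transactions["Brand_Name"] = transactions["PROD_NAME"].str.split().str[0]

# Step 5: merge spelling variants of the same brand into one label.
# Using "RRD" as the combined label is an assumption, not a requirement.
transactions["Brand_Name"] = transactions["Brand_Name"].replace({"RED": "RRD"})

print(missing_counts)
print(summary)
print(transactions[["PROD_NAME", "Package_SIZE", "Brand_Name"]])
```

A mapping dictionary keeps the brand clean-up explicit, so further variants can be added in one place once they are spotted in the data.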

3.2 Data Visualisation (10 marks)

After preparing the data, you can use any plotting functions of your choice, such as scatter(), hist(), etc., to visualise the data. You only need to create FIVE different plots.

You will be awarded marks based on the quality of your analysis and plots (you may use the same plot type if desired). Consider plotting the following questions:

1.  The relationship between the total sales column and the date column to understand the sales trend over time.

2.  Daily transactions during December, as it is the most important month of the year.

3.  Transactions by brand name, as obtained from Step 3.1.5.

4.  Transactions by package size, as obtained from Step 3.1.5.

5.  A pie chart of the premium customer column in the behaviour data frame.

6.  The distribution of budget, mainstream, and premium customers in relation to their life stage.

7.  Any other interesting relationships that you may discover, for example:

a.  Which are the top brands?

b.  Which are the most popular products?

c.  Etc.
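As one possible starting point for the first two questions above, the sketch below aggregates sales per day and plots the December subset. The column names DATE and TOT_SALES and all values are hypothetical placeholders; substitute the names used in your cleaned data frame.

```python
import matplotlib.pyplot as plt
import pandas as pd

# Toy data; DATE and TOT_SALES are assumed column names, values are made up.
transactions = pd.DataFrame({
    "DATE": pd.to_datetime(
        ["2018-12-01", "2018-12-01", "2018-12-02", "2018-12-24", "2019-01-05"]
    ),
    "TOT_SALES": [3.0, 6.0, 2.5, 9.0, 4.0],
})

# Total sales per calendar day (the basis for a sales-over-time plot).
daily_sales = transactions.groupby("DATE")["TOT_SALES"].sum()

# Keep only the days that fall in December.
december = daily_sales[daily_sales.index.month == 12]

fig, ax = plt.subplots(figsize=(8, 4))
ax.plot(december.index, december.values, marker="o")
ax.set_title("Daily total sales in December (toy data)")
ax.set_xlabel("Date")
ax.set_ylabel("Total sales")
fig.autofmt_xdate()  # tilt the date labels so they do not overlap
```

In a Jupyter Notebook the figure is displayed when the cell finishes; in a plain script, call plt.show() afterwards.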

Remark: Each visualisation in Section 3.2 is worth 2 marks. The breakdown of the 2 marks is as follows:

•   Clarity of the plot: 1 mark;

•   Title: 0.5 marks;

•   Other elements such as axis labels, legends, etc., depending on the type of plot: 0.5 marks.
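To show how the marked elements (title, axis labels, legend) fit together, here is a small sketch of a stacked bar chart of customer segment by life stage. The column names LIFESTAGE and PREMIUM_CUSTOMER and the category values are assumptions for illustration only.

```python
import matplotlib.pyplot as plt
import pandas as pd

# Toy behaviour data; column names and categories are assumed, not taken from the files.
behaviour = pd.DataFrame({
    "LIFESTAGE": [
        "YOUNG SINGLES/COUPLES", "YOUNG SINGLES/COUPLES",
        "RETIREES", "RETIREES", "OLDER FAMILIES",
    ],
    "PREMIUM_CUSTOMER": ["Mainstream", "Budget", "Premium", "Mainstream", "Budget"],
})

# Rows = life stage, columns = customer segment, cells = number of customers.
counts = pd.crosstab(behaviour["LIFESTAGE"], behaviour["PREMIUM_CUSTOMER"])

ax = counts.plot(kind="bar", stacked=True, figsize=(8, 4))
ax.set_title("Customer segments by life stage (toy data)")
ax.set_xlabel("Life stage")
ax.set_ylabel("Number of customers")
ax.legend(title="Customer segment")
plt.tight_layout()  # prevent long category labels from being cut off
```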

3.3 Mini Report (3 marks)

You also need to write a mini report to explain and highlight what you have done in Milestone 1. The word limit for the mini report is 700 words, excluding the UNSW coversheet, table of contents, references (using the UNSW Harvard Referencing standard), and Python code.

The following sections are suggested in the report:

1. Introduction (Data quality, Data cleansing, etc.) (approx. 200 words).

2. Data plots, key highlights and your observations (approx. 450 words).

3. Issue(s)/limitation(s) in the dataset (approx. 100 words).

3.4 Milestone 1 Submission

Please submit the following two files through Turnitin on Moodle.

1.  Jupyter Notebook: The Jupyter Notebook, named zID_Milestone1.ipynb (e.g., z1234567_Milestone1.ipynb), should contain all your Python code from Sections 3.1 and 3.2. Please make sure that all Python code runs without errors/bugs.

2.  Mini Report: The mini report should be named zID_Milestone1.pdf (e.g., z1234567_Milestone1.pdf). Submit your report with a coversheet signed by all group members (typed signatures are allowed because of COVID). Failure to include the UNSW coversheet with signatures will lead to a 5% penalty on the awarded marks, and no marks will be released until the coversheet is received.

4. Milestone 2: Data Modelling, Analytics, and Reporting (20 marks)

4.1 Data Modelling and Machine Learning (5 marks)

Now that the data is ready for analysis, you want to explore the relationships between some of the driving factors using machine learning models. Once again, this is an open-ended part, and you can explore different themes. As a general reference, you need to do the following:

•   merge the transaction dataframe and the purchase behaviour dataframe on the loyalty card number;

•   split the data into a training set and a testing set to build your model;

•   use linear regression or logistic regression to assess and predict values (we have provided some advanced examples in the Week 8 Ed lesson, and you are free to use them); and

•   use a confusion matrix, statsmodels.api and/or an ROC curve to assess your model.
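The four steps above could be sketched as follows, using pandas and scikit-learn on synthetic data. Everything specific here is an assumption for illustration: the column names (LYLTY_CARD_NBR, PROD_QTY, TOT_SALES, PREMIUM_CUSTOMER), the randomly generated values, and the made-up binary target HIGH_SPEND. It is not the expected solution, and the Week 8 Ed lesson examples may take a different approach.

```python
import numpy as np
import pandas as pd
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import confusion_matrix, roc_auc_score, roc_curve
from sklearn.model_selection import train_test_split

# Synthetic stand-ins for the two tables (all names and values are assumed).
rng = np.random.default_rng(0)
n_customers, n_tx = 50, 400
behaviour = pd.DataFrame({
    "LYLTY_CARD_NBR": np.arange(n_customers),
    "PREMIUM_CUSTOMER": rng.choice(["Budget", "Mainstream", "Premium"], size=n_customers),
})
transactions = pd.DataFrame({
    "LYLTY_CARD_NBR": rng.integers(0, n_customers, size=n_tx),
    "PROD_QTY": rng.integers(1, 6, size=n_tx),
})
transactions["TOT_SALES"] = transactions["PROD_QTY"] * 3.0 + rng.normal(0, 1.5, size=n_tx)

# 1. Merge on the loyalty card number; a left join keeps every transaction.
merged = transactions.merge(behaviour, on="LYLTY_CARD_NBR", how="left")

# Hypothetical binary target: is this transaction above the median spend?
merged["HIGH_SPEND"] = (merged["TOT_SALES"] > merged["TOT_SALES"].median()).astype(int)
X = pd.get_dummies(
    merged[["PROD_QTY", "PREMIUM_CUSTOMER"]],
    columns=["PREMIUM_CUSTOMER"], drop_first=True, dtype=float,
)
y = merged["HIGH_SPEND"]

# 2. Hold out 25% of the rows for testing, keeping the class balance.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0, stratify=y
)

# 3. Fit a logistic regression on the training set only.
model = LogisticRegression(max_iter=1000).fit(X_train, y_train)

# 4. Assess on the held-out set with a confusion matrix and the ROC curve.
cm = confusion_matrix(y_test, model.predict(X_test))
proba = model.predict_proba(X_test)[:, 1]
fpr, tpr, thresholds = roc_curve(y_test, proba)
auc = roc_auc_score(y_test, proba)

print(cm)
print(f"AUC: {auc:.3f}")
```

Fitting on the training rows and scoring on the held-out rows is what makes the confusion matrix and ROC curve an estimate of performance on unseen data.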

We might want to target the customer segments that contribute the most to sales, in order to retain them or further increase sales (for example, Mainstream - young singles/couples). You can develop a simple research question such as:

•   Is life stage a contributing factor to the sales? What kind of relationship do they have?

•   Is price a contributing factor to the sales of a brand? What kind of relationship do they have?
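A question such as the price one above could be explored with a simple linear regression. The sketch below uses scikit-learn on toy numbers constructed so that UNITS_SOLD = 80 - 10 * UNIT_PRICE exactly; both column names and all values are hypothetical, and real data will be much noisier.

```python
import pandas as pd
from sklearn.linear_model import LinearRegression

# Toy data built to be perfectly linear (assumed names, made-up values).
sales = pd.DataFrame({
    "UNIT_PRICE": [2.0, 2.5, 3.0, 3.5, 4.0],
    "UNITS_SOLD": [60, 55, 50, 45, 40],
})

X = sales[["UNIT_PRICE"]]   # 2-D feature matrix, as scikit-learn expects
y = sales["UNITS_SOLD"]

model = LinearRegression().fit(X, y)

slope = model.coef_[0]          # change in units sold per unit increase in price
intercept = model.intercept_    # predicted units sold at a price of zero
r_squared = model.score(X, y)   # share of variance explained by price

print(f"slope={slope:.2f}, intercept={intercept:.2f}, R^2={r_squared:.2f}")
```

The sign of the slope indicates the direction of the relationship and R^2 indicates how much of the variation price explains. If p-values for the coefficients are needed, an OLS fit in statsmodels.api reports them in its summary.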

Some challenging questions are:

•   The customer's total spend over the period and total spend for each transaction, to understand what proportion of their grocery spend is on chips.

•   Proportion of customers in each customer segment overall to compare against the mix of customers who purchase chips.
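For the segment-mix comparison above, one possible approach is to compute the share of each segment among all customers and among customers who appear in the transaction table. The sketch assumes that every row of the transaction table is a chip purchase, and the column names and values are hypothetical.

```python
import pandas as pd

# Toy tables (assumed names, made-up values).
behaviour = pd.DataFrame({
    "LYLTY_CARD_NBR": [1, 2, 3, 4, 5, 6, 7, 8],
    "PREMIUM_CUSTOMER": [
        "Budget", "Budget", "Budget", "Budget",
        "Mainstream", "Mainstream", "Premium", "Premium",
    ],
})
transactions = pd.DataFrame({
    "LYLTY_CARD_NBR": [1, 5, 5, 6, 7],
    "TOT_SALES": [3.0, 6.0, 2.5, 4.0, 5.0],
})

# Share of each segment among all customers.
overall_mix = behaviour["PREMIUM_CUSTOMER"].value_counts(normalize=True)

# Customers with at least one transaction, each counted once.
buyers = behaviour[behaviour["LYLTY_CARD_NBR"].isin(transactions["LYLTY_CARD_NBR"])]
buyer_mix = buyers["PREMIUM_CUSTOMER"].value_counts(normalize=True)

# Side-by-side comparison; a positive difference means the segment is
# over-represented among chip buyers relative to the customer base.
comparison = pd.DataFrame({"overall": overall_mix, "chip_buyers": buyer_mix}).fillna(0.0)
comparison["difference"] = comparison["chip_buyers"] - comparison["overall"]

print(comparison)
```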

4.2 Report for Milestone 2 (13 marks)

You also need to write a report to explain and highlight what you have done in Milestone 2. The word limit for the report is 1500 words, excluding the UNSW coversheet, table of contents, references (using the UNSW Harvard Referencing standard), and Python code.

The following sections are suggested in the report:

1. Introduction (Background of the project, Motivation of the project, etc.) (approx. 200 words).

2. Problem definition and analysis (Research question, Driving factors, Data modelling, Assumptions about the data modelling, Data analysis, etc.) (approx. 800 words).

3. Conclusions and Discussion (Recommendations, Management insights, Limitations of your data modelling, Limitations of the data set, etc.) (approx. 350 words).

4.3 Video Pitch for Milestone 2 (2 marks)

You also need to make a video pitch (max. two minutes, in .MP4 format) to briefly introduce your research question, data modelling and your solution(s). Your presentation may contain at most 3 slides, excluding the cover page/opening slide (which shows your name and the title of your project) and the reference list slide.

4.4 Milestone 2 Submission

Please submit the following three files through Turnitin on Moodle.

1.  Jupyter Notebook: The Jupyter Notebook, named zID_Milestone2.ipynb (e.g., z1234567_Milestone2.ipynb), should contain all your Python code from Section 4.1. Please make sure that all Python code runs without errors/bugs.

2.  Report: The report should be named zID_Milestone2.pdf (e.g., z1234567_Milestone2.pdf). Submit your report with a coversheet signed by all group members (typed signatures are allowed because of COVID). Failure to include the UNSW coversheet with signatures will lead to a 5% penalty on the awarded marks, and no marks will be released until the coversheet is received.

3.  Video Pitch: The video should be named zID_Milestone2.mp4 (e.g., z1234567_Milestone2.mp4). Remark: If you submit the video in another format, such as .mov, you will lose all 2 marks.

5. General Rules

Proper Academic Conduct

All assignments need to follow UNSW’s guidelines regarding proper academic conduct. The submission of materials that are non-original or have been submitted elsewhere will be considered plagiarism. Plagiarism is unacceptable. All instances of plagiarism or other academic misconduct will be pursued. Plagiarism may lead to you failing this course and may have negative consequences for your studies at UNSW. The general UNSW guideline on academic conduct is available online.

Assignment Submission

Assignments are to be submitted via Moodle on, or preferably before, the due date. Late submissions are undesirable: they disrupt the course timeline, are a sign of poor time management, and will lead to reduced marks. The late submission of an assignment carries a penalty of 5% of the awarded marks for that assignment per day of lateness (including weekends and holidays). For example, a mark of 70 would be reduced by 3.5 marks per day of lateness.
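The late-penalty arithmetic can be written out as a short function. This is only a sketch of the rule as worded above: it assumes the penalty is linear (5% of the originally awarded mark for each day, not compounding) and that the result cannot drop below zero; neither detail is stated explicitly in this brief.

```python
def late_penalised_mark(awarded_mark: float, days_late: int) -> float:
    """Mark after the late penalty: 5% of the awarded mark is deducted per day late."""
    penalty_per_day = 0.05 * awarded_mark  # e.g. 3.5 marks per day for a mark of 70
    # Assumption: the final mark is floored at zero.
    return max(0.0, awarded_mark - penalty_per_day * days_late)


# The example from the brief: a mark of 70 loses 3.5 marks per day.
print(late_penalised_mark(70, 1))
print(late_penalised_mark(70, 2))
```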

An extension of time to complete an assignment may be granted by submitting a Special Consideration in the case of illness or misadventure. Even if an extension is granted, parts of the marks that are dependent on a timely submission and timely progression of the course cannot be achieved at all. The general UNSW guidelines for special considerations are available online.