For this coursework, you have to compile a brief report (maximum 1500 words including graphs, tables and references). All data analysis should be undertaken in EXCEL and should make use of methods taught on the Quantitative Methods for Finance and Business (832L1) module.

The EXCEL datasets for this project are available in the spreadsheet file ProjectData.xlsx, which is on the Canvas site for this module.  

Below is a description of each dataset within the above spreadsheet.  

1. Walmart Sales

Walmart is one of the leading retail chain stores in the U.S. This dataset contains weekly sales data for 45 Walmart stores covering the period 05/02/2010 to 26/10/2012, together with a selection of other variables. The names and descriptions of variables in the dataset are as follows:

(i) Store:  the store number

(ii) Date: the week of the sales

(iii) Weekly_Sales: the value of weekly sales for the given store, in USD.

(iv) Holiday_Flag: = 1 if the week is a special holiday week; = 0 if it is a non-holiday week.  

The following holiday events took place.

Super Bowl: 12-Feb-10, 11-Feb-11, 10-Feb-12, 8-Feb-13

Labor Day: 10-Sep-10, 9-Sep-11, 7-Sep-12, 6-Sep-13

Thanksgiving: 26-Nov-10, 25-Nov-11, 23-Nov-12, 29-Nov-13

Christmas: 31-Dec-10, 30-Dec-11, 28-Dec-12, 27-Dec-13

(v) Temperature: the average temperature for the week, in degrees Fahrenheit  

(vi) Fuel_Price: the weekly cost of fuel per gallon in the region

(vii) CPI: the prevailing consumer price index  

(viii) Unemployment: the prevailing unemployment rate in the region

2. Insurance Costs

Private health insurance is a big industry in the United States. The dataset below provides data on the medical costs (in USD) incurred by a large health insurance company (CPD Health Insurance) in covering the health costs of 1338 of its beneficiaries.  

The names and descriptions of variables in the dataset are as follows:

(i) Age: age of beneficiary in years

(ii) Sex: gender of beneficiary (= 1 if male; = 0 if female)

(iii) BMI: Body mass index of the beneficiary

(iv) Children: The number of children the beneficiary has

(v) Smoker: = 1 if the beneficiary is a smoker; 0 if the beneficiary does not smoke

(vi) Region: the beneficiary's residential area in the U.S. (northeast, southeast, southwest, northwest)

(vii) Charges: Medical costs covered by the company CPD Health Insurance (in USD)

3. Customer Spending

Businesses can better understand their customers and modify their products according to the specific needs and behaviours of different types of customers. The dataset below provides data on the characteristics and spending of 2240 customers.

The names and descriptions of variables in the dataset are as follows:

(i) ID: customer’s unique identifier

(ii) Year_Birth: Customer's birth year

(iii) Marital_Status: = 0 if single or divorced; = 1 if married or cohabiting

(iv) Income: customer’s yearly household income in USD

(v) Kidhome: number of children in customer's household

(vi) Teenhome: number of teenagers in customer's household

(vii) Recency: number of days since customer's last purchase

(viii) MntWines: amount spent on wine

(ix) MntFruits: amount spent on fruits

(x) MntMeatProducts: amount spent on meat products

(xi) MntGoldproducts: amount spent on gold products

(xii) NumWebPurchases: number of purchases from the website

(xiii) NumStorePurchases: number of in-store purchases


Assume you are working for a company and are asked to provide a short report summarising one of the following: the sales performance of Walmart, the medical costs incurred by CPD Health Insurance, or customer spending.

***Each student has been assigned a dataset. You are required to write the report using only the dataset you have been assigned. Using a dataset that has not been assigned to you will result in a mark of 0.***

The recipient wants a report that is concise, clear and covers the main issues accurately. The project should be treated as much as an exercise in report writing as an exercise in statistical analysis. The ability to distil the information into plain and simple language, accessible to a layman with limited knowledge of statistical techniques, will be rewarded. You should pay close attention to structure and to ensuring that the main results from the exercise are clearly reported and explained. A report that neglects the issues and applies the statistical techniques in a mechanical manner will be penalised. You therefore need to think clearly about the issues that you are examining and outline your hypotheses and empirical findings carefully. It is up to you to decide what the data tell you, the specific research questions you wish to address, and what the report should contain.

Suggested structure of the report

You do not have to adhere to the following structure precisely, but it provides a guideline you may wish to adopt. You may instead use an alternative structure better suited to your analysis.

(i) Discussion of the research question under investigation

(ii) Descriptive analysis of the data  

(iii) Hypothesis tests, interpretation of results and a discussion of the implications of findings.

(iv) Estimation of a suitable model using regression analysis and testing of relevant propositions and hypotheses.

(v) Conclusions and limitations (if any). 

Collaboration and group work are NOT permitted on this project. 

Submission details can be found on the Canvas site for this module. This assignment is submitted through Turnitin.

There are penalties for late submission, which are set out in the Handbook for Candidates.

Please note:

• Skills Hub provides resources about writing skills. Please visit the following link: http://www.sussex.ac.uk/skillshub/

• It is your responsibility to ensure you submit the correct work for your assignment: please check carefully that any files are your final work before you submit them.

• In making a submission, you are declaring that your work contains no examples of academic misconduct, such as plagiarism, collusion or fabrication of results. Please visit the following link for University resources:


• You can submit and resubmit work up to your deadline.

• After your deadline has passed you will have a further 7 days to submit work, provided you have not already made a submission. This late submission will normally incur a penalty. You will not be able to withdraw and resubmit work during this period, even if you have been granted a penalty waiver.