
STAT603: Forecasting

Assignment 1

The purpose of this assignment is to assess your analytical and computing skills on the material covered.

Total Possible Marks: 60 marks, which contribute 30% towards your final grade in this paper.

Deadline: 11:59pm, Monday, September 19, 2022

Submission: The assignment must be submitted as a soft copy in a single .pdf file on Canvas. Your filename must include 1) your last name, 2) your first name, and 3) your student ID, e.g., if John White submits his assignment, his .pdf file must be named "White_John_123456789".

Report/Assignment: Your assignment must be self-contained, i.e., you need to embed your R code in your answers. See example in the box below:

oildata <- window(oil, start=1996)

autoplot(oildata) +

    ylab("Oil (millions of tonnes)") + xlab("Year")

Page Limit: Maximum number of pages is 15 including graphs and R code.

Data for Questions 3-4:

• Quarterly total beer available for consumption (million litres) in New Zealand from Quarter 1, 2010 to Quarter 4, 2019

(Filename: NZ_TotalBeer_Quarterly.xlsx)

• Quarterly average nation-wide temperature (degrees Celsius) in New Zealand from Quarter 1, 2010 to Quarter 4, 2019

(Filename: NZ_AvgTemp_Quarterly.xlsx)

• Quarterly real national disposable income (Billion NZ dollars) in New Zealand from Quarter 1, 2010 to Quarter 3, 2019

(Filename: NZ_DispIncome_Quarterly.xlsx)

Note: All data should be converted into time series objects using the tsibble() function in R.
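
As an illustration of the conversion described in the note, the following is a minimal sketch rather than a prescribed method. The data frame `raw`, its column names `Quarter` and `Beer`, and the numeric values are placeholders invented for this example; they are not taken from the assignment files. The sketch uses as_tsibble() from the tsibble package, which is one common way to build a tsibble from an existing data frame.

```r
library(dplyr)
library(tsibble)

# Hypothetical stand-in for a spreadsheet that has already been read into R.
# Column names and values are dummy placeholders, not the real data.
raw <- tibble(
  Quarter = c("2010 Q1", "2010 Q2", "2010 Q3", "2010 Q4"),
  Beer    = c(1, 2, 3, 4)
)

# Convert the text labels to a quarterly index, then declare that index
# so the result is a tsibble with a quarterly interval.
beer_ts <- raw %>%
  mutate(Quarter = yearquarter(Quarter)) %>%
  as_tsibble(index = Quarter)

print(beer_ts)
```

The same pattern should carry over to the temperature and income files, assuming each contains one row per quarter.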

R: All computing tasks must be done using R or RStudio.

Plagiarism: If plagiarism is detected in your assignment, your case will be referred to the appropriate university office.

Tasks/Questions:

1. Use the dataset household wealth (hh_budget) to perform the following tasks or answer the questions. (12 marks)

(a) Create a training set by withholding the last four years as a test set. (2 marks)

(b) Fit all the appropriate benchmark methods to the training set and forecast the periods covered by the test set. (4 marks)

(c) Compute the accuracy of your forecasts. Which method does best? (3 marks)

(d) Do the residuals from the best method resemble white noise? (3 marks)

2. The tourism dataset contains quarterly visitor nights (in thousands) from 1998 to 2017 for 76 regions of Australia. (12 marks)

(a) Extract data from the Gold Coast region using filter() and aggregate total overnight trips (sum over Purpose) using summarise(). Call this new dataset gc_tourism. (3 marks)

(b) Using slice() or filter(), create three training sets for this data excluding the last 1, 2 and 3 years. For example, gc_train_1 <- gc_tourism %>% slice(1:(n()-4)). (3 marks)

(c) Compute one year of forecasts for each training set using the seasonal naive (SNAIVE()) method. Call these gc_fc_1, gc_fc_2, and gc_fc_3, respectively. (3 marks)

(d) Use accuracy() to compare the test set forecast accuracy using MAPE. Comment on these. (3 marks)

3. Use the New Zealand quarterly total beer available for consumption data described above. (20 marks)

(a) Plot the series and discuss the main features of the data. (3 marks)

(b) Discuss whether a transformation is needed. If yes, do so and describe the effect. (3 marks)

(c) Find and discuss whether the autocorrelation exists in this time series. (3 marks)

(d) Compute two years of forecasts (i.e. holding the last two years of data out as the test set) using the four methods: (1) mean, (2) naive, (3) seasonal naive, and (4) drift. Plot the series and the forecasts, and discuss the results. (8 marks)

(e) Compare the root mean squared error (RMSE) of forecasts from the four methods in (d). Which method do you think is best for this time series? (3 marks)

4. Time series regression models (16 marks)

(a) Fit a regression model to the quarterly total beer available for consumption data with a linear trend and seasonal dummies. Discuss the results. (3 marks)

(b) Plot the quarterly total beer available for consumption data with the quarterly average nation-wide temperature and real national disposable income data. Perform the correlation analysis and discuss the results. (3 marks)

(c) Fit a regression model to the quarterly total beer available for consumption data with the quarterly average nation-wide temperature and real national disposable income data as the explanatory variables. Discuss the results. (3 marks)

(d) Do we need to include the linear trend and seasonal dummies in the regression model in (c)? Perform a relevant analysis and discuss the results. (4 marks)

(e) Compute two years of forecasts for the regression models in (a) and (c). Evaluate the forecast accuracy and compare with those in Question 3 parts (d)-(e). (3 marks)
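
The general workflow that several of the questions rely on (hold out a test set, fit benchmark methods, forecast, measure accuracy) can be sketched as follows. This is an illustrative sketch only, run on a simulated quarterly series: the object names `toy`, `train`, `fit`, `fc`, and `acc`, the column `y`, and the simulated values are all assumptions made for this example and have nothing to do with the assignment data. It assumes the dplyr, tsibble, and fable packages are installed.

```r
library(dplyr)
library(tsibble)
library(fable)

# Simulated quarterly series (40 quarters) with a trend and a seasonal
# pattern. The values are synthetic and for illustration only.
set.seed(1)
toy <- tibble(
  Quarter = yearquarter(seq(as.Date("2010-01-01"), by = "quarter",
                            length.out = 40)),
  y = 50 + 0.2 * (0:39) + rep(c(5, -3, -4, 2), times = 10) + rnorm(40)
) %>%
  as_tsibble(index = Quarter)

# Hold out the last two years (8 quarters) as the test set.
train <- toy %>% slice(1:(n() - 8))

# Fit the four benchmark methods to the training set.
fit <- train %>%
  model(
    mean   = MEAN(y),
    naive  = NAIVE(y),
    snaive = SNAIVE(y),
    drift  = RW(y ~ drift())
  )

# Forecast the held-out period and compare against the full series.
fc  <- fit %>% forecast(h = "2 years")
acc <- accuracy(fc, toy) %>% select(.model, RMSE, MAPE)

print(acc)
```

Passing the full series to accuracy() lets it match each forecast to the corresponding held-out observation, so the reported measures are test set errors rather than training set errors.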