Statistics 435, Statistics 711

Makeup Assignment 2

5 October 2020

1.  The file ECommerceSales.txt gives quarterly U.S. E-Commerce retail sales in millions of dollars.  The file includes a quarter variable; outlier dummies; and Shift, a dummy variable which is equal to 0 through the third quarter of 2008 and 1 thereafter.

(a)  Form separate plots for sales and the log of sales.  Describe each of these in detail, paying attention to important features.  

(b)  Fit a multiplicative decomposition model to the quarterly sales data.  Include trend and seasonal structure, the Shift variable, and outlier dummies.  Describe the fitted model and explain in detail why Shift and the outlier dummies are included in the model.

(c)  Estimate the seasonal indices.  Tabulate and plot the estimates, and discuss thoroughly.  Why does the observed seasonal pattern occur?

(d)  Perform a thorough residual analysis, including a normal quantile plot of the residuals, a plot of the residuals vs. time, a residual acf plot, and estimation of the spectral density of the residuals.  Give careful discussion of what each of these residual diagnostics reveals.

(e)  The file also contains variables labelled QuarterDays, Salesperday, and logSalesperday.  

(i)  Explain why each of these variables has been included in the data frame.

(ii)  Repeat parts (b), (c), and (d) with the response being the log of sales per day.

(iii)  Do the results for this modelling differ from those found in parts (b) and (c)?  That is, is there an improvement resulting from modelling sales per day per quarter, rather than sales per quarter?

(f)  Write a brief summary of the findings from the above analysis of the E-Commerce sales data.  [The summary should be interpretive, rather than a summary of what you did.]
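
As a rough illustration of the workflow in parts (b) through (d), the sketch below fits a log-linear (multiplicative) decomposition to a synthetic quarterly series, not to ECommerceSales.txt. The names logsales, shift, and quarter, the quadratic trend, the timing of the shift, and the simulated seasonal effects are all assumptions made for the example. The actual analysis should use the variables in the file, including its outlier dummies, which are omitted here.

```r
# Synthetic quarterly data; this is NOT the ECommerceSales.txt series.
set.seed(1)
n <- 80                                      # 20 years of quarters
time <- 1:n
quarter <- factor(rep(1:4, length.out = n))
shift <- as.numeric(time > 35)               # hypothetical level-shift dummy
true_seas <- c(-0.05, -0.03, -0.04, 0.12)    # assumed log-scale effects, sum to 0
logsales <- 8 + 0.04 * time - 0.15 * shift +
  true_seas[as.integer(quarter)] + rnorm(n, sd = 0.02)

# Sum-to-zero contrasts: seasonal coefficients are deviations from the average
contrasts(quarter) <- contr.sum(4)
fit <- lm(logsales ~ poly(time, 2) + shift + quarter)

# Three seasonal coefficients are estimated; the fourth is minus their sum
b <- coef(fit)[c("quarter1", "quarter2", "quarter3")]
seas_log <- c(b, -sum(b))
seas_index <- exp(seas_log)                  # multiplicative seasonal indices
names(seas_index) <- paste0("Q", 1:4)
print(round(seas_index, 3))

# Residual diagnostics of the kind requested in part (d)
res <- resid(fit)
qqnorm(res); qqline(res)                     # normal quantile plot
plot(time, res, type = "l")                  # residuals vs. time
acf(res)                                     # residual autocorrelations
spectrum(res, spans = c(3, 3))               # smoothed periodogram estimate
```

With sum-to-zero contrasts the log-scale seasonal effects sum to zero, so the indices have geometric mean 1; an index of, say, 1.10 would mean that quarter runs about 10 percent above the trend level.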

2.  The Ontario gas demand series is studied on pages 52–63 in the 23 January notes.

(a)  Fit model 2 on page 55 of the 23 January notes.  Present a thorough analysis of the residuals from this model, including construction of a spectral density plot of the residuals.

Interpret all the residual plots.  Precisely what do you learn from the residual spectral plot?

(b)  Repeat part (a) for model 5 on page 62 of the 23 January notes.

(c)  How do the residual spectral plots in (a) and (b) differ?  Explain carefully.

(d)  In this part you are asked to construct model 5 for data covering the years 1960 to 1974 and forecast the year 1975.  That is, withhold the last year of data to fit the model, and then forecast.  

To do this you will first need to augment the data frame, which has the name ontgas in the 23 January notes.  Define the variables time, fmonth, c432, and s432, as in the notes.  Then use the following R command to augment the data frame:

ontgas<-data.frame(ontgas,time,fmonth,c432,s432)
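
For orientation, here is a minimal sketch of how the four variables might be built before the augmenting command is run. It rests on assumptions that should be checked against the notes: that the series is monthly and starts in January 1960 (192 observations through December 1975), and that c432 and s432 are the cosine and sine at frequency 0.432 cycles per month, as their names suggest. The single-column ontgas created here is a stand-in, not the real data frame, and the definitions in the notes take precedence.

```r
# Stand-in for the real ontgas data frame (simulated values, for illustration only)
set.seed(3)
n <- 192                                   # assumed: monthly, Jan 1960 to Dec 1975
ontgas <- data.frame(loggasdemand = rnorm(n))

time <- 1:n                                # time index 1, 2, ..., n
fmonth <- factor(rep(1:12, length.out = n))  # month as a factor, January = 1

# Assumed definitions: trigonometric pair at 0.432 cycles per month
c432 <- cos(2 * pi * 0.432 * time)
s432 <- sin(2 * pi * 0.432 * time)

# The augmenting command from the assignment
ontgas <- data.frame(ontgas, time, fmonth, c432, s432)
str(ontgas)
```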

The following command will fit model 5 with a fourth-degree polynomial for trend estimation to data covering the years 1960 to 1974:

model54<-lm(loggasdemand~poly(time,4)+obs125+fmonth+c348+s348+c432+s432,data=ontgas[1:180,])

In this command, ontgas is now the name of the augmented data frame, and the “4” in the model name model54 indicates that the trend is estimated by a fourth-degree polynomial.

Next, use this model to forecast gas demand for the year 1975.  To do this use the following command:

forecast4<-exp(predict(model54,newdata=ontgas[181:192,]))

The exponentiation places the forecasts on the demand scale, rather than the log demand scale.

Next, refit the same model with polynomial degrees 2, 3, 5, 6, 7, and 8 and form the forecasts of 1975 gas demand for each.

Finally, compare the seven forecasts of 1975 demand with both a table and a plot.  

Discuss in detail what you’ve found.  Which forecasts are best and which are worst?  What do the results suggest?
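
One possible way to organize the seven fits is a loop over the polynomial degree, sketched below on synthetic monthly data. The data frame toy, its simulated trend and seasonal pattern, and the reduced model (trend plus fmonth only) are assumptions for illustration. With the real data the formula would be that of model 5, and the table and plot would show the actual 1975 values.

```r
# Synthetic monthly log-demand; this is NOT the Ontario gas series.
set.seed(2)
n <- 192
time <- 1:n
fmonth <- factor(rep(1:12, length.out = n))
loggasdemand <- 5 + 0.01 * time - 0.00002 * time^2 +
  0.3 * cos(2 * pi * time / 12) + rnorm(n, sd = 0.05)
toy <- data.frame(loggasdemand, time, fmonth)

fit_rows <- 1:180                          # estimation period
new_rows <- 181:192                        # withheld final year
degrees <- 2:8

# One column of back-transformed forecasts per polynomial degree
forecasts <- sapply(degrees, function(d) {
  fit <- lm(loggasdemand ~ poly(time, d) + fmonth, data = toy[fit_rows, ])
  exp(predict(fit, newdata = toy[new_rows, ]))
})
dimnames(forecasts) <- list(month.abb, paste0("deg", degrees))

# Compare with the withheld values on the original (demand) scale
actual <- exp(toy$loggasdemand[new_rows])
rmse <- sqrt(colMeans((forecasts - actual)^2))
print(round(forecasts, 1))                 # table of forecasts
print(round(rmse, 2))                      # forecast error by degree

# Plot the actual values and all seven forecast paths
matplot(1:12, cbind(actual, forecasts), type = "l", lty = 1, col = 1:8,
        xlab = "Month of withheld year", ylab = "Demand (synthetic)")
legend("topleft", legend = c("actual", colnames(forecasts)),
       col = 1:8, lty = 1, cex = 0.7)
```

Root mean squared error is only one way to rank the forecasts; looking at where each forecast path departs from the withheld values is likely to be at least as informative.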