STATS 786: Solutions
Department of Statistics
Consider the given model:
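$$y_t = \beta_0 + \beta_1 t^2 + y_{t-12} + \varepsilon_t.$$
If we apply both seasonal and first differencing:
$$\begin{aligned}
(1 - B)(1 - B^{12}) y_t &= (1 - B)(y_t - y_{t-12}) \\
&= (1 - B)(\beta_0 + \beta_1 t^2 + \varepsilon_t) \\
&= \beta_1 t^2 - \beta_1 (t - 1)^2 + \varepsilon_t - \varepsilon_{t-1} \\
&= -\beta_1 + 2\beta_1 t + \varepsilon_t - \varepsilon_{t-1},
\end{aligned}$$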
so the series still contains a linear trend after both differences have been applied. These two differences are therefore not sufficient to make the series stationary, and another first difference is needed to obtain a stationary time series.
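The effect of the additional first difference can be verified directly. The following worked step applies $(1 - B)$ once more to the doubly differenced series derived above:

```latex
\begin{aligned}
(1 - B)^2 (1 - B^{12}) y_t
  &= (1 - B)\left(-\beta_1 + 2\beta_1 t + \varepsilon_t - \varepsilon_{t-1}\right) \\
  &= 2\beta_1 t - 2\beta_1 (t - 1) + \varepsilon_t - 2\varepsilon_{t-1} + \varepsilon_{t-2} \\
  &= 2\beta_1 + \varepsilon_t - 2\varepsilon_{t-1} + \varepsilon_{t-2}.
\end{aligned}
```

The result has constant mean $2\beta_1$ and is a finite moving average of white noise, so it no longer depends on $t$.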
The approximate sampling distribution of the autocorrelations $r_k$ from a white noise series is
$$r_k \sim N\left(0, \frac{1}{T}\right), \quad k = 1, 2, \ldots,$$
where $T$ is the length of the time series.
The 95% confidence bounds for the autocorrelations are given by
$$\pm \frac{1.96}{\sqrt{T}} = (-0.0877, 0.0877).$$
The first three sample autocorrelations are outside this confidence interval.
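As a quick numerical check, the bounds can be reproduced in R. The series length is not stated in this solution, so `T_len = 500` below is an assumption, chosen only because it reproduces the stated bounds.

```r
# Assumed series length (not stated in the solution); chosen to match +/-0.0877.
T_len <- 500

# Approximate 95% bounds for white-noise autocorrelations: +/- 1.96 / sqrt(T).
bound <- 1.96 / sqrt(T_len)
acf_bounds <- round(c(-bound, bound), 4)
print(acf_bounds)
```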
Moving average models are always stationary. An MA(2) model
$$y_t = c + \varepsilon_t + \theta_1 \varepsilon_{t-1} + \theta_2 \varepsilon_{t-2},$$
where $c$, $\theta_1$ and $\theta_2$ are constants and $\varepsilon_t$ is a white noise series, is invertible if
$$\theta_1 + \theta_2 > -1, \quad \theta_1 - \theta_2 < 1, \quad |\theta_2| < 1.$$
For the given example, $\theta_1 = 1.2$ and $\theta_2 = 0.8$:
$$\theta_1 + \theta_2 = 2 > -1, \quad \theta_1 - \theta_2 = 0.4 < 1, \quad |\theta_2| = 0.8 < 1.$$
All three parameter restrictions are fulfilled by this model. Hence it is a stationary and invertible model.
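The same conclusion can be cross-checked through the roots of the MA polynomial $1 + \theta_1 z + \theta_2 z^2$: the model is invertible when all roots lie outside the unit circle. A minimal sketch in base R, using the coefficients from the example (the variable names are illustrative):

```r
# MA(2) coefficients from the example.
theta1 <- 1.2
theta2 <- 0.8

# Roots of the MA polynomial 1 + theta1 * z + theta2 * z^2
# (polyroot takes coefficients in increasing order of power).
ma_roots <- polyroot(c(1, theta1, theta2))

# The model is invertible when every root has modulus greater than 1.
root_moduli <- Mod(ma_roots)
invertible <- all(root_moduli > 1)
print(root_moduli)
print(invertible)
```

Both roots are complex with modulus $\sqrt{1/0.8} \approx 1.118$, which agrees with the parameter restrictions.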
AIC represents a trade-off between the error sum of squares and the number of parameters in the model (a penalty for the complexity of the model). If the error sum of squares decreases by only a small amount when a new term is included in the model, the penalty for the extra parameter outweighs the improvement in fit and the value of AIC will increase.
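To make the trade-off concrete, the sketch below assumes the least-squares form $\text{AIC} = T\log(\text{SSE}/T) + 2(k + 2)$, where $k$ is the number of predictors. This particular form is an assumption, since the solution does not state which definition of AIC is used. Adding one term changes the error sum of squares from $\text{SSE}_0$ to $\text{SSE}_1$ and $k$ to $k + 1$, so that:

```latex
\Delta\text{AIC}
  = T \log\!\left(\frac{\text{SSE}_1}{\text{SSE}_0}\right) + 2 > 0
\quad\Longleftrightarrow\quad
\frac{\text{SSE}_1}{\text{SSE}_0} > e^{-2/T}.
```

For example, with $T = 100$ the new term must reduce the error sum of squares by more than about 2% for AIC to fall.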
The forecast function for the ETS(A,N,N) model is
$$\hat{y}_{T+h|T} = \ell_T, \quad h = 1, 2, \ldots,$$
where $\ell_T$ is the level estimated at time $T$. Hence the forecast function for the ETS(A,N,N) model is flat. The $h$-step-ahead forecast variance for ETS(A,N,N) is given by $\sigma^2[1 + \alpha^2(h - 1)]$. Hence the width of the prediction interval increases with the forecast horizon.
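The widening of the interval can be illustrated numerically. The values of `level`, `alpha` and `sigma` below are hypothetical, chosen only for illustration, and the helper name `ann_interval` is not from the source.

```r
# 95% prediction interval for ETS(A,N,N) at horizon h, assuming Gaussian errors.
# level, alpha and sigma are hypothetical illustration values.
ann_interval <- function(level, alpha, sigma, h) {
  fc_sd <- sigma * sqrt(1 + alpha^2 * (h - 1))  # h-step forecast std. deviation
  data.frame(
    h = h,
    forecast = level,                # flat forecast function
    lower = level - 1.96 * fc_sd,
    upper = level + 1.96 * fc_sd
  )
}

intervals <- ann_interval(level = 100, alpha = 0.5, sigma = 2, h = 1:4)
print(intervals)
```

The point forecast stays at the level for every horizon, while the distance between `lower` and `upper` grows with `h`.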
The time plot shows monthly data having a strong seasonal pattern. Therefore, we expect to see spikes in the ACF plot at lags which are multiples of 12. However, the given ACF plot shows spikes roughly at multiples of 10.
a ● The series has an upward trend.
● There is a strong seasonal pattern. The seasonal variation increases proportionately to the level of the series (i.e., multiplicative seasonality).
● There is a sudden drop in 2020 Q2 due to the COVID-19 lockdown in New Zealand.
● The sales are highest in Q4 due to Christmas, and the lowest seems to be around Q2–Q3.
● The sudden drop in 2020 Q2 is also visible in the seasonal and subseries plots.
● The average sales and growth rates are similar in Q1–Q3. The average sales for Q2 may be slightly affected by the COVID-19 lockdown.
b ● An STL decomposition has been performed on the sales data, where the panels of Figure 4 show the estimated trend-cycle, seasonal and remainder components.
● STL is an additive decomposition method. A log transformation has been applied before the decomposition to account for the multiplicative seasonality.
● Setting 1: The trend component shows a drop towards the end of the time series.
● Setting 1: The seasonal component changes slightly with time until 2015. The shape of the seasonal pattern has changed after 2015.
● Setting 2: The trend component has a smooth upward trend.
● Setting 2: The seasonal component changes slightly with time, and the seasonal pattern has not changed much compared to setting 1.
● The remainder component in both settings shows a large drop in 2020 Q2 due to the COVID-19 lockdown.
● Setting robust = TRUE has completely moved the impact of this extreme observation to the remainder component.
For analyzing the long-term trend, I would prefer setting 2. Even though there is some uncertainty about future COVID-19 lockdowns, the long-term trend is quite unlikely to decrease towards the end of the series in the way that setting 1 shows.
a ● The time series contains trend, seasonal and cyclic patterns.
● There are drops around 1991, 2001 and 2007 due to recessions, and in 2020 due to COVID-19.
b i Both models have additive seasonality and an additive trend; in the damped model, the trend is damped. The error is additive in the additive model and multiplicative in the damped model.
ii ETS(A,A,A) model
● $\hat{\alpha}$ is moderately large. The level often changes with time.
● $\hat{\beta}$ is quite small. The slope changes with time, but only slightly.
● $\hat{\gamma}$ is very small. The seasonal component hardly changes with time.
● The remainder component shows a large residual due to the disruption of the COVID-19 outbreak.
iii fit %>% select(additive) %>% gg_tsresiduals()
fit %>% select(additive) %>% augment() %>% features(.resid, ljung_box, lag = 24, dof = 16)
iv fit %>% select(damped) %>% gg_tsresiduals()
fit %>% select(damped) %>% augment() %>% features(.innov, ljung_box, lag = 24, dof = 17)
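For reference, the same kind of test can be sketched in base R with `Box.test()`, where `fitdf` plays the role of `dof`. The residuals below are simulated white noise, not the residuals of the fitted models, so the output only illustrates the mechanics of the test.

```r
# Simulated stand-in for model residuals (NOT the residuals from the fitted models).
set.seed(786)
resid_sim <- rnorm(240)

# Ljung-Box test with 24 lags; fitdf = 16 removes the degrees of freedom
# used by the estimated parameters, leaving 24 - 16 = 8.
lb <- Box.test(resid_sim, lag = 24, type = "Ljung-Box", fitdf = 16)
print(lb)
```

A large p-value would indicate no evidence of remaining autocorrelation in the residuals.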
v I would choose the damped model.
● The forecasts from the additive model continue an estimated upward trend, although the series seems to flatten out towards the end. In contrast, the damped model generates forecasts with a damped trend.
● The shape of the seasonal pattern differs markedly between the two models. The forecasts from the additive model have one seasonal spike, whereas those from the damped model have two. The latter is more aligned with the seasonal pattern that we can observe towards the end of the time series.
● The prediction intervals are also wider for the damped model, indicating considerable uncertainty in these forecasts. This is more realistic given the current COVID-19 outbreak.
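$$\begin{aligned}
y_t &= (\ell_{t-1} + 0.879 b_{t-1} + s_{t-12})(1 + \varepsilon_t) \\
\ell_t &= \ell_{t-1} + 0.879 b_{t-1} + 0.754(\ell_{t-1} + 0.879 b_{t-1} + s_{t-12})\varepsilon_t \\
b_t &= 0.879 b_{t-1} + 0.217(\ell_{t-1} + 0.879 b_{t-1} + s_{t-12})\varepsilon_t \\
s_t &= s_{t-12} + 0.24(\ell_{t-1} + 0.879 b_{t-1} + s_{t-12})\varepsilon_t
\end{aligned}$$
where $\varepsilon_t \sim N(0, \sigma^2)$ and $\hat{\sigma}^2$ is small and shown in the output as zero due to rounding.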
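One step of these recursions can be sketched in R as follows. The smoothing and damping parameters are the fitted values quoted above; the starting states and the error `eps` are hypothetical values used only to exercise the update, and `ets_madA_step` is an illustrative helper name.

```r
# One update of the fitted damped model with multiplicative errors.
# phi, alpha, beta, gamma are the fitted values quoted in the equations.
ets_madA_step <- function(level, slope, season_lag12, eps,
                          phi = 0.879, alpha = 0.754,
                          beta = 0.217, gamma = 0.24) {
  mu <- level + phi * slope + season_lag12   # one-step mean of y_t
  list(
    y      = mu * (1 + eps),
    level  = level + phi * slope + alpha * mu * eps,
    slope  = phi * slope + beta * mu * eps,
    season = season_lag12 + gamma * mu * eps
  )
}

# Hypothetical states and error, for illustration only.
next_state <- ets_madA_step(level = 3.0, slope = 0.01,
                            season_lag12 = 0.05, eps = 0.002)
print(next_state)
```

With `eps = 0` the level moves by the damped slope only, the slope shrinks by the factor 0.879, and the seasonal state is unchanged.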
a ● I would apply both seasonal and first differencing to obtain a stationary time series.
● The time plot of the non-seasonally differenced series shows changes in level with time, and the ACF decays slowly as the lags increase.
● The plot of the seasonally differenced series shows a strong seasonal pattern, and the autocorrelations at lags which are multiples of 12 decay slowly.
● Due to these features, both the seasonally differenced series and the non-seasonally differenced series appear to be non-stationary.
b i The partial autocorrelations at lags which are multiples of 12 decay slowly. The autocorrelations at seasonal lags show a sharp cut-off after lag 24. These features resemble an ARIMA(0,1,0)(0,1,2)$_{12}$ model.
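ii Denote the employees time series by $y_t$.
$$(1 - B)(1 - B^{12}) y_t = (1 - 0.5482B - 0.1959B^2)\varepsilon_t,$$
where $\varepsilon_t \sim N(0, 6.721 \times 10^{-5})$.
$$\begin{aligned}
y_t &= y_{t-1} + y_{t-12} - y_{t-13} - 0.5482\varepsilon_{t-1} - 0.1959\varepsilon_{t-2} + \varepsilon_t \\
\hat{y}_{T+1|T} &= y_T + y_{T-11} - y_{T-12} - 0.5482\hat{\varepsilon}_T - 0.1959\hat{\varepsilon}_{T-1} \\
&= 3.13 + 3.03 - 3.05 - 0.5482 \times (-0.0000886) - 0.1959 \times 0.0153 \\
&= 3.11.
\end{aligned}$$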
The forecast for April 2021 is 3.11.
Assuming that the forecast errors are Gaussian, the 95% prediction interval is
$$\hat{y}_{T+1|T} \pm 1.96\hat{\sigma} = 3.11 \pm 1.96\sqrt{6.721 \times 10^{-5}} = (3.09, 3.13).$$
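The arithmetic for the point forecast and the interval can be checked in R. The sign of the most recent residual is inferred from the signs used in the worked calculation (an assumption), and the interval is centred on the forecast rounded to two decimals, as in the solution. The variable names are illustrative.

```r
# Observations and residuals used in the worked calculation.
y_T     <- 3.13
y_T_m11 <- 3.03
y_T_m12 <- 3.05
e_T     <- -0.0000886  # sign inferred from the worked calculation (assumption)
e_T_m1  <- 0.0153
sigma2  <- 6.721e-05   # estimated innovation variance

# One-step forecast from the fitted model.
fc <- y_T + y_T_m11 - y_T_m12 - 0.5482 * e_T - 0.1959 * e_T_m1

# 95% prediction interval centred on the rounded forecast.
fc_rounded <- round(fc, 2)
pi_95 <- fc_rounded + c(-1, 1) * 1.96 * sqrt(sigma2)
print(fc_rounded)
print(round(pi_95, 2))
```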