8.1, 8.5, 8.6, 8.7, 8.8, 8.9
8.1 Consider the number of pigs slaughtered in Victoria, available in the aus_livestock dataset.
8.1a Use the ETS() function to estimate the equivalent model for simple exponential smoothing. Find the optimal values of α and ℓ, and generate forecasts for the next four months.
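The optimal values are α = 0.325 and ℓ₀ = 96141. The forecast is the same for each of the next four months (95182), because an ETS(A,N,N) model produces a flat forecast equal to the last estimated level.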
## Series: Count
## Model: ETS(A,N,N)
## Smoothing parameters:
## alpha = 0.324993
##
## Initial states:
## l[0]
## 96141.05
##
## sigma^2: 87443648
##
## AIC AICc BIC
## 13583.23 13583.27 13596.17
## [1] 95182.4
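For intuition about what ETS(A,N,N) estimates, the level recursion behind simple exponential smoothing can be sketched in a few lines. This is a minimal illustration in Python rather than the R workflow that produced the output above; the function name `ses_forecast` and the toy series are hypothetical, and only α and ℓ₀ are taken from the report.

```python
def ses_forecast(y, alpha, level0):
    """Run the simple exponential smoothing recursion and return the final level.

    l_t = alpha * y_t + (1 - alpha) * l_{t-1}
    Every h-step-ahead point forecast equals the final level.
    """
    level = level0
    for obs in y:
        level = alpha * obs + (1 - alpha) * level
    return level


# Hypothetical toy counts (NOT the aus_livestock data); alpha and l0 come from the report above.
toy_counts = [98000.0, 101000.0, 94000.0, 90000.0]
flat_forecast = ses_forecast(toy_counts, alpha=0.324993, level0=96141.05)

# Forecasts for h = 1..4 are identical because the model has no trend or seasonal term.
forecasts = [flat_forecast] * 4
print(forecasts)
```

Setting alpha close to 1 makes the final level track the last observation, while alpha close to 0 leaves it near the initial level.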
8.1b Compute a 95% prediction interval for the first forecast. Compare your interval with the interval produced by R.
The first forecast has mean 95183. Taking the variance as 87367846, the sd is its square root, 9347. The 95% prediction interval is mean ± 1.96 × sd = (76863, 113503).
This is very close to the interval calculated by R, though slightly narrower: the report above lists sigma^2 = 87443648 (sd ≈ 9351), and using that value reproduces R's interval.
## <hilo[1]>
## [1] [76854.52, 113510.3]95
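The hand calculation can be checked with a short script. This is a sketch in Python, not the R code behind the output; it assumes normally distributed errors and uses the point forecast and sigma^2 printed in the report above. The variable names are illustrative.

```python
import math
from statistics import NormalDist

point_forecast = 95182.4   # first forecast printed above
sigma2 = 87443648          # sigma^2 from the ETS(A,N,N) report

# 97.5% quantile of the standard normal distribution (about 1.96).
z = NormalDist().inv_cdf(0.975)

# For a one-step forecast from ETS(A,N,N), the forecast variance is sigma^2.
sd = math.sqrt(sigma2)
lower = point_forecast - z * sd
upper = point_forecast + z * sd
print(round(lower, 2), round(upper, 2))
```

The result should agree with the hilo output to the precision shown, which suggests R builds the interval from the reported sigma^2.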
8.5 Data set global_economy contains the annual Exports from many countries. Select one country to analyse.
8.5a Plot the Exports series and discuss the main features of the data.
I chose the United States. The series trends upward overall, with occasional reversals. The downward reversals may represent a cycle, but if so the cycle is irregular.
8.5b Use an ETS(A,N,N) model to forecast the series, and plot the forecasts.
## Series: Exports
## Model: ETS(A,N,N)
## Smoothing parameters:
## alpha = 0.9998995
##
## Initial states:
## l[0]
## 4.76618
##
## sigma^2: 0.4075
##
## AIC AICc BIC
## 183.2500 183.7028 189.3791
8.5c Compute the RMSE values for the training data.
The training RMSE is 0.627.
## # A tibble: 1 × 11
##   Country       .model     .type    ME  RMSE   MAE   MPE  MAPE  MASE RMSSE  ACF1
##   <fct>         <chr>      <chr> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
## 1 United States "ETS(Expo… Trai… 0.125 0.627 0.465  1.37  5.10 0.990 0.992 0.239
8.5d Compare the results to those from an ETS(A,A,N) model. (Remember that the trended model is using one more parameter than the simpler model.) Discuss the merits of the two forecasting methods for this data set.
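The training RMSE is slightly lower at 0.615 (vs 0.627). However, the AICc, which penalizes the additional trend parameters, is higher for the trended model (186.20 vs 183.70), so the small improvement in fit does not by itself justify the extra complexity.
The AAN model accounts for the trend, which matches almost exactly a line from the first to the last data point (the green line). With alpha close to 1, the ANN model is effectively a naive forecast that carries the last observation forward, so it follows the recent up-and-down movement of the data; in the short term it is probably better because the data is in a down cycle. AAN better accounts for the overall trend, which is up.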
## Series: Exports
## Model: ETS(A,A,N)
## Smoothing parameters:
## alpha = 0.9999
## beta = 0.0001088322
##
## Initial states:
## l[0] b[0]
## 4.669623 0.1158518
##
## sigma^2: 0.4067
##
## AIC AICc BIC
## 185.0267 186.2032 195.2420
## # A tibble: 1 × 11
##   Country       .model  .type     ME  RMSE   MAE     MPE  MAPE  MASE RMSSE  ACF1
##   <fct>         <chr>   <chr>  <dbl> <dbl> <dbl>   <dbl> <dbl> <dbl> <dbl> <dbl>
## 1 United States "ETS(E… Trai… 0.0108 0.615 0.466 -0.0570  5.19 0.992 0.973 0.238
8.5e Compare the forecasts from both methods. Which do you think is best?
It depends on the business case. Because the data cycles substantially with an inconsistent period, very short-term decisions (such as hiring for next summer) may be better served by ignoring the trend and using ANN (note, e.g., that the overall trend is up but the current trend is down). For a longer horizon (say, building facilities), I would be more inclined to trust the trend (AAN).
8.5f Calculate a 95% prediction interval for the first forecast for each model, using the RMSE values and assuming normal errors. Compare your intervals with those produced using R.
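ANN: mean = 11.891, sd = 0.6383, 95% PI = (10.64, 13.14).
AAN: mean = 12.007, sd = 0.6377, 95% PI = (10.76, 13.26).
Here sd is the square root of sigma^2 from each report, and the resulting intervals match those of R. Using the training RMSE values as the exercise asks (0.627 and 0.615) gives slightly narrower intervals, (10.66, 13.12) and (10.80, 13.21); this is consistent with sigma^2 being adjusted for the number of estimated parameters while the RMSE is not. The two models' intervals are very similar in width. The trended model's interval is marginally narrower (sigma^2 of 0.4067 vs 0.4075) and is shifted upward by the trend.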
## [1] 11.89068
## [1] 12.00661
## <hilo[1]>
## [1] [10.63951, 13.14186]95
## <hilo[1]>
## [1] [10.75667, 13.25656]95
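To compare the two ways of building the interval, the sketch below computes each model's 95% interval once from the reported sigma^2 and once from the training RMSE. It is written in Python as an independent check, assumes normal errors, and copies its inputs from the output above; `normal_interval` is a hypothetical helper name.

```python
import math
from statistics import NormalDist

# 97.5% quantile of the standard normal distribution (about 1.96).
Z = NormalDist().inv_cdf(0.975)


def normal_interval(mean, sd, z=Z):
    """Return (lower, upper) for mean +/- z * sd."""
    return (mean - z * sd, mean + z * sd)


# Values copied from the output above: (point forecast, sigma^2, training RMSE).
models = {
    "ANN": (11.89068, 0.4075, 0.627),
    "AAN": (12.00661, 0.4067, 0.615),
}

intervals = {}
for name, (mean, sigma2, rmse) in models.items():
    intervals[name] = {
        "sigma": normal_interval(mean, math.sqrt(sigma2)),  # based on the reported sigma^2
        "rmse": normal_interval(mean, rmse),                # based on the training RMSE
    }
    print(name, intervals[name])
```

The sigma^2-based intervals should line up with the hilo output, while the RMSE-based ones come out a little narrower for both models.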
8.6 Forecast the Chinese GDP from the global_economy data set using an ETS model. Experiment with the various options in the ETS() function to see how much the forecasts change with damped trend, or with a Box-Cox transformation. Try to develop an intuition of what each is doing to the forecasts.
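Damping the trend flattens the forecast: instead of continuing in a straight line, it levels off toward a constant as the horizon grows. The lower the phi, the stronger the damping. A Box-Cox transformation changes the shape of the forecasts and the prediction intervals; after back-transformation the intervals are no longer symmetric around the point forecast.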
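The effect of phi can be seen directly from the damped-trend point forecast, level + (phi + phi^2 + ... + phi^h) × slope. The sketch below is a Python illustration with a hypothetical final level and slope; it is not fitted to the Chinese GDP series, and `damped_trend_forecast` is an illustrative name.

```python
def damped_trend_forecast(level, slope, phi, h):
    """Point forecast h steps ahead: level + (phi + phi^2 + ... + phi^h) * slope."""
    damp_sum = sum(phi ** k for k in range(1, h + 1))
    return level + damp_sum * slope


# Hypothetical final level and slope, chosen only for illustration.
level, slope = 100.0, 10.0

paths = {}
for phi in (1.0, 0.9, 0.8):
    # phi = 1 is the undamped (linear) trend; smaller phi flattens the path sooner.
    paths[phi] = [damped_trend_forecast(level, slope, phi, h) for h in (1, 5, 20)]
    print(phi, [round(v, 2) for v in paths[phi]])
```

For phi below 1 the forecast approaches level + phi / (1 - phi) × slope as h grows, which is why lower values of phi give a flatter long-run forecast.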
8.7 Find an ETS model for the Gas data from aus_production and forecast the next few years. Why is multiplicative seasonality necessary here? Experiment with making the trend damped. Does it improve the forecasts?
The amplitude of the seasonal swings grows as the level of the series rises, so a multiplicative seasonal component is needed. The additive model appears better, but this is because its prediction intervals do not account for the heteroskedasticity. A Box-Cox transformation improves the model as well.
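The difference between the two seasonal forms can be illustrated with made-up numbers. In this Python sketch the levels and seasonal patterns are hypothetical (not the Gas data): an additive pattern gives the same peak-to-trough swing at every level, while a multiplicative pattern gives a swing that scales with the level.

```python
# Hypothetical rising level and fixed quarterly patterns (NOT the Gas data).
levels = [20.0, 40.0, 80.0, 160.0]              # one level per year
additive_season = [-5.0, 5.0, 8.0, -8.0]        # same units as the series, sums to 0
multiplicative_season = [0.8, 1.2, 1.3, 0.7]    # proportions of the level, averages to 1


def swing(values):
    """Peak-to-trough range within one year."""
    return max(values) - min(values)


additive_swings = [swing([lvl + s for s in additive_season]) for lvl in levels]
multiplicative_swings = [swing([lvl * s for s in multiplicative_season]) for lvl in levels]

print(additive_swings)        # constant regardless of the level
print(multiplicative_swings)  # grows in proportion to the level
```

A series whose seasonal swings widen as its level rises, as described above, is therefore better matched by the multiplicative form.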
8.8a Recall your retail time series data (from Exercise 8 in Section 2.10). Why is multiplicative seasonality necessary for this series?
It appears that the seasonal variation increases over time, which suggests the need for a multiplicative model. Interestingly, an X-11 decomposition does not bear this out, as the variation appears to wane again toward the end of the series.
8.8b Apply Holt-Winters’ multiplicative method to the data. Experiment with making the trend damped.
The damped trend lowers the forecast seasonal cycles, particularly the second one.
8.8c Compare the RMSE of the one-step forecasts from the two methods. Which do you prefer?
They are very similar (1.3913 undamped vs 1.3955 damped). The RMSE for the undamped trend is slightly lower, so I see no reason to prefer the damped trend.
## # A tibble: 1 × 10
##   .model              .type      ME  RMSE   MAE    MPE  MAPE  MASE RMSSE    ACF1
##   <chr>               <chr>   <dbl> <dbl> <dbl>  <dbl> <dbl> <dbl> <dbl>   <dbl>
## 1 "ETS(Turnover ~ er… Trai… -0.0238  1.39 0.959 -0.441  4.71 0.401 0.408 -0.0249
## # A tibble: 1 × 10
##   .model                .type    ME  RMSE   MAE   MPE  MAPE  MASE RMSSE     ACF1
##   <chr>                 <chr> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>    <dbl>
## 1 "ETS(Turnover ~ erro… Trai… 0.119  1.40 0.963 0.429  4.68 0.403 0.410 -0.00772
8.8d Check that the residuals from the best method look like white noise. Now find the test set RMSE, while training the model to the end of 2010. Can you beat the seasonal naïve approach from Exercise 7 in Section 5.11?
The residuals show no particular pattern. The test RMSE for the ETS model is 5.01, which is lower than that of the seasonal naïve model (7.03).
## # A tibble: 1 × 10
##   .model   .type    ME  RMSE   MAE   MPE  MAPE  MASE RMSSE  ACF1
##   <chr>    <chr> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
## 1 SNAIVE() Test   6.05  7.03  6.11  15.1  15.2   NaN   NaN 0.664
## # A tibble: 1 × 10
##   .model                   .type    ME  RMSE   MAE   MPE  MAPE  MASE RMSSE  ACF1
##   <chr>                    <chr> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
## 1 "ETS(Turnover ~ error(\… Test   1.86  5.01  3.57  4.07  8.56   NaN   NaN 0.850
8.9 For the same retail data, try an STL decomposition applied to the Box-Cox transformed series, followed by ETS on the seasonally adjusted data. How does that compare with your best previous forecasts on the test set?
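The test RMSE for ETS on the STL-adjusted, Box-Cox transformed data is 1.55. This figure appears to be measured on the transformed scale, so it is not directly comparable with the test RMSEs above (5.01 and 7.03). The percentage errors point the other way: the MAPE is 22.6%, compared with 8.56% for the ETS model fitted to the original series. It does not improve on the previous best forecasts, and therefore we reject it.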
## # A tibble: 1 × 10
##   .model                   .type    ME  RMSE   MAE   MPE  MAPE  MASE RMSSE  ACF1
##   <chr>                    <chr> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
## 1 "ETS(trend ~ error(\"A\… Test  -1.27  1.55  1.28 -22.4  22.6   NaN   NaN 0.972