Exercises from Chapter 8 of Forecasting: Principles & Practice at https://otexts.com/fpp3/expsmooth-exercises.html

Exercise 8.1

Consider the number of pigs slaughtered in Victoria, available in the aus_livestock dataset.

(a) Use the ETS() function to estimate the equivalent model for simple exponential smoothing. Find the optimal values of α and ℓ0, and generate forecasts for the next four months.

First, a quick look at the raw data.

pigs_vic <- aus_livestock %>%
  filter(State == 'Victoria',
         Animal == 'Pigs')

pigs_vic %>%
  autoplot(Count)

I build an ANN simple exponential smoothing model below and call report() on it to see the selected values. We see that an alpha of 0.3221247 was selected, along with an initial level ℓ0 (shown under "Initial states") of 100,646.6.

# Estimate parameters
fit <- pigs_vic %>%
  model(ETS(Count ~ error("A") + trend("N") + season("N")))

# Output model fit
report(fit)
## Series: Count 
## Model: ETS(A,N,N) 
##   Smoothing parameters:
##     alpha = 0.3221247 
## 
##   Initial states:
##      l[0]
##  100646.6
## 
##   sigma^2:  87480760
## 
##      AIC     AICc      BIC 
## 13737.10 13737.14 13750.07
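
Since part (b) reuses these estimates, it can be handy to pull them out programmatically instead of reading the printed report; tidy() returns the estimated coefficients as a tibble. A minimal sketch:

# extract the estimated coefficients (alpha and the initial level l[0])
# as a tibble rather than reading them off the printed report
tidy(fit)

# e.g. just the smoothing parameter
tidy(fit) %>% filter(term == "alpha")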

Using the above model we forecast four months ahead. Since an ETS(A,N,N) model has no trend or seasonal component, every forecast equals the final estimated level, so the point forecast for each month is 95,186.56.

fc <- fit %>%
  forecast(h = 4)

fc
## # A fable: 4 x 6 [1M]
## # Key:     Animal, State, .model [1]
##   Animal State    .model                          Month             Count  .mean
##   <fct>  <fct>    <chr>                           <mth>            <dist>  <dbl>
## 1 Pigs   Victoria "ETS(Count ~ error(\"A\") +~ 2019 Jan N(95187, 8.7e+07) 95187.
## 2 Pigs   Victoria "ETS(Count ~ error(\"A\") +~ 2019 Feb N(95187, 9.7e+07) 95187.
## 3 Pigs   Victoria "ETS(Count ~ error(\"A\") +~ 2019 Mar N(95187, 1.1e+08) 95187.
## 4 Pigs   Victoria "ETS(Count ~ error(\"A\") +~ 2019 Apr N(95187, 1.1e+08) 95187.

(b) Compute a 95% prediction interval for the first forecast using ŷ ± 1.96s, where s is the standard deviation of the residuals. Compare your interval with the interval produced by R.

Calculated by hand from the fitted values, the prediction interval runs from 76,871.01 to 113,502.1.

# first predicted value
pred_1 <- fc$.mean[1]

# get residuals
aug_fit <- augment(fit)

# calc the SD of the residuals (named to avoid masking base::sd)
sd_resid <- sd(aug_fit$.resid)

# Calculate the 95% prediction interval bounds
pred_1 + (sd_resid * 1.96)
## [1] 113502.1
pred_1 - (sd_resid * 1.96)
## [1] 76871.01

The prediction interval using the hilo() function on our forecasts is similar but not exactly the same, ranging from 76,854.79 to 113,518.3.

# use hilo() to convert predictions into intervals
fc_hilo <- fc %>% hilo()

# Output model interval values
fc_hilo$`95%`[1]
## <hilo[1]>
## [1] [76854.79, 113518.3]95
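
The small gap comes from how the interval is computed: hilo() uses the model's estimated sigma^2 (the sigma^2 line in report() above) rather than the sample standard deviation of the residuals. A quick sketch, plugging in that sigma^2 value, approximately reproduces the hilo() bounds:

# rebuild the interval from the model's sigma^2 (from report()) instead
# of sd() of the residuals
sigma_model <- sqrt(87480760)
fc$.mean[1] + c(-1.96, 1.96) * sigma_model

Finally, a plot of the forecasts:
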
fc %>%
  autoplot(pigs_vic) +
  labs(y="Count", title="Pigs Slaughtered in Victoria, Australia - by month") +
  guides(colour = "none")

Exercise 8.5

Data set global_economy contains the annual Exports from many countries. Select one country to analyse.

ireland_econ <- global_economy %>%
  filter(Country == "Ireland")

(a) Plot the Exports series and discuss the main features of the data.

Looking at Ireland's export data, there is an upward trend that has only dipped significantly around the mid-2000s, but this was followed by a steep incline and a recovery that almost appears to have regained the earlier trend and momentum.

ireland_econ %>%
  autoplot(Exports)

(b) Use an ETS(A,N,N) model to forecast the series, and plot the forecasts.

Below I fit the model, forecast 6 years, and plot these forecasts.

# Estimate parameters
fit_ann <- ireland_econ %>%
  model(ETS(Exports ~ error("A") + trend("N") + season("N")))


# forecast 6 years
fc_ann <- fit_ann %>%
  forecast(h = 6)

fc_ann %>%
  autoplot(ireland_econ) +
  labs(y="Exports of good and services (% GDP)", title="Ireland's Exports, ETS(A,N,N)") +
  guides(colour = "none")

(c) Compute the RMSE values for the training data.

Using the accuracy() function I see an RMSE of 4.02662.

#show RMSE for model
accuracy(fit_ann)
## # A tibble: 1 x 11
##   Country .model           .type    ME  RMSE   MAE   MPE  MAPE  MASE RMSSE  ACF1
##   <fct>   <chr>            <chr> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
## 1 Ireland "ETS(Exports ~ ~ Trai~  1.56  4.03  2.94  2.25  4.64 0.983 0.991 0.287

(d) Compare the results to those from an ETS(A,A,N) model. (Remember that the trended model is using one more parameter than the simpler model.) Discuss the merits of the two forecasting methods for this data set.

For the ETS(A,A,N) model the RMSE is 3.738005. This is somewhat lower than the RMSE for the ETS(A,N,N) model, which makes it the favored model by this measure, though training RMSE alone flatters the model with the extra parameter; see the AICc comparison after the forecast plot below.

# Estimate parameters
fit_aan <- ireland_econ %>%
  model(ETS(Exports ~ error("A") + trend("A") + season("N")))


# forecast 6 years
fc_aan <- fit_aan %>%
  forecast(h = 6)


accuracy(fit_aan)
## # A tibble: 1 x 11
##   Country .model          .type    ME  RMSE   MAE    MPE  MAPE  MASE RMSSE  ACF1
##   <fct>   <chr>           <chr> <dbl> <dbl> <dbl>  <dbl> <dbl> <dbl> <dbl> <dbl>
## 1 Ireland "ETS(Exports ~~ Trai~ 0.345  3.74  2.80 -0.107  4.55 0.936 0.920 0.286
fc_aan %>%
  autoplot(ireland_econ) +
  labs(y="Exports of good and services (% GDP)", title="Ireland's Exports, ETS(A, A, N)") +
  guides(colour = "none")
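
On the exercise's hint about the extra parameter: training RMSE will almost always favour the more flexible model, so an information criterion such as AICc, which penalises the extra smoothing parameter, is a fairer tie-breaker. A minimal sketch, refitting both models in one mable so glance() lines them up:

# compare the models on information criteria rather than training RMSE;
# AICc penalises the AAN model's extra parameter
ireland_econ %>%
  model(
    ann = ETS(Exports ~ error("A") + trend("N") + season("N")),
    aan = ETS(Exports ~ error("A") + trend("A") + season("N"))
  ) %>%
  glance() %>%
  select(.model, AIC, AICc, BIC)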

(e) Compare the forecasts from both methods. Which do you think is best?

In reviewing the two visualizations of the forecasts above, the ETS(A, A, N) model better captures the upward trend of the export data, while the ETS(A, N, N) forecast is flat and likely undersells our Irish friends. Further, the lower RMSE of the ETS(A, A, N) model is another point in its favor, so that would be my selection with the data available.

(f) Calculate a 95% prediction interval for the first forecast for each model, using the RMSE values and assuming normal errors. Compare your intervals with those produced using R.

The ETS(A, N, N) model has a prediction interval from 112.67 to 127.35 (span 14.68), while the ETS(A, A, N) model has a prediction interval from 113.87 to 128.59 (span 14.72).

### ANN first
# first predicted value
pred_1 <- fc_ann$.mean[1]

# get residuals
aug_fit_ann <- augment(fit_ann)

# calc SD
sd_ann <- sd(aug_fit_ann$.resid)

# Calculate the 95% prediction intervals
pred_1 + (sd_ann * 1.96)
## [1] 127.3518
pred_1 - (sd_ann * 1.96)
## [1] 112.6736
### AAN second
# first predicted value
pred_1 <- fc_aan$.mean[1]

# get residuals
aug_fit_aan <- augment(fit_aan)

# calc SD
sd_aan <- sd(aug_fit_aan$.resid)

# Calculate the 95% prediction intervals
pred_1 + (sd_aan * 1.96)
## [1] 128.5897
pred_1 - (sd_aan * 1.96)
## [1] 113.8719

Next using the hilo() function we find the intervals again.

For the ETS(A, N, N) model this interval is 111.98 to 128.04 (span 16.06), and for the ETS(A, A, N) model it is 113.64 to 128.82 (span 15.18). These spans are a bit larger than the intervals calculated above, but quite similar considering the scale of the data; as in Exercise 8.1, R bases its intervals on the model's estimated sigma^2 rather than on the residual standard deviation.

### ANN first
# use hilo() to convert predictions into intervals
fc_hilo_ann <- fc_ann %>% hilo()

# Output model interval values
fc_hilo_ann$`95%`[1]
## <hilo[1]>
## [1] [111.981, 128.0444]95
### AAN second
# use hilo() to convert predictions into intervals
fc_hilo_aan <- fc_aan %>% hilo()

# Output model interval values
fc_hilo_aan$`95%`[1]
## <hilo[1]>
## [1] [113.6379, 128.8236]95

Exercise 8.6

Forecast the Chinese GDP from the global_economy data set using an ETS model. Experiment with the various options in the ETS() function to see how much the forecasts change with damped trend, or with a Box-Cox transformation. Try to develop an intuition of what each is doing to the forecasts.

[Hint: use a relatively large value of h when forecasting, so you can clearly see the differences between the various options when plotting the forecasts.]

First, a look at the raw data. There is a very steep upward trend starting around 2000, with few severe outliers and no seasonality. This means we won't need the Holt-Winters methods ((A,A), (A,M), or (A_d,M)) that suit data with seasonality. We certainly do need a model that captures the trend, so simple exponential smoothing (N,N) isn't best either. That leaves Holt's linear method (A,N) and the additive damped trend method (A_d,N) as the possible options.

china_gdp <- global_economy %>%
  filter(Country == "China")

china_gdp %>%
  autoplot(GDP) +
  labs(title = "China's GDP - by year", y = "GDP")

Below I model and plot Holt's linear method and three versions of the additive damped trend method with varying phi values. Our textbook states that phi is rarely less than 0.8, since the damping has a very strong effect for smaller values, and rarely more than 0.98, since the damping becomes hard to see as phi approaches 1.

Looking at the models graphed below, the Box-Cox transformation has essentially logged the data (lambda = -0.034). The ETS Box-Cox model is the most optimistic, expecting runaway growth even steeper than the historical trend once its forecasts are back-transformed. As expected, the Holt linear forecast continues the trend as-is with no change in slope. The three damped methods show that higher values of phi correspond to slighter damping, such as the green line with phi = 0.95. My gut would be to choose the additive damped model with phi = 0.95, because it seems to capture the upward trend while acknowledging that most things cannot keep growing at the same rate forever. That's more of a gut instinct at this point; knowing more about the Chinese economy would help.

#determine lambda
china_gdp_lambda <- china_gdp %>%
  features(GDP, features = guerrero) %>%
  pull(lambda_guerrero)


#create 4 models to compare
china_add_damp <-
    china_gdp %>%
    model(
        holt_linear = ETS(GDP ~ error('A') + trend('A') + season('N')),
        ets_boxcox = ETS(box_cox(GDP, china_gdp_lambda)),
        add_damp_0.8 = ETS(GDP ~ error('A') + trend('Ad', phi = 0.8) + season('N')),
        add_damp_0.87 = ETS(GDP ~ error('A') + trend('Ad', phi = 0.87) + season('N')),
        add_damp_0.95 = ETS(GDP ~ error('A') + trend('Ad', phi = 0.95) + season('N'))
    )

#forecast and plot all
china_add_damp %>% 
    forecast(h = 10) %>% 
    autoplot(china_gdp, level = NULL)
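
Rather than fixing phi by hand, ETS() will also estimate it by maximum likelihood when trend('Ad') is given without a phi argument. A small sketch to see what value it chooses here:

# let ETS() estimate phi itself, then pull the estimate out with tidy()
china_gdp %>%
  model(add_damp_auto = ETS(GDP ~ error('A') + trend('Ad') + season('N'))) %>%
  tidy() %>%
  filter(term == "phi")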

Exercise 8.7

Find an ETS model for the Gas data from aus_production and forecast the next few years. Why is multiplicative seasonality necessary here? Experiment with making the trend damped. Does it improve the forecasts?

Taking a look at the raw data below, we see what I've come to think of as 'varying variance'. There is clear seasonality here, but the lows and highs of that seasonality increase over the course of the series. This is precisely why multiplicative seasonality is necessary: it treats the seasonal effect as a proportion of the level rather than a constant.

aus_production %>%
  autoplot(Gas)

Below I've plotted a standard multiplicative Holt-Winters (A, A, M) forecast and a damped multiplicative Holt-Winters (A, A_d, M, phi = 0.85) forecast.

As expected, the standard multiplicative Holt-Winters forecast continues the seasonality at the magnitude observed most recently and incorporates the upward trend. The damped version is similar but begins to flatten the upward trend.

#create 2 models to compare
gas_models <-
    aus_production %>%
    model(
        mult_holtwinter = ETS(Gas ~ error('A') + trend('A') + season('M')),
        mult_holtwinter_damp_0.85 = ETS(Gas ~ error('A') + trend('Ad', phi = 0.85) + season('M'))
    )

#forecast 4 years and plot all
gas_models %>% 
    forecast(h = 16) %>% 
    autoplot(aus_production, level = NULL)

Checking the RMSE values on the models, the non-damped model performs better according to this measure of accuracy, though this is training accuracy; a fairer check via cross-validation is sketched after the output below.

gas_models %>%
  accuracy() %>%
  select(.model, RMSE)
## # A tibble: 2 x 2
##   .model                     RMSE
##   <chr>                     <dbl>
## 1 mult_holtwinter            4.19
## 2 mult_holtwinter_damp_0.85  4.29
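
Training RMSE can flatter whichever model hugs the training data more closely. As promised, a time series cross-validation sketch using stretch_tsibble(); it compares genuine out-of-sample accuracy, at the cost of refitting both models on every fold, so it is slow to run:

# time series cross-validation: fit on growing windows, forecast 4
# quarters ahead each time, and compare out-of-sample accuracy
aus_production %>%
  stretch_tsibble(.init = 100, .step = 4) %>%
  model(
    mult_holtwinter = ETS(Gas ~ error('A') + trend('A') + season('M')),
    mult_holtwinter_damp_0.85 = ETS(Gas ~ error('A') + trend('Ad', phi = 0.85) + season('M'))
  ) %>%
  forecast(h = 4) %>%
  accuracy(aus_production) %>%
  select(.model, RMSE)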

Exercise 8.8

Recall your retail time series data (from Exercise 8 in Section 2.10).

(a) Why is multiplicative seasonality necessary for this series?

Looking at the plot below, we see seasonality with very small highs and lows early on, but as time continues the highs and lows vary while increasing in magnitude on the whole. This means multiplicative seasonality will be necessary to capture those fluctuations.

set.seed(8675309)
myseries <- aus_retail %>%
  filter(`Series ID` == sample(aus_retail$`Series ID`,1))

myseries %>%
  autoplot(Turnover)

(b) Apply Holt-Winters’ multiplicative method to the data. Experiment with making the trend damped.

Below we see both models and, as expected, the damped version has the same shape but doesn't follow the trend quite as high.

#create 2 models to compare
myseries_models <-
    myseries %>%
    model(
        mult_holtwinter = ETS(Turnover ~ error('A') + trend('A') + season('M')),
        mult_holtwinter_damp_0.85 = ETS(Turnover ~ error('A') + trend('Ad', phi = 0.85) + season('M'))
    )

#forecast 3 years and plot all
myseries_models %>% 
    forecast(h = 36) %>% 
    autoplot(myseries, level = NULL)

(c) Compare the RMSE of the one-step forecasts from the two methods. Which do you prefer?

The RMSE values are close between the two models, but the damped method is slightly lower at 6.60, so that is the model to choose on this measure. It also seems more visually intuitive that the trend won't continue to increase forever at exactly the same rate.

myseries_models %>% 
    accuracy() %>% 
    select(.model, RMSE)
## # A tibble: 2 x 2
##   .model                     RMSE
##   <chr>                     <dbl>
## 1 mult_holtwinter            6.68
## 2 mult_holtwinter_damp_0.85  6.60

(d) Check that the residuals from the best method look like white noise.

The histogram looks fairly normal, but on the ACF plot I do see spikes outside the significance bounds with some regularity. There may be some autocorrelation that isn't captured by our model, but I don't believe it's too severe; a Ljung-Box test, sketched after the plot, puts a number on it.

myseries_models %>%
  select(mult_holtwinter_damp_0.85) %>%
  gg_tsresiduals()
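
To put a number on that visual impression, a Ljung-Box test on the innovation residuals offers a quick formal check; a small p-value would indicate autocorrelation the model has missed. A minimal sketch using lag = 24, i.e. two seasonal periods of monthly data:

# Ljung-Box test on the innovation residuals of the damped model;
# a small p-value suggests the residuals are not white noise
myseries_models %>%
  select(mult_holtwinter_damp_0.85) %>%
  augment() %>%
  features(.innov, ljung_box, lag = 24)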

(e) Now find the test set RMSE, while training the model to the end of 2010. Can you beat the seasonal naïve approach from Exercise 7 in Section 5.11?

Below I split the data, fit the seasonal naive model from my previous homework, forecast, and plot. We can see that the forecast misses the increase seen in the actual test data.

# create training set
myseries_train <- myseries %>%
  filter(year(Month) < 2011)

# model from past homework
fit_snaive <- myseries_train %>%
          model(SNAIVE(Turnover))

fc <- fit_snaive %>%
  forecast(new_data = anti_join(myseries, myseries_train))

fc %>% autoplot(myseries)

When we look at the RMSE it is a high 77.35, compared to the 6.60 achieved by my mult_holtwinter_damp_0.85 model. So in this case the more complex damped multiplicative Holt-Winters method appears worth that complexity, easily outperforming the simple seasonal naive model. Note, though, that the 6.60 is a training RMSE from the full series, so the comparison isn't strictly fair; see the sketch after the output below.

fc %>%
  accuracy(myseries)
## # A tibble: 1 x 12
##   .model    State Industry .type    ME  RMSE   MAE   MPE  MAPE  MASE RMSSE  ACF1
##   <chr>     <chr> <chr>    <chr> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
## 1 SNAIVE(T~ Quee~ Takeawa~ Test   71.6  77.4  71.9  23.4  23.5  6.23  5.16 0.861
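
For the like-for-like version of that comparison, the damped Holt-Winters model would be refit on the training set only and scored on the same test period. A sketch:

# refit the damped Holt-Winters model on the training data only and
# evaluate it on the same post-2010 test set as the SNAIVE benchmark
myseries_train %>%
  model(hw_damped = ETS(Turnover ~ error('A') + trend('Ad', phi = 0.85) + season('M'))) %>%
  forecast(new_data = anti_join(myseries, myseries_train)) %>%
  accuracy(myseries) %>%
  select(.model, .type, RMSE)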

Exercise 8.9

For the same retail data, try an STL decomposition applied to the Box-Cox transformed series, followed by ETS on the seasonally adjusted data. How does that compare with your best previous forecasts on the test set?

First, the STL decomposition after a Box-Cox transformation with lambda = 0.24.

lambda <- myseries_train %>%
  features(Turnover, features = guerrero) %>%
  pull(lambda_guerrero)

#apply transformation
myseries_train_boxcox <-
    myseries_train %>% 
    mutate(turnover_boxcox = box_cox(Turnover, lambda))

#STL decomposition
myseries_train_boxcox %>%
  model(
    STL(turnover_boxcox ~ trend(window = 7) +
                   season(window = "periodic"),
    robust = TRUE)) %>%
  components() %>%
  autoplot()

Below is the ETS forecast compared to the test data, with the Box-Cox transformation applied to the training data. It appears this model's predictions were a bit low around the middle of the forecast timespan.

#try ETS on seasonally adjusted data
myseries_train_boxcox_ets_fit <- myseries_train %>%
  model(ETS(box_cox(Turnover, lambda)))

myseries_train_boxcox_ets_fc <- myseries_train_boxcox_ets_fit %>%
  forecast(new_data = anti_join(myseries, myseries_train))

myseries_train_boxcox_ets_fc %>% autoplot(myseries)

We see an RMSE of 32.93 on this. It's better than the seasonal naive model but not as good as the multiplicative seasonality models, damped or not, from Exercise 8.8 (though those were trained on the full dataset, not just a training subset).

myseries_train_boxcox_ets_fc %>%
  accuracy(myseries)
## # A tibble: 1 x 12
##   .model    State Industry .type    ME  RMSE   MAE   MPE  MAPE  MASE RMSSE  ACF1
##   <chr>     <chr> <chr>    <chr> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
## 1 ETS(box_~ Quee~ Takeawa~ Test   24.7  32.9  26.6  8.09  8.78  2.31  2.20 0.836
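
One caveat on method: the ETS() call above is fit directly to the Box-Cox-transformed series, leaving ETS to model the seasonality itself, rather than being fit to the seasonally adjusted series from the STL decomposition. To follow the exercise wording literally, decomposition_model() from fabletools can chain the two steps: STL on the transformed data, ETS on its season_adjust column, and SNAIVE to forecast the seasonal component, which is added back automatically. A minimal sketch (the stl_ets name and the ETS spec are illustrative choices, not the only reasonable ones):

# STL decomposition of the Box-Cox-transformed series, with ETS applied
# to the seasonally adjusted component; the seasonal component is
# forecast by SNAIVE and recombined by decomposition_model()
dcmp_fit <- myseries_train %>%
  model(stl_ets = decomposition_model(
    STL(box_cox(Turnover, lambda)),
    ETS(season_adjust ~ error("A") + trend("Ad") + season("N"))
  ))

dcmp_fit %>%
  forecast(new_data = anti_join(myseries, myseries_train)) %>%
  accuracy(myseries) %>%
  select(.model, RMSE)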