Exercises

Forecasting: Principles and Practice (3rd ed)
Chapter 8 Exponential smoothing

8.1

Consider the the number of pigs slaughtered in Victoria, available in the aus_livestock dataset

a

Use the ETS() function to estimate the equivalent model for simple exponential smoothing.Find the optimal values of α and ℓ0, and generate forecasts for the next four months.

pig_victoria_fit = aus_livestock |>
  filter(Animal == 'Pigs', State == 'Victoria') |>
  model(ETS(Count ~ error("A") + trend("N") + season("N")))
report(pig_victoria_fit)
## Series: Count 
## Model: ETS(A,N,N) 
##   Smoothing parameters:
##     alpha = 0.3221247 
## 
##   Initial states:
##      l[0]
##  100646.6
## 
##   sigma^2:  87480760
## 
##      AIC     AICc      BIC 
## 13737.10 13737.14 13750.07
pig_victoria_forecast = pig_victoria_fit |>
  forecast(h = 4)

pig_victoria_forecast |>
  autoplot(aus_livestock)

  • Optimal value for α: 0.3221247
  • Optimal value for ℓ0: 100646.6

b

Compute a 95% prediction interval for the first forecast using ± 1.96s where s is the standard deviation of the residuals. Compare your interval with the interval produced by R.

# square root of sigma^2 (SSE) from the model report
pig_victoria_s = sqrt(87480760)

y_hat = pig_victoria_forecast |>
  head(1) |>
  pull(.mean)

upper = y_hat + 1.96 * pig_victoria_s
lower = y_hat - 1.96 * pig_victoria_s

knit_table(tibble(`95%_lower`=lower, `95%_upper`=upper), 'Computed 95% Prediction Interval')
Computed 95% Prediction Interval
95%_lower 95%_upper
76854.45 113518.7
r_ci = pig_victoria_forecast |>
  hilo(level = 95) |>
  unpack_hilo(`95%`) |>
  as_tibble() |>
  head(1) |>
  select(`95%_lower`, `95%_upper`)

knit_table(r_ci, 'Produced by R 95% Prediction Interval')
Produced by R 95% Prediction Interval
95%_lower 95%_upper
76854.79 113518.3

The manually computed interval has a 0.74 wider range compared to the interval produced by R. Overall these intervals are practically identical.


8.5

Data set global_economy contains the annual Exports from many countries. Select one country to analyse.

a

Plot the Exports series and discuss the main features of the data.

france_economy = global_economy |>
  filter(Country == "France")

france_economy |>
  autoplot(Exports) +
  labs(title='Annual France Exports')

The annual France exports time series as an overall increasing trend with small declines present in the mid 1980s, and early 2000s. The time series does not appear to have any seasonal or cyclical trends.

b

Use an ETS(A,N,N) model to forecast the series, and plot the forecasts.

france_economy_fit = france_economy |>
  model(SES = ETS(Exports ~ error("A") + trend("N") + season("N")))

france_economy_fc = france_economy_fit |>
  forecast(h = 10)

france_economy_fc |>
  autoplot(france_economy) +
  labs(title='Annual France Exports: SES Forecasts')

c

Compute the RMSE values for the training data.

knit_table(accuracy(france_economy_fit) |> select(RMSE))
RMSE
1.152003

d

Compare the results to those from an ETS(A,A,N) model. (Remember that the trended model is using one more parameter than the simpler model.) Discuss the merits of the two forecasting methods for this data set.

france_economy_trend_fit = france_economy |>
  model(Holt = ETS(Exports ~ error("A") + trend("A") + season("N")))

france_economy_trend_fc = france_economy_trend_fit |>
  forecast(h = 10)

france_economy_trend_fc |>
  autoplot(france_economy) +
  labs(title='Annual France Exports: Holt Forecasts')

rmse_results = tibble(
  `SES ETS(AAN) RMSE` = accuracy(france_economy_fit) |> pull(RMSE),
  `Holt ETS(ANN) RMSE` = accuracy(france_economy_trend_fit) |> pull(RMSE),
  )
knit_table(rmse_results)
SES ETS(AAN) RMSE Holt ETS(ANN) RMSE
1.152003 1.11996
# accuracy(france_economy_fit)
# accuracy(france_economy_trend_fit)

The RMSE and MAE values for the AAN model is lower by about 0.03. This means that the AAN model or the model that has the additive trend parameter to account for the increasing trend in France’s exports fits the data better than the ANN model. In addition, α being a high shows that the level changes rapidly in order to capture the highly trended series.

e

Compare the forecasts from both methods. Which do you think is best?

france_economy |>
  model(
    SES = ETS(Exports ~ error("A") + trend("N") + season("N")),
    Holt = ETS(Exports ~ error("A") + trend("A") + season("N"))
    ) |>
  forecast(h = 15) |>
  autoplot(france_economy, level = NULL) +
  labs(title = "Annual France Exports: Forecast Comparison") +
  guides(colour = guide_legend(title = "Forecast"))

report(france_economy_trend_fit)
## Series: Exports 
## Model: ETS(A,A,N) 
##   Smoothing parameters:
##     alpha = 0.9475167 
##     beta  = 0.0001000041 
## 
##   Initial states:
##      l[0]      b[0]
##  13.37335 0.3120665
## 
##   sigma^2:  1.3472
## 
##      AIC     AICc      BIC 
## 258.6477 259.8015 268.9499
report(france_economy_fit)
## Series: Exports 
## Model: ETS(A,N,N) 
##   Smoothing parameters:
##     alpha = 0.9998995 
## 
##   Initial states:
##     l[0]
##  14.3989
## 
##   sigma^2:  1.3745
## 
##      AIC     AICc      BIC 
## 257.9200 258.3644 264.1013

Due to the AAN model (Holt’s linear trend method) is the better fit model due to capturing the increasing trend in the data and having a lower RMSE. The ANN ( simple exponential smoothing) does not capture the trend in the series which is supported by the α value being very close to 1. This means the forecasts are almost equal to naïve forecasts and the most weight is given to observations from the more distant past.

f

Calculate a 95% prediction interval for the first forecast for each model, using the RMSE values and assuming normal errors. Compare your intervals with those produced using R.

# manual
s_aan = france_economy_trend_fit |> accuracy() |> pull(RMSE)
s_ann = france_economy_fit |> accuracy() |> pull(RMSE)

y_hat_aan = france_economy_trend_fc |> head(1) |> pull(.mean)
y_hat_ann = france_economy_fc |> head(1) |> pull(.mean)

upper_aan = y_hat_aan + 1.96 * s_aan
lower_aan = y_hat_aan - 1.96 * s_aan

upper_ann = y_hat_ann + 1.96 * s_ann
lower_ann = y_hat_ann - 1.96 * s_ann

# r produced
r_ci_aan = france_economy_trend_fc |>
  hilo(level = 95) |>
  unpack_hilo(`95%`) |>
  as_tibble() |>
  head(1) |>
  select(`95%_lower`, `95%_upper`)

r_ci_ann = france_economy_fc |>
  hilo(level = 95) |>
  unpack_hilo(`95%`) |>
  as_tibble() |>
  head(1) |>
  select(`95%_lower`, `95%_upper`)

r_ci_table =tibble(
    Model = c('ANN', 'ANN', 'AAN', 'AAN'),
    Dervived = c('Manual', 'R', 'Manual', 'R'),
    Upper = c(upper_ann, r_ci_ann$`95%_upper`, upper_aan, r_ci_aan$`95%_upper`),
    Lower = c(lower_ann, r_ci_ann$`95%_lower`, lower_aan, r_ci_aan$`95%_lower`),
    Range = c(upper_ann-lower_ann, r_ci_ann$`95%_upper`-r_ci_ann$`95%_lower`, upper_aan-lower_aan, r_ci_aan$`95%_upper`-r_ci_aan$`95%_lower`)
    )

knit_table(r_ci_table, 'France Exports: Prediction Interval Comparison')
France Exports: Prediction Interval Comparison
Model Dervived Upper Lower Range
ANN Manual 33.13971 28.62385 4.515853
ANN R 33.17963 28.58393 4.595701
AAN Manual 33.36920 28.97896 4.390243
AAN R 33.44901 28.89915 4.549856
  • ANN: The interval values manually calculated vs produced by R have have minimal difference in range (about 0.9)
  • AAN: The interval values manually calculated vs produced by R have a greater range difference than the ANN prediction interval range difference at about 0.16. Overall, this is still only a slight variation between the two sources.

8.6

Forecast the Chinese GDP from the global_economy data set using an ETS model. Experiment with the various options in the ETS() function to see how much the forecasts change with damped trend, or with a Box-Cox transformation. Try to develop an intuition of what each is doing to the forecasts.

[Hint: use a relatively large value of h when forecasting, so you can clearly see the differences between the various options when plotting the forecasts.]

china_gdp = global_economy |>
  filter(Country == 'China')

china_gdp |>
  autoplot(GDP)  +
  labs(title='Annual China GDP')

lambda <- china_gdp |>
  features(GDP, features = guerrero) |>
  pull(lambda_guerrero)

# fit = china_gdp |>
#   model(ETS(GDP))
# report(fit)

china_gdp |>
  model(`SES` = ETS(GDP ~ error("A") + trend("N") + season("N")),
        `Holt` = ETS(GDP ~ error("A") + trend("A") + season("N")),
        `Damped Holt` = ETS(GDP ~ error("A") + trend("Ad", phi = 0.8) + season("N")),
        `Box-Cox` = ETS(box_cox(GDP,lambda) ~ error("A") + trend("Ad") + season("N")),
        `Damped Box-Cox` = ETS(box_cox(GDP,lambda) ~ error("A") + trend("Ad", phi = 0.8) + season("N"))) |>
  forecast(h = 15) |>
  autoplot(china_gdp, level=NULL) + 
  labs(title = "Annual China GDP: Forecast Comparison") +
  guides(colour = guide_legend(title = "Forecast"))


8.7

Find an ETS model for the Gas data from aus_production and forecast the next few years. Why is multiplicative seasonality necessary here? Experiment with making the trend damped. Does it improve the forecasts?

aus_production |>
  autoplot(Gas) +
  labs(title='Quarterly Australia Gas Production')

gas_fit = aus_production |>
  model(ETS(Gas))
# report(gas_fit)


aus_gas_fit = aus_production |>
    model(
    "Multiplicative" = ETS(Gas ~ error('M') + trend('A') + season('M')),
    "Damped Multiplicative" = ETS(Gas ~ error('M') + trend('Ad') + season('M'))
  )

aus_gas_fit |>
  forecast(h = 30) |>
  autoplot(aus_production, level=NULL) + 
  guides(colour = guide_legend(title = "Forecast")) +
  labs(title='Quarterly Australia Gas Production: Forecast Comparison')

accuracy(aus_gas_fit)
## # A tibble: 2 × 10
##   .model              .type       ME  RMSE   MAE   MPE  MAPE  MASE RMSSE    ACF1
##   <chr>               <chr>    <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>   <dbl>
## 1 Multiplicative      Trai… -0.115    4.60  3.02 0.199  4.08 0.542 0.606 -0.0131
## 2 Damped Multiplicat… Trai… -0.00439  4.59  3.03 0.326  4.10 0.544 0.606 -0.0217

The multiplicative method is preferred in this instance due to the the seasonal variations changing proportional to the level of the series.

8.8

Recall your retail time series data (from Exercise 7 in Section 2.10).

a

Why is multiplicative seasonality necessary for this series?

set.seed(87654321)
myseries <- aus_retail |>
  filter(`Series ID` == sample(aus_retail$`Series ID`,1))

autoplot(myseries, Turnover) +
  autolayer(myseries)
## Plot variable not specified, automatically selected `.vars = Turnover`

Multiplicative seasonality is necessary for this series due to the the seasonal variations changing proportional to the level of the series.

b

Apply Holt-Winters’ multiplicative method to the data. Experiment with making the trend damped.

retail_fit = myseries |>
  model(
    "Multiplicative" = ETS(Turnover ~ error('M') + trend('A') + season('M')),
    "Damped Multiplicative" = ETS(Turnover ~ error('M') + trend('Ad') + season('M'))
  ) 

retail_fit |>
  forecast(h = 30) |>
  autoplot(myseries, level=NULL) + 
  guides(colour = guide_legend(title = "Forecast"))

c

Compare the RMSE of the one-step forecasts from the two methods. Which do you prefer?

accuracy(retail_fit) 
## # A tibble: 2 × 12
##   State   Industry .model .type    ME  RMSE   MAE    MPE  MAPE  MASE RMSSE  ACF1
##   <chr>   <chr>    <chr>  <chr> <dbl> <dbl> <dbl>  <dbl> <dbl> <dbl> <dbl> <dbl>
## 1 Victor… Footwea… Multi… Trai… 0.273  7.21  4.56 0.0263  4.40 0.537 0.615 0.144
## 2 Victor… Footwea… Dampe… Trai… 0.538  7.18  4.51 0.228   4.31 0.532 0.613 0.115

The Damped Multiplicative is the preferred method due to having a RMSE value (7.18) that is slightly lower RMSE value than the Multiplicative method (7.21)

d

Check that the residuals from the best method look like white noise.

best_fit = myseries |>
  model("Damped Multiplicative" = ETS(Turnover ~ error('M') + trend('Ad') + season('M')))

best_fit |>
  gg_tsresiduals() +
  ggtitle("Multiplicative")
## Warning: `gg_tsresiduals()` was deprecated in feasts 0.4.2.
## ℹ Please use `ggtime::gg_tsresiduals()` instead.
## This warning is displayed once per session.
## Call `lifecycle::last_lifecycle_warnings()` to see where this warning was
## generated.

# Applying the Ljung-Box test
augment(best_fit) |> features(.innov, ljung_box, lag=10)
## # A tibble: 1 × 5
##   State    Industry                                     .model lb_stat lb_pvalue
##   <chr>    <chr>                                        <chr>    <dbl>     <dbl>
## 1 Victoria Footwear and other personal accessory retai… Dampe…    24.6   0.00612
# Applying the Box-Pierce test
augment(best_fit) |> features(.innov, box_pierce, lag = 10)
## # A tibble: 1 × 5
##   State    Industry                                     .model bp_stat bp_pvalue
##   <chr>    <chr>                                        <chr>    <dbl>     <dbl>
## 1 Victoria Footwear and other personal accessory retai… Dampe…    24.1   0.00727

The residuals do not appear to look like whote noise based on lag 10 and 12 in the autocorrelation plot exceeding the limit and the Ljung-Box and Box Pierce tests are distinguishable from white noise based on having a p value less than 0.05.

e

Now find the test set RMSE, while training the model to the end of 2010. Can you beat the seasonal naïve approach from Exercise 7 in Section 5.11?

myseries_train = myseries |>
  filter(year(Month) < 2011)

myseries_train_fit = myseries_train |>
  model(
  "Damped Multiplicative" = ETS(Turnover ~ error('M') + trend('Ad') + season('M')),
  'SNATIVE' = SNAIVE(Turnover)
)

myseries_train_forecast = myseries_train_fit |>
  forecast(h = 30)
myseries_train_forecast |>
  autoplot(myseries_train, level=NULL) + 
  guides(colour = guide_legend(title = "Forecast"))

accuracy(myseries_train_fit)
## # A tibble: 2 × 12
##   State   Industry .model .type    ME  RMSE   MAE   MPE  MAPE  MASE RMSSE   ACF1
##   <chr>   <chr>    <chr>  <chr> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>  <dbl>
## 1 Victor… Footwea… Dampe… Trai… 0.326  5.25  3.41 0.270  4.16 0.465 0.514 0.0767
## 2 Victor… Footwea… SNATI… Trai… 5.12  10.2   7.33 5.99   8.82 1     1     0.681

Yes, the Damped Multiplicative beats the seasonal naïve approach from Exercise 7 in Section 5.11 due to have a significantly lower RMSE value.

8.9

For the same retail data, try an STL decomposition applied to the Box-Cox transformed series, followed by ETS on the seasonally adjusted data. How does that compare with your best previous forecasts on the test set?

lambda_retail = myseries |>
  features(Turnover, features = guerrero) |>
  pull(lambda_guerrero)

# STL Box Cox transformed data
myseries |>
  model(STL(box_cox(Turnover,lambda_retail) ~ season(window = "periodic"), robust = TRUE)) |>
  components() |>
  autoplot() +
  ggtitle("STL with Box-Cox Transformation")

Decomp <- myseries |>
  model(STL_box = STL(box_cox(Turnover,lambda_retail) ~ season(window = "periodic"), robust = TRUE)) |>
  components()

# Seasonally Adjusted
myseries$Turnover_sa <- Decomp$season_adjust

myseries_train <- myseries |>
  filter(year(Month) < 2011)

fit = myseries_train |>
  model(ETS(Turnover_sa ~ error("M") + trend("Ad") + season("M")))


myseries_train_sa_forecast = fit |>
  forecast(h = 30)
myseries_train_sa_forecast |>
  autoplot(myseries_train) + 
  guides(colour = guide_legend(title = "Forecast"))

fit |> gg_tsresiduals()  +
  ggtitle("Australian Retail Turnover Residual")

# Applying the Ljung-Box test
augment(fit) |> features(.innov, ljung_box, lag=10)
## # A tibble: 1 × 5
##   State    Industry                                     .model lb_stat lb_pvalue
##   <chr>    <chr>                                        <chr>    <dbl>     <dbl>
## 1 Victoria Footwear and other personal accessory retai… "ETS(…    20.4    0.0260
# Applying the Box-Pierce test
augment(fit) |> features(.innov, box_pierce, lag = 10)
## # A tibble: 1 × 5
##   State    Industry                                     .model bp_stat bp_pvalue
##   <chr>    <chr>                                        <chr>    <dbl>     <dbl>
## 1 Victoria Footwear and other personal accessory retai… "ETS(…    19.9    0.0304
accuracy(fit)
## # A tibble: 1 × 12
##   State    Industry  .model .type      ME   RMSE    MAE    MPE  MAPE  MASE RMSSE
##   <chr>    <chr>     <chr>  <chr>   <dbl>  <dbl>  <dbl>  <dbl> <dbl> <dbl> <dbl>
## 1 Victoria Footwear… "ETS(… Trai… 0.00347 0.0399 0.0310 0.0900 0.878 0.464 0.485
## # ℹ 1 more variable: ACF1 <dbl>

While this approach does appears to be a better fit to the data due to having a smaller RMSE value, it still does not appear the residuals are not completely white noise and that some structure in the data remains unexplained or there is some seasonal information left in the residuals which should be used in computing forecasts.