Forecasting: Principles and
Practice (3rd ed)
Chapter 8 Exponential smoothing
Consider the the number of pigs slaughtered in Victoria, available in the aus_livestock dataset
Use the ETS() function to estimate the equivalent model for simple exponential smoothing.Find the optimal values of α and ℓ0, and generate forecasts for the next four months.
pig_victoria_fit = aus_livestock |>
filter(Animal == 'Pigs', State == 'Victoria') |>
model(ETS(Count ~ error("A") + trend("N") + season("N")))
report(pig_victoria_fit)
## Series: Count
## Model: ETS(A,N,N)
## Smoothing parameters:
## alpha = 0.3221247
##
## Initial states:
## l[0]
## 100646.6
##
## sigma^2: 87480760
##
## AIC AICc BIC
## 13737.10 13737.14 13750.07
pig_victoria_forecast = pig_victoria_fit |>
forecast(h = 4)
pig_victoria_forecast |>
autoplot(aus_livestock)
Compute a 95% prediction interval for the first forecast using ± 1.96s where s is the standard deviation of the residuals. Compare your interval with the interval produced by R.
# square root of sigma^2 (SSE) from the model report
pig_victoria_s = sqrt(87480760)
y_hat = pig_victoria_forecast |>
head(1) |>
pull(.mean)
upper = y_hat + 1.96 * pig_victoria_s
lower = y_hat - 1.96 * pig_victoria_s
knit_table(tibble(`95%_lower`=lower, `95%_upper`=upper), 'Computed 95% Prediction Interval')
| 95%_lower | 95%_upper |
|---|---|
| 76854.45 | 113518.7 |
r_ci = pig_victoria_forecast |>
hilo(level = 95) |>
unpack_hilo(`95%`) |>
as_tibble() |>
head(1) |>
select(`95%_lower`, `95%_upper`)
knit_table(r_ci, 'Produced by R 95% Prediction Interval')
| 95%_lower | 95%_upper |
|---|---|
| 76854.79 | 113518.3 |
The manually computed interval has a 0.74 wider range compared to the interval produced by R. Overall these intervals are practically identical.
Data set global_economy contains the annual Exports from many countries. Select one country to analyse.
Plot the Exports series and discuss the main features of the data.
france_economy = global_economy |>
filter(Country == "France")
france_economy |>
autoplot(Exports) +
labs(title='Annual France Exports')
The annual France exports time series as an overall increasing trend with small declines present in the mid 1980s, and early 2000s. The time series does not appear to have any seasonal or cyclical trends.
Use an ETS(A,N,N) model to forecast the series, and plot the forecasts.
france_economy_fit = france_economy |>
model(SES = ETS(Exports ~ error("A") + trend("N") + season("N")))
france_economy_fc = france_economy_fit |>
forecast(h = 10)
france_economy_fc |>
autoplot(france_economy) +
labs(title='Annual France Exports: SES Forecasts')
Compute the RMSE values for the training data.
knit_table(accuracy(france_economy_fit) |> select(RMSE))
| RMSE |
|---|
| 1.152003 |
Compare the results to those from an ETS(A,A,N) model. (Remember that the trended model is using one more parameter than the simpler model.) Discuss the merits of the two forecasting methods for this data set.
france_economy_trend_fit = france_economy |>
model(Holt = ETS(Exports ~ error("A") + trend("A") + season("N")))
france_economy_trend_fc = france_economy_trend_fit |>
forecast(h = 10)
france_economy_trend_fc |>
autoplot(france_economy) +
labs(title='Annual France Exports: Holt Forecasts')
rmse_results = tibble(
`SES ETS(AAN) RMSE` = accuracy(france_economy_fit) |> pull(RMSE),
`Holt ETS(ANN) RMSE` = accuracy(france_economy_trend_fit) |> pull(RMSE),
)
knit_table(rmse_results)
| SES ETS(AAN) RMSE | Holt ETS(ANN) RMSE |
|---|---|
| 1.152003 | 1.11996 |
# accuracy(france_economy_fit)
# accuracy(france_economy_trend_fit)
The RMSE and MAE values for the AAN model is lower by about 0.03. This means that the AAN model or the model that has the additive trend parameter to account for the increasing trend in France’s exports fits the data better than the ANN model. In addition, α being a high shows that the level changes rapidly in order to capture the highly trended series.
Compare the forecasts from both methods. Which do you think is best?
france_economy |>
model(
SES = ETS(Exports ~ error("A") + trend("N") + season("N")),
Holt = ETS(Exports ~ error("A") + trend("A") + season("N"))
) |>
forecast(h = 15) |>
autoplot(france_economy, level = NULL) +
labs(title = "Annual France Exports: Forecast Comparison") +
guides(colour = guide_legend(title = "Forecast"))
report(france_economy_trend_fit)
## Series: Exports
## Model: ETS(A,A,N)
## Smoothing parameters:
## alpha = 0.9475167
## beta = 0.0001000041
##
## Initial states:
## l[0] b[0]
## 13.37335 0.3120665
##
## sigma^2: 1.3472
##
## AIC AICc BIC
## 258.6477 259.8015 268.9499
report(france_economy_fit)
## Series: Exports
## Model: ETS(A,N,N)
## Smoothing parameters:
## alpha = 0.9998995
##
## Initial states:
## l[0]
## 14.3989
##
## sigma^2: 1.3745
##
## AIC AICc BIC
## 257.9200 258.3644 264.1013
Due to the AAN model (Holt’s linear trend method) is the better fit model due to capturing the increasing trend in the data and having a lower RMSE. The ANN ( simple exponential smoothing) does not capture the trend in the series which is supported by the α value being very close to 1. This means the forecasts are almost equal to naïve forecasts and the most weight is given to observations from the more distant past.
Calculate a 95% prediction interval for the first forecast for each model, using the RMSE values and assuming normal errors. Compare your intervals with those produced using R.
# manual
s_aan = france_economy_trend_fit |> accuracy() |> pull(RMSE)
s_ann = france_economy_fit |> accuracy() |> pull(RMSE)
y_hat_aan = france_economy_trend_fc |> head(1) |> pull(.mean)
y_hat_ann = france_economy_fc |> head(1) |> pull(.mean)
upper_aan = y_hat_aan + 1.96 * s_aan
lower_aan = y_hat_aan - 1.96 * s_aan
upper_ann = y_hat_ann + 1.96 * s_ann
lower_ann = y_hat_ann - 1.96 * s_ann
# r produced
r_ci_aan = france_economy_trend_fc |>
hilo(level = 95) |>
unpack_hilo(`95%`) |>
as_tibble() |>
head(1) |>
select(`95%_lower`, `95%_upper`)
r_ci_ann = france_economy_fc |>
hilo(level = 95) |>
unpack_hilo(`95%`) |>
as_tibble() |>
head(1) |>
select(`95%_lower`, `95%_upper`)
r_ci_table =tibble(
Model = c('ANN', 'ANN', 'AAN', 'AAN'),
Dervived = c('Manual', 'R', 'Manual', 'R'),
Upper = c(upper_ann, r_ci_ann$`95%_upper`, upper_aan, r_ci_aan$`95%_upper`),
Lower = c(lower_ann, r_ci_ann$`95%_lower`, lower_aan, r_ci_aan$`95%_lower`),
Range = c(upper_ann-lower_ann, r_ci_ann$`95%_upper`-r_ci_ann$`95%_lower`, upper_aan-lower_aan, r_ci_aan$`95%_upper`-r_ci_aan$`95%_lower`)
)
knit_table(r_ci_table, 'France Exports: Prediction Interval Comparison')
| Model | Dervived | Upper | Lower | Range |
|---|---|---|---|---|
| ANN | Manual | 33.13971 | 28.62385 | 4.515853 |
| ANN | R | 33.17963 | 28.58393 | 4.595701 |
| AAN | Manual | 33.36920 | 28.97896 | 4.390243 |
| AAN | R | 33.44901 | 28.89915 | 4.549856 |
Forecast the Chinese GDP from the global_economy data set using an ETS model. Experiment with the various options in the ETS() function to see how much the forecasts change with damped trend, or with a Box-Cox transformation. Try to develop an intuition of what each is doing to the forecasts.
[Hint: use a relatively large value of h when forecasting, so you can clearly see the differences between the various options when plotting the forecasts.]
china_gdp = global_economy |>
filter(Country == 'China')
china_gdp |>
autoplot(GDP) +
labs(title='Annual China GDP')
lambda <- china_gdp |>
features(GDP, features = guerrero) |>
pull(lambda_guerrero)
# fit = china_gdp |>
# model(ETS(GDP))
# report(fit)
china_gdp |>
model(`SES` = ETS(GDP ~ error("A") + trend("N") + season("N")),
`Holt` = ETS(GDP ~ error("A") + trend("A") + season("N")),
`Damped Holt` = ETS(GDP ~ error("A") + trend("Ad", phi = 0.8) + season("N")),
`Box-Cox` = ETS(box_cox(GDP,lambda) ~ error("A") + trend("Ad") + season("N")),
`Damped Box-Cox` = ETS(box_cox(GDP,lambda) ~ error("A") + trend("Ad", phi = 0.8) + season("N"))) |>
forecast(h = 15) |>
autoplot(china_gdp, level=NULL) +
labs(title = "Annual China GDP: Forecast Comparison") +
guides(colour = guide_legend(title = "Forecast"))
Find an ETS model for the Gas data from aus_production and forecast the next few years. Why is multiplicative seasonality necessary here? Experiment with making the trend damped. Does it improve the forecasts?
aus_production |>
autoplot(Gas) +
labs(title='Quarterly Australia Gas Production')
gas_fit = aus_production |>
model(ETS(Gas))
# report(gas_fit)
aus_gas_fit = aus_production |>
model(
"Multiplicative" = ETS(Gas ~ error('M') + trend('A') + season('M')),
"Damped Multiplicative" = ETS(Gas ~ error('M') + trend('Ad') + season('M'))
)
aus_gas_fit |>
forecast(h = 30) |>
autoplot(aus_production, level=NULL) +
guides(colour = guide_legend(title = "Forecast")) +
labs(title='Quarterly Australia Gas Production: Forecast Comparison')
accuracy(aus_gas_fit)
## # A tibble: 2 × 10
## .model .type ME RMSE MAE MPE MAPE MASE RMSSE ACF1
## <chr> <chr> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
## 1 Multiplicative Trai… -0.115 4.60 3.02 0.199 4.08 0.542 0.606 -0.0131
## 2 Damped Multiplicat… Trai… -0.00439 4.59 3.03 0.326 4.10 0.544 0.606 -0.0217
The multiplicative method is preferred in this instance due to the the seasonal variations changing proportional to the level of the series.
Recall your retail time series data (from Exercise 7 in Section 2.10).
Why is multiplicative seasonality necessary for this series?
set.seed(87654321)
myseries <- aus_retail |>
filter(`Series ID` == sample(aus_retail$`Series ID`,1))
autoplot(myseries, Turnover) +
autolayer(myseries)
## Plot variable not specified, automatically selected `.vars = Turnover`
Multiplicative seasonality is necessary for this series due to the the seasonal variations changing proportional to the level of the series.
Apply Holt-Winters’ multiplicative method to the data. Experiment with making the trend damped.
retail_fit = myseries |>
model(
"Multiplicative" = ETS(Turnover ~ error('M') + trend('A') + season('M')),
"Damped Multiplicative" = ETS(Turnover ~ error('M') + trend('Ad') + season('M'))
)
retail_fit |>
forecast(h = 30) |>
autoplot(myseries, level=NULL) +
guides(colour = guide_legend(title = "Forecast"))
Compare the RMSE of the one-step forecasts from the two methods. Which do you prefer?
accuracy(retail_fit)
## # A tibble: 2 × 12
## State Industry .model .type ME RMSE MAE MPE MAPE MASE RMSSE ACF1
## <chr> <chr> <chr> <chr> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
## 1 Victor… Footwea… Multi… Trai… 0.273 7.21 4.56 0.0263 4.40 0.537 0.615 0.144
## 2 Victor… Footwea… Dampe… Trai… 0.538 7.18 4.51 0.228 4.31 0.532 0.613 0.115
The Damped Multiplicative is the preferred method due to having a RMSE value (7.18) that is slightly lower RMSE value than the Multiplicative method (7.21)
Check that the residuals from the best method look like white noise.
best_fit = myseries |>
model("Damped Multiplicative" = ETS(Turnover ~ error('M') + trend('Ad') + season('M')))
best_fit |>
gg_tsresiduals() +
ggtitle("Multiplicative")
## Warning: `gg_tsresiduals()` was deprecated in feasts 0.4.2.
## ℹ Please use `ggtime::gg_tsresiduals()` instead.
## This warning is displayed once per session.
## Call `lifecycle::last_lifecycle_warnings()` to see where this warning was
## generated.
# Applying the Ljung-Box test
augment(best_fit) |> features(.innov, ljung_box, lag=10)
## # A tibble: 1 × 5
## State Industry .model lb_stat lb_pvalue
## <chr> <chr> <chr> <dbl> <dbl>
## 1 Victoria Footwear and other personal accessory retai… Dampe… 24.6 0.00612
# Applying the Box-Pierce test
augment(best_fit) |> features(.innov, box_pierce, lag = 10)
## # A tibble: 1 × 5
## State Industry .model bp_stat bp_pvalue
## <chr> <chr> <chr> <dbl> <dbl>
## 1 Victoria Footwear and other personal accessory retai… Dampe… 24.1 0.00727
The residuals do not appear to look like whote noise based on lag 10 and 12 in the autocorrelation plot exceeding the limit and the Ljung-Box and Box Pierce tests are distinguishable from white noise based on having a p value less than 0.05.
Now find the test set RMSE, while training the model to the end of 2010. Can you beat the seasonal naïve approach from Exercise 7 in Section 5.11?
myseries_train = myseries |>
filter(year(Month) < 2011)
myseries_train_fit = myseries_train |>
model(
"Damped Multiplicative" = ETS(Turnover ~ error('M') + trend('Ad') + season('M')),
'SNATIVE' = SNAIVE(Turnover)
)
myseries_train_forecast = myseries_train_fit |>
forecast(h = 30)
myseries_train_forecast |>
autoplot(myseries_train, level=NULL) +
guides(colour = guide_legend(title = "Forecast"))
accuracy(myseries_train_fit)
## # A tibble: 2 × 12
## State Industry .model .type ME RMSE MAE MPE MAPE MASE RMSSE ACF1
## <chr> <chr> <chr> <chr> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
## 1 Victor… Footwea… Dampe… Trai… 0.326 5.25 3.41 0.270 4.16 0.465 0.514 0.0767
## 2 Victor… Footwea… SNATI… Trai… 5.12 10.2 7.33 5.99 8.82 1 1 0.681
Yes, the Damped Multiplicative beats the seasonal naïve approach from Exercise 7 in Section 5.11 due to have a significantly lower RMSE value.
For the same retail data, try an STL decomposition applied to the Box-Cox transformed series, followed by ETS on the seasonally adjusted data. How does that compare with your best previous forecasts on the test set?
lambda_retail = myseries |>
features(Turnover, features = guerrero) |>
pull(lambda_guerrero)
# STL Box Cox transformed data
myseries |>
model(STL(box_cox(Turnover,lambda_retail) ~ season(window = "periodic"), robust = TRUE)) |>
components() |>
autoplot() +
ggtitle("STL with Box-Cox Transformation")
Decomp <- myseries |>
model(STL_box = STL(box_cox(Turnover,lambda_retail) ~ season(window = "periodic"), robust = TRUE)) |>
components()
# Seasonally Adjusted
myseries$Turnover_sa <- Decomp$season_adjust
myseries_train <- myseries |>
filter(year(Month) < 2011)
fit = myseries_train |>
model(ETS(Turnover_sa ~ error("M") + trend("Ad") + season("M")))
myseries_train_sa_forecast = fit |>
forecast(h = 30)
myseries_train_sa_forecast |>
autoplot(myseries_train) +
guides(colour = guide_legend(title = "Forecast"))
fit |> gg_tsresiduals() +
ggtitle("Australian Retail Turnover Residual")
# Applying the Ljung-Box test
augment(fit) |> features(.innov, ljung_box, lag=10)
## # A tibble: 1 × 5
## State Industry .model lb_stat lb_pvalue
## <chr> <chr> <chr> <dbl> <dbl>
## 1 Victoria Footwear and other personal accessory retai… "ETS(… 20.4 0.0260
# Applying the Box-Pierce test
augment(fit) |> features(.innov, box_pierce, lag = 10)
## # A tibble: 1 × 5
## State Industry .model bp_stat bp_pvalue
## <chr> <chr> <chr> <dbl> <dbl>
## 1 Victoria Footwear and other personal accessory retai… "ETS(… 19.9 0.0304
accuracy(fit)
## # A tibble: 1 × 12
## State Industry .model .type ME RMSE MAE MPE MAPE MASE RMSSE
## <chr> <chr> <chr> <chr> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
## 1 Victoria Footwear… "ETS(… Trai… 0.00347 0.0399 0.0310 0.0900 0.878 0.464 0.485
## # ℹ 1 more variable: ACF1 <dbl>
While this approach does appears to be a better fit to the data due to having a smaller RMSE value, it still does not appear the residuals are not completely white noise and that some structure in the data remains unexplained or there is some seasonal information left in the residuals which should be used in computing forecasts.