Exercises 8.1, 8.5, 8.6, 8.7, 8.8 and 8.9 from Hyndman & Athanasopoulos, Forecasting: Principles and Practice (3rd ed.): https://otexts.com/fpp3/expsmooth-exercises.html
# fpp3 attaches everything used below: tsibble, fable, fabletools, feasts,
# dplyr, tidyr, lubridate and ggplot2
library(fpp3)
# Exercise 8.1: monthly number of pigs slaughtered in Victoria, Australia
pigs <- aus_livestock %>%
  filter(Animal == "Pigs", State == "Victoria")
(a) Use the ETS() function to estimate the equivalent model for simple exponential smoothing. Find the optimal values of \(α\) and \(ℓ_0\), and generate forecasts for the next four months.
#Simple exponential smoothing
fit <- pigs %>%
model(ETS(Count ~ error("A") + trend("N") + season("N")))
# Optimal alpha = 0.3221247, l[0] = 100,646.6
report(fit)
## Series: Count
## Model: ETS(A,N,N)
## Smoothing parameters:
## alpha = 0.3221247
##
## Initial states:
## l[0]
## 100646.6
##
## sigma^2: 87480760
##
## AIC AICc BIC
## 13737.10 13737.14 13750.07
#Forecast
fc <- fit %>%
forecast(h = "4 months")
fc %>%
autoplot(pigs) +
geom_line(aes(y = .fitted), col="red", data = augment(fit)) +
labs(title="Pigs Slaughtered in Victoria, Australia", subtitle = "4-Month Exponential Smoothing Forecast using ETS(A,N,N)")
Since there is no trend or seasonality in the data, simple exponential smoothing suffices here. The optimal value of \(α\) is 0.3221247 and \(ℓ_0\) is 100,646.6. Because \(α\) is closer to 0 than to 1, the weights decay slowly, so observations well before the most recent one still carry substantial weight. The chosen model is ETS(A,N,N): an additive error with no trend or seasonal component, so the point forecasts are flat across the four-month horizon.
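To make the weighting concrete, here is a minimal sketch (using the fitted \(α\) above) of the geometrically decaying weight \(α(1-α)^j\) that SES places on the observation \(j\) steps back:
# Weight on the observation j steps back is alpha * (1 - alpha)^j
alpha <- 0.3221247
weights <- alpha * (1 - alpha)^(0:9)
round(weights, 4)
# With alpha ~ 0.32, an observation from 6 months ago still receives
# roughly 3% of the total weight, rather than being discarded.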
(b) Compute a 95% prediction interval for the first forecast using \(\hat{y}\) ± 1.96\(s\) where \(s\) is the standard deviation of the residuals. Compare your interval with the interval produced by R.
# first forecast
first_fc <- (fit |> forecast(h = 1)) |> pull(.mean)
# standard deviation
s <- sd(augment(fit)$.resid)
# 95% prediction interval
upper <- first_fc + 1.96 * s
lower <- first_fc - 1.96 * s
cat("95% Prediction Interval (Manual): [", lower, "," , upper, "] \n")
## 95% Prediction Interval (Manual): [ 76871.01 , 113502.1 ]
# R interval
interval <- fc %>%
hilo(95) %>%
pull('95%') %>%
head(1)
# hilo intervals expose their bounds directly
lower_bound <- interval$lower
upper_bound <- interval$upper
# Print the 95% prediction interval
cat("95% Prediction Interval (R): [", lower_bound, ", ", upper_bound, "]")
## 95% Prediction Interval (R): [ 76854.79 , 113518.3 ]
The two intervals are nearly identical. The small difference arises because R builds its interval from the model's estimated variance \(\hat{σ}^2\) (the sigma^2 reported above), while the manual interval uses the sample standard deviation of the residuals.
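A minimal check of where that small discrepancy comes from, comparing the two scale estimates (using the fit object above):
# Model-based sigma used by R's interval vs. the residual sd used manually
sqrt(glance(fit)$sigma2)
sd(augment(fit)$.resid)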
# Exercise 8.5: annual exports (% of GDP) for a country of your choice
Mexico_exports <- global_economy %>%
  filter(Country == "Mexico")
head(Mexico_exports)
## # A tsibble: 6 x 9 [1Y]
## # Key: Country [1]
## Country Code Year GDP Growth CPI Imports Exports Population
## <fct> <fct> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
## 1 Mexico MEX 1960 13040000000 NA 0.0129 11.7 8.51 38174112
## 2 Mexico MEX 1961 14160000000 5.00 0.0131 10.6 8.41 39394126
## 3 Mexico MEX 1962 15200000000 4.66 0.0133 10.1 8.57 40649588
## 4 Mexico MEX 1963 16960000000 8.11 0.0133 9.95 8.32 41939880
## 5 Mexico MEX 1964 20080000000 11.9 0.0137 9.85 7.63 43264272
## 6 Mexico MEX 1965 21840000000 7.10 0.0141 9.52 7.63 44623043
(a) Plot the Exports series and discuss the main features of the data.
Mexico_exports %>%
autoplot(Exports) +
labs(y="% of GDP", title="Exports from Mexico")
I chose to analyze exports from Mexico, and I see that the overall trend has increased over time, with some random noise. As the data are annual, we will not see seasonality, but we can see cyclic behavior. Mexico’s exports hovered around 10% of GDP until 1980, rose through the 1980s, dipped for about three years around 1990, then climbed to roughly 30% by 2010 as Mexico’s economy grew.
(b) Use an ETS(A,N,N) model to forecast the series, and plot the forecasts.
#Simple exponential smoothing
fit_1 <- Mexico_exports %>%
model(ETS(Exports ~ error("A") + trend("N") + season("N")))
#Optimal values
report(fit_1)
## Series: Exports
## Model: ETS(A,N,N)
## Smoothing parameters:
## alpha = 0.9998997
##
## Initial states:
## l[0]
## 8.503378
##
## sigma^2: 4.8073
##
## AIC AICc BIC
## 330.5385 330.9829 336.7198
#Forecast
fc_1 <- fit_1 %>%
forecast(h = "10 years")
#Plot data + forecast and fitted values
fc_1 %>%
autoplot(Mexico_exports) +
geom_line(aes(y = .fitted), col="red", data = augment(fit_1)) +
labs(y="% of GDP", title="Exports from Mexico", subtitle = "10-Year Exponential Smoothing Forecast using ETS(A,N,N)")
(c) Compute the RMSE values for the training data.
accuracy(fit_1)
## # A tibble: 1 × 11
## Country .model .type ME RMSE MAE MPE MAPE MASE RMSSE ACF1
## <fct> <chr> <chr> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
## 1 Mexico "ETS(Exports ~ … Trai… 0.506 2.15 1.38 1.83 7.78 0.983 0.991 0.203
The Root Mean Square Error is 2.154425: the square root of the average squared deviation between the fitted and actual values, expressed in the same units as the data (percentage points of GDP). Note also that the estimated \(α ≈ 1\), so this ETS(A,N,N) fit behaves almost like a naive forecast, with each fitted value close to the previous observation.
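The reported RMSE can be verified by hand from the training residuals (a minimal sketch using the fit_1 object above):
# RMSE is the square root of the mean squared residual
sqrt(mean(augment(fit_1)$.resid^2))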
(d) Compare the results to those from an ETS(A,A,N) model. (Remember that the trended model is using one more parameter than the simpler model.) Discuss the merits of the two forecasting methods for this data set.
fit_2 <- Mexico_exports %>%
model(ETS(Exports ~ error("A") + trend("A") + season("N")))
#Forecast
fc_2 <- fit_2 %>%
forecast(h = "10 years")
#Plot data + forecast and fitted values
fc_2 %>%
autoplot(Mexico_exports) +
geom_line(aes(y = .fitted), col="red", data = augment(fit_2)) +
labs(y="% of GDP", title="Exports from Mexico", subtitle = "10-Year Exponential Smoothing Forecast using ETS(A,A,N)")
accuracy(fit_2)
## # A tibble: 1 × 11
## Country .model .type ME RMSE MAE MPE MAPE MASE RMSSE ACF1
## <fct> <chr> <chr> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
## 1 Mexico "ETS(Exports… Trai… -0.00816 2.09 1.41 -1.97 8.68 1.01 0.964 0.203
Side by side, the ETS(A,N,N) and ETS(A,A,N) forecasts look nearly indistinguishable. The differences are that the second model has an additive trend component, so its forecasts increase over the chosen horizon, and its prediction intervals shift upward with them. The RMSE is 2.154425 for ETS(A,N,N) and 2.093999 for ETS(A,A,N), so the trended model fits the training data slightly better, consistent with the overall rising trend. However, the Mean Absolute Error (1.375397 vs 1.407705) and the Mean Absolute Percentage Error (7.782879 vs 8.682648) both worsen slightly under the trended model, which suggests the trend is not strong or consistent, as is visible at a glance. The ETS(A,A,N) model improves training accuracy only marginally, and it does so at the cost of an extra smoothing parameter and state.
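Since the trended model spends extra parameters (\(β\) and the initial slope \(b_0\)), a size-penalized comparison via information criteria is fairer than raw training RMSE; a minimal sketch using glance():
# Compare the two models on AICc, which penalizes the extra parameters
Mexico_exports %>%
  model(ANN = ETS(Exports ~ error("A") + trend("N") + season("N")),
        AAN = ETS(Exports ~ error("A") + trend("A") + season("N"))) %>%
  glance() %>%
  select(.model, AIC, AICc, BIC)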
(e) Compare the forecasts from both methods. Which do you think is best?
# Plot forecasts
Mexico_exports %>%
model(ANN = ETS(Exports ~ error("A") + trend("N") + season("N")),
AAN = ETS(Exports ~ error("A") + trend("A") + season("N"))) %>%
forecast(h = "10 years") %>%
autoplot(Mexico_exports, level = NULL)
The ETS(A,A,N) method seems better since it carries the increasing trend forward, while the ETS(A,N,N) method projects exports (as a share of GDP) to stay flat. Given that the models have similar RMSE, MAE, MAPE, etc., I think it is better to choose the model that reflects the recent trend.
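Training-set accuracy tends to favor the more flexible model, so a more honest tie-breaker is time-series cross-validation on one-step forecasts; a minimal sketch using stretch_tsibble() (the .init value of 10 is an arbitrary choice):
# Refit both models on expanding windows and score one-step-ahead forecasts
Mexico_exports %>%
  stretch_tsibble(.init = 10) %>%
  model(ANN = ETS(Exports ~ error("A") + trend("N") + season("N")),
        AAN = ETS(Exports ~ error("A") + trend("A") + season("N"))) %>%
  forecast(h = 1) %>%
  accuracy(Mexico_exports)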
(f) Calculate a 95% prediction interval for the first forecast for each model, using the RMSE values and assuming normal errors. Compare your intervals with those produced using R.
################################################################################
##### Manual #####
# Forecast 1 ETS(A,N,N)
# first forecast
first_fc_1 <- fc_1$.mean[1]
# RMSE
residuals_1 <- fit_1 %>%
augment() %>%
mutate(residuals = .resid)
RMSE_1 <- sqrt(mean(residuals_1$residuals^2))
# Standard error
se_1 <- RMSE_1
# 95% prediction interval for first forecast
lower_bound_1m <- first_fc_1 - 1.96 * se_1
upper_bound_1m <- first_fc_1 + 1.96 * se_1
################################################################################
##### Manual #####
# Forecast 2 ETS(A,A,N)
# first forecast
first_fc_2 <- fc_2$.mean[1]
# RMSE
residuals_2 <- fit_2 %>%
augment() %>%
mutate(residuals = .resid)
RMSE_2 <- sqrt(mean(residuals_2$residuals^2))
# Standard error
se_2 <- RMSE_2
# 95% prediction interval for first forecast
lower_bound_2m <- first_fc_2 - 1.96 * se_2
upper_bound_2m <- first_fc_2 + 1.96 * se_2
################################################################################
##### R #####
# Forecast 1 ETS(A,N,N) #####
interval_1 <- fc_1 %>%
hilo(95) %>%
pull('95%') %>%
head(1)
# 95% prediction interval for first forecast
lower_bound_1r <- interval_1$lower
upper_bound_1r <- interval_1$upper
################################################################################
##### R #####
# Forecast 2 ETS(A,A,N)
interval_2 <- fc_2 %>%
hilo(95) %>%
pull('95%') %>%
head(1)
# 95% prediction interval for first forecast
lower_bound_2r <- interval_2$lower
upper_bound_2r <- interval_2$upper
################################################################################
# Results
cat("95% Prediction Interval for ETS(A,N,N) (Manual): [", lower_bound_1m, ",", upper_bound_1m, "]\n")
## 95% Prediction Interval for ETS(A,N,N) (Manual): [ 33.64367 , 42.08901 ]
cat("95% Prediction Interval for ETS(A,N,N) (R): [", lower_bound_1r, ", ", upper_bound_1r, "]\n")
## 95% Prediction Interval for ETS(A,N,N) (R): [ 33.569 , 42.16368 ]
cat("95% Prediction Interval for ETS(A,A,N) (Manual): [", lower_bound_2m, ",", upper_bound_2m, "]\n")
## 95% Prediction Interval for ETS(A,A,N) (Manual): [ 34.278 , 42.48647 ]
cat("95% Prediction Interval for ETS(A,A,N) (R): [", lower_bound_2r, ", ", upper_bound_2r, "]")
## 95% Prediction Interval for ETS(A,A,N) (R): [ 34.12878 , 42.63569 ]
For both models the manual and R intervals agree closely; as before, the small differences come from R using the model's estimated variance rather than the training RMSE.
Forecast the Chinese GDP from the global_economy data set using an ETS model. Experiment with the various options in the ETS() function to see how much the forecasts change with damped trend, or with a Box-Cox transformation. Try to develop an intuition of what each is doing to the forecasts. [Hint: use a relatively large value of h when forecasting, so you can clearly see the differences between the various options when plotting the forecasts.]
chinese_gdp <- global_economy %>%
filter(Country == 'China')
# Forecast
chinese_gdp %>%
model(ANN = ETS(GDP ~ error("A") + trend("N") + season("N")),
AAN = ETS(GDP ~ error("A") + trend("A") + season("N"))) %>%
forecast(h = 25) %>%
autoplot(chinese_gdp, level = NULL) +
labs(title="Chinese GDP", subtitle = "25-Year Exponential Smoothing Forecast")
# Damped forecast, from 0.8 to 0.98
chinese_gdp %>%
model('ϕ = 0.8' = ETS(GDP ~ error("A") + trend("Ad", phi = 0.8) + season("N")),
'ϕ = 0.85' = ETS(GDP ~ error("A") + trend("Ad", phi = 0.85) + season("N")),
'ϕ = 0.9' = ETS(GDP ~ error("A") + trend("Ad", phi = 0.9) + season("N")),
'ϕ = 0.98' = ETS(GDP ~ error("A") + trend("Ad", phi = 0.98) + season("N"))) %>%
forecast(h = 25) %>%
autoplot(chinese_gdp, level = NULL) +
labs(title="Chinese GDP", subtitle = "25-Year Exponential Smoothing Damped Forecast")
# Box-Cox
lambda <- global_economy %>%
filter(Country == "China") %>%
features(GDP, features = guerrero) %>%
pull(lambda_guerrero)
# Holt's linear trend and damped trend, with and without a Box-Cox transform
# (these are non-seasonal models, so "Holt's method" rather than Holt-Winters)
fit_chinese_gdp <- chinese_gdp %>%
  model("Holt's method" = ETS(GDP ~ error("A") + trend("A") + season("N")),
        "Damped Holt's method" = ETS(GDP ~ error("A") + trend("Ad", phi = 0.8) + season("N")),
        'Box-Cox' = ETS(box_cox(GDP, lambda) ~ error("A") + trend("A") + season("N")),
        'Damped Box-Cox' = ETS(box_cox(GDP, lambda) ~ error("A") + trend("Ad", phi = 0.8) + season("N")))
fc_chinese_gdp <- fit_chinese_gdp %>%
forecast(h = 15)
fc_chinese_gdp %>%
autoplot(global_economy, level = NULL) +
guides(colour = guide_legend(title = "Forecast")) +
labs(title="Chinese GDP", subtitle = "15-Year Forecast: Box-Cox & Holt's Trend Methods")
Since this dataset shows an increasing trend but no seasonality, we use additive errors and non-seasonal methods. The ETS(A,A,N) method follows the growth trend that began after 2000, while the ETS(A,N,N) method assumes no growth, so its forecasts stay at the 2017 level. Damping the trend lets the forecasts level off eventually, with the speed of levelling controlled by the \(ϕ\) coefficient. The Holt's-method forecasts look reasonable, while the Box-Cox variants grow implausibly fast once back-transformed, likely over-extrapolating the steep growth of the most recent observations.
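Rather than fixing \(ϕ\) by hand, ETS() can also estimate it from the data (a minimal sketch; by default fable constrains the estimated \(ϕ\) to roughly 0.8–0.98):
# Let ETS() choose the damping parameter and report the estimate
chinese_gdp %>%
  model(ETS(GDP ~ error("A") + trend("Ad") + season("N"))) %>%
  report()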
aus_production %>%
autoplot(Gas) +
labs(title="Australian Gas Production")
# Forecast
aus_production %>%
model(ETS(Gas ~ error("M") + trend("A") + season("M"))) %>%
forecast(h = 25) %>%
autoplot(aus_production, level = NULL) +
labs(title="Australian Gas Production", subtitle = "25-Year Forecast: Multiplicative Holt-Winters Method")
# Damped
fit_aus_production <- aus_production %>%
model('Holt Winters' = ETS(Gas ~ error("M") + trend("A") + season("M")),
'Damped Holt Winters: ϕ = 0.8' = ETS(Gas ~ error("M") + trend("Ad", phi = 0.8) + season("M")),
'Damped Holt Winters: ϕ = 0.9' = ETS(Gas ~ error("M") + trend("Ad", phi = 0.9) + season("M")))
fc_aus_production <- fit_aus_production %>%
forecast(h = 25)
fc_aus_production %>%
autoplot(aus_production, level = NULL) +
guides(colour = guide_legend(title = "Forecast")) +
labs(title="Australian Gas Production", subtitle = "25-Year Forecast: Multiplicative Holt-Winters Methods")
This data shows an overall increasing trend, along with seasonality that appears to be multiplicative, so Holt-Winters' method is appropriate here. Multiplicative seasonality is needed because the size of the seasonal fluctuations grows with the level, especially after 1980; the data also seem apt for a Box-Cox transformation to help stabilize the fluctuations that begin widening after 1970. Whether to damp the Holt-Winters model is a judgment call: if we assume the seasonal swings will keep growing at a multiplicative rate, damping is unnecessary, but since indefinite growth is rarely plausible, damping gives a conservative forecast in which the level keeps rising at a gradually slowing rate.
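The Box-Cox idea mentioned above can be sketched as follows, using the guerrero feature to choose \(λ\) as elsewhere in this document (after the transformation, additive seasonality is the natural choice):
# Pick lambda for Gas, then fit an additive model on the transformed scale;
# fable back-transforms the forecasts automatically
lambda_gas <- aus_production %>%
  features(Gas, features = guerrero) %>%
  pull(lambda_guerrero)
aus_production %>%
  model(ETS(box_cox(Gas, lambda_gas) ~ error("A") + trend("A") + season("A"))) %>%
  forecast(h = 25) %>%
  autoplot(aus_production, level = NULL) +
  labs(title = "Australian Gas Production",
       subtitle = "25-Year Forecast on the Box-Cox Transformed Scale")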
set.seed(22397)
myseries <- aus_retail %>%
filter(`Series ID` == sample(aus_retail$`Series ID`, 1))
autoplot(myseries, Turnover) +
labs(title = "Retail Turnover")
(a) Why is multiplicative seasonality necessary for this series?
This series requires multiplicative seasonality because the seasonal fluctuations grow with the level: at the start of the series they are small, and towards the end they are much wider.
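This can also be checked numerically: within the ETS family, AICc is comparable across error and seasonal types, so a quick sketch fits an additive and a multiplicative specification and compares them:
# Additive vs multiplicative seasonality, compared on AICc
myseries %>%
  model(additive = ETS(Turnover ~ error("A") + trend("A") + season("A")),
        multiplicative = ETS(Turnover ~ error("M") + trend("A") + season("M"))) %>%
  glance() %>%
  select(.model, AICc)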
(b) Apply Holt-Winters’ multiplicative method to the data. Experiment with making the trend damped.
# Forecast (the retail data are monthly, so a 10-year horizon is 120 months)
myseries %>%
  model(ETS(Turnover ~ error("M") + trend("A") + season("M"))) %>%
  forecast(h = "10 years") %>%
  autoplot(myseries, level = NULL) +
  labs(title="Retail Turnover", subtitle = "10-Year Forecast: Multiplicative Holt-Winters Method")
# Damped
fit_aus_retail <- myseries %>%
model('Holt Winters' = ETS(Turnover ~ error("M") + trend("A") + season("M")),
'Damped Holt Winters: ϕ = 0.8' = ETS(Turnover ~ error("M") + trend("Ad", phi = 0.8) + season("M")),
'Damped Holt Winters: ϕ = 0.9' = ETS(Turnover ~ error("M") + trend("Ad", phi = 0.9) + season("M")))
fc_aus_retail <- fit_aus_retail %>%
  forecast(h = "10 years")
fc_aus_retail %>%
autoplot(myseries, level = NULL) +
guides(colour = guide_legend(title = "Forecast")) +
labs(title="Retail Turnover", subtitle = "10-Year Forecast: Multiplicative Holt-Winters Methods")
(c) Compare the RMSE of the one-step forecasts from the two methods. Which do you prefer?
accuracy(fit_aus_retail)
## # A tibble: 3 × 12
## State Industry .model .type ME RMSE MAE MPE MAPE MASE RMSSE
## <chr> <chr> <chr> <chr> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
## 1 Western A… Newspap… Holt … Trai… -0.0918 2.54 1.75 -0.926 6.83 0.421 0.449
## 2 Western A… Newspap… Dampe… Trai… 0.0979 2.55 1.75 -0.0749 6.75 0.419 0.450
## 3 Western A… Newspap… Dampe… Trai… 0.0804 2.55 1.75 -0.163 6.75 0.419 0.450
## # ℹ 1 more variable: ACF1 <dbl>
The RMSE is lowest for the multiplicative Holt-Winters method and slightly larger for the two damped variants, but the values are so close that I would actually choose the damped method with \(ϕ = 0.8\), for its lower Mean Absolute Percentage Error and lower Mean Absolute Scaled Error (MASE).
(d) Check that the residuals from the best method look like white noise.
myseries %>%
model(ETS(Turnover ~ error("M") + trend("Ad", phi = 0.8) + season("M"))) %>%
gg_tsresiduals()
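A portmanteau test complements the residual plots: under white-noise residuals the Ljung-Box p-value should be large. A minimal sketch (lag = 24 covers two seasonal periods of monthly data):
# Ljung-Box test on the innovation residuals of the chosen model
myseries %>%
  model(ETS(Turnover ~ error("M") + trend("Ad", phi = 0.8) + season("M"))) %>%
  augment() %>%
  features(.innov, ljung_box, lag = 24)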
(e) Now find the test set RMSE, while training the model to the end of 2010. Can you beat the seasonal naive approach from Exercise 7 in Section 5.11?
#### Find test set RMSE ####
# Trained set
myseries_train <- myseries %>%
filter(year(Month) < 2011)
# Forecast the test period with the damped model from (d),
# this time letting ETS() estimate phi
myseries_test_forecast <- myseries_train |>
  model(ETS(Turnover ~ error("M") + trend("Ad") + season("M"))) |>
  forecast(new_data = anti_join(myseries, myseries_train))
# RMSE = 8.116245
myseries_test_forecast %>%
accuracy(myseries)
## # A tibble: 1 × 12
## .model State Industry .type ME RMSE MAE MPE MAPE MASE RMSSE ACF1
## <chr> <chr> <chr> <chr> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
## 1 "ETS(Tur… West… Newspap… Test 4.91 8.12 6.53 10.6 17.3 1.59 1.43 0.829
#### Snaive ####
# Trained set
myseries_train <- myseries %>%
filter(year(Month) < 2011)
# Fit a seasonal naive model
fit_snaive <- myseries_train %>%
model(SNAIVE(Turnover))
# Produce forecasts for the test data
fc_snaive <- fit_snaive %>%
forecast(new_data = anti_join(myseries, myseries_train))
# RMSE = 9.108427
fc_snaive %>%
accuracy(myseries)
## # A tibble: 1 × 12
## .model State Industry .type ME RMSE MAE MPE MAPE MASE RMSSE ACF1
## <chr> <chr> <chr> <chr> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
## 1 SNAIVE(T… West… Newspap… Test 3.04 9.11 7.34 6.06 19.9 1.79 1.61 0.513
The test-set RMSE for my series is 8.116245 when training on data through 2010. The seasonal naive method gives a test-set RMSE of 9.108427, which is worse than the multiplicative method. Overall, the damped Holt-Winters model beats the seasonal naive benchmark on this multiplicative seasonal series.
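Both models can also be fitted and scored in a single pipeline, which keeps the comparison in one table (a sketch equivalent to the two chunks above):
# Fit ETS and SNAIVE together, forecast the test period, and compare
myseries_train %>%
  model(ETS = ETS(Turnover ~ error("M") + trend("Ad") + season("M")),
        SNAIVE = SNAIVE(Turnover)) %>%
  forecast(new_data = anti_join(myseries, myseries_train)) %>%
  accuracy(myseries)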
# Exercise 8.9: STL decomposition of the Box-Cox transformed series,
# then ETS on the seasonally adjusted data. First, get the optimal lambda.
lambda <- myseries %>%
features(Turnover, features = guerrero) %>%
pull(lambda_guerrero)
# Box Cox transform
myseries_boxcox <- myseries %>%
mutate(box_cox = box_cox(Turnover, lambda))
# STL decomposition
stl_decomp <- myseries_boxcox %>%
model(STL(box_cox ~ season(window = "periodic"), robust = TRUE)) %>%
components()
# Extract the seasonally adjusted component from the STL decomposition
myseries <- myseries %>%
  mutate(Turnover_sa = stl_decomp$season_adjust)
# Filter training and test datasets
myseries_train <- myseries %>%
filter(year(Month) < 2011)
myseries_test <- myseries %>%
filter(year(Month) >= 2011)
# Fit an ETS model to the seasonally adjusted training set.
# (Turnover_sa is already seasonally adjusted, so season("N") would
# arguably be the more natural specification.)
fit <- myseries_train %>%
  model(ETS(Turnover_sa ~ error("M") + trend("A") + season("M")))
# Residuals, ACF plot
fit %>% gg_tsresiduals() +
labs(title="Retail Turnover", subtitle = "Seasonal STL Decomposition after a Box-Cox Transform")
# Forecast on test data
fc <- fit %>%
forecast(new_data = anti_join(myseries, myseries_train))
# RMSE = 0.1693932, on the Box-Cox scale. MASE and RMSSE come out NaN
# because accuracy() is given only the test set here and needs the
# training series to compute scaled errors.
fc %>%
  accuracy(myseries_test)
## # A tibble: 1 × 12
## .model State Industry .type ME RMSE MAE MPE MAPE MASE RMSSE ACF1
## <chr> <chr> <chr> <chr> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
## 1 "ETS(Tu… West… Newspap… Test -0.125 0.169 0.145 -3.95 4.54 NaN NaN 0.672
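This RMSE of 0.169 is on the Box-Cox transformed, seasonally adjusted scale, so it is not directly comparable with the test RMSE of 8.12 from the previous exercise. A sketch of a fairer comparison re-seasonalizes the point forecasts and inverts the transform before scoring (note the STL components were estimated on the full series, so this comparison is slightly optimistic):
# Add back the STL seasonal component for the test months, invert the
# Box-Cox transform, and compute RMSE on the original Turnover scale
seasonal_test <- stl_decomp %>%
  filter(year(Month) >= 2011) %>%
  pull(season_year)
fc_original <- inv_box_cox(fc$.mean + seasonal_test, lambda)
sqrt(mean((myseries_test$Turnover - fc_original)^2))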