We use the ETS() function to specify a Simple Exponential Smoothing (SES) model. Setting error("A"), trend("N"), and season("N") tells fable to fit a model with additive errors, no trend, and no seasonality, which is exactly the SES model.
library(fpp3)
## ── Attaching packages ──────────────────────────────────────────── fpp3 1.0.2 ──
## ✔ tibble 3.3.0 ✔ tsibble 1.1.6
## ✔ dplyr 1.1.4 ✔ tsibbledata 0.4.1
## ✔ tidyr 1.3.1 ✔ feasts 0.4.2
## ✔ lubridate 1.9.4 ✔ fable 0.5.0
## ✔ ggplot2 4.0.2
# 1. Filter the data for Pigs in Victoria
pigs_vic <- aus_livestock %>%
  filter(Animal == "Pigs", State == "Victoria")
# 2. Estimate the SES model
fit <- pigs_vic %>%
  model(SES = ETS(Count ~ error("A") + trend("N") + season("N")))
# 3. Extract optimal parameters
report(fit)
## Series: Count
## Model: ETS(A,N,N)
## Smoothing parameters:
## alpha = 0.3221247
##
## Initial states:
## l[0]
## 100646.6
##
## sigma^2: 87480760
##
## AIC AICc BIC
## 13737.10 13737.14 13750.07
# 4. Generate forecasts for the next 4 months
fc <- fit %>%
  forecast(h = 4)
# Print the forecasts (View() only works in an interactive session)
fc
- Alpha (\(\alpha = 0.322\)): The smoothing parameter for the level. At about 0.3 the model is balanced: each new observation receives roughly a third of the weight, so the level responds to recent data without overreacting to every small monthly fluctuation in the pig count.
- Initial State (\(\ell_0 = 100{,}646.6\)): The estimated starting level of the series at the beginning of the dataset.
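The roles of \(\alpha\) and \(\ell_0\) can be checked by running the SES recursion \(\ell_t = \alpha y_t + (1 - \alpha)\ell_{t-1}\) by hand. The sketch below hard-codes the two estimates from the report() output; the names `alpha`, `l0`, and `level` are illustrative, and because the estimates are rounded the final level should agree with the model's flat point forecast only approximately.

```r
library(fpp3)

# Same series as above: monthly pig slaughter counts in Victoria
pigs_vic <- aus_livestock %>%
  filter(Animal == "Pigs", State == "Victoria")

# Estimates copied from the report() output (rounded, so results are approximate)
alpha <- 0.3221247
l0 <- 100646.6

# SES recursion: each new level is a weighted average of the latest
# observation and the previous level
level <- l0
for (y in pigs_vic$Count) {
  level <- alpha * y + (1 - alpha) * level
}

# Final level l_T, which SES uses as the point forecast for every horizon
level
```

Comparing `level` with the `.mean` column of the forecasts is a quick sanity check that the fitted model is doing nothing more than this weighted averaging.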
Here we compare a manual approximation of the 95% prediction interval, \(\hat{y} \pm 1.96\hat{\sigma}\), against the interval computed internally by the fable package.
# Get the standard deviation of residuals (sigma) from the estimated sigma^2;
# glance() reports sigma2, whereas tidy() only returns alpha and l[0]
s <- glance(fit) %>%
  pull(sigma2) %>%
  sqrt()
# Calculate the manual 95% interval for the first forecast
y_hat <- fc$.mean[1]
lower_manual <- y_hat - 1.96 * s
upper_manual <- y_hat + 1.96 * s
# Display results
cat("Manual Interval: [", lower_manual, ",", upper_manual, "]\n")
print(fc[1,])
## # A fable: 1 x 6 [1M]
## # Key: Animal, State, .model [1]
## Animal State .model Month
## <fct> <fct> <chr> <mth>
## 1 Pigs Victoria SES 2019 Jan
## # ℹ 2 more variables: Count <dist>, .mean <dbl>
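The printed fable hides the forecast distribution, so the interval itself is not visible in the output above. As a sketch of one way to expose it, the block below refits the same model, uses hilo() to show 95% intervals, and pulls the bounds of the first forecast from the distribution with quantile(). The names `lower_fable` and `upper_fable` are illustrative.

```r
library(fpp3)

# Refit the SES model from above so this block runs on its own
fit <- aus_livestock %>%
  filter(Animal == "Pigs", State == "Victoria") %>%
  model(SES = ETS(Count ~ error("A") + trend("N") + season("N")))

fc <- fit %>% forecast(h = 4)

# hilo() adds a "95%" column holding the interval for each horizon
fc %>% hilo(level = 95)

# Bounds of the first forecast, taken from the forecast distribution
lower_fable <- quantile(fc$Count[1], 0.025)
upper_fable <- quantile(fc$Count[1], 0.975)
c(lower = lower_fable, upper = upper_fable)
```

These bounds can be set beside the manual interval; small differences are expected because the manual version rounds the critical value to 1.96.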
# Filter data for Canada
can_exports <- global_economy %>% filter(Country == "Canada")
# Plot the series
can_exports %>% autoplot(Exports) +
labs(title = "Annual Exports: Canada", y = "% of GDP")
Discussion: The series shows significant volatility and a clear upward trend from the 1960s until roughly 2000, followed by a decline and stabilization. This suggests that a model capable of capturing trend (like ETS(A,A,N)) might outperform simple smoothing.
fit_nn <- can_exports %>% model(ETS(Exports ~ error("A") + trend("N") + season("N")))
fc_nn <- fit_nn %>% forecast(h = 5)
fc_nn %>% autoplot(can_exports) + labs(title = "ETS(A,N,N) Forecast")
Analysis: SES (A,N,N) produces a flat forecast line: every future value is set to the last estimated level, so the long-term historical trend is ignored.
accuracy(fit_nn) %>% select(RMSE)
The ETS(A,A,N) model (Holt’s linear method) adds a slope component. If its training RMSE is lower than the 1.617713 obtained for ETS(A,N,N), the trend component has improved the in-sample fit to the Canadian export data. However, because the series changes direction after about 2000 (a structural break), a single linear trend may still forecast the post-2000 period poorly.
fit_an <- can_exports %>% model(ETS(Exports ~ error("A") + trend("A") + season("N")))
accuracy(fit_an) %>% select(RMSE)
Analysis: ETS(A,N,N) vs. ETS(A,A,N)
Performance Comparison: The ETS(A,N,N) model achieved a training RMSE of 1.6177, while the ETS(A,A,N) model resulted in 1.6256.
Model Selection: Adding a trend component increased the training RMSE slightly, so the extra trend parameters did not improve the in-sample fit for this series.
Conclusion: Per the principle of parsimony, the simpler ETS(A,N,N) model is preferred, as the extra complexity of the trended model provided no benefit in error reduction.
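The parsimony argument can also be checked with AICc, which penalizes the additional parameters of the trended model. The following is a minimal sketch that fits both models under the illustrative names `ANN` and `AAN` and places training RMSE and AICc side by side; no particular result is assumed.

```r
library(fpp3)

can_exports <- global_economy %>% filter(Country == "Canada")

# Fit both candidates in one mable
fit_cmp <- can_exports %>%
  model(
    ANN = ETS(Exports ~ error("A") + trend("N") + season("N")),
    AAN = ETS(Exports ~ error("A") + trend("A") + season("N"))
  )

# Training-set RMSE (no penalty for extra parameters)
rmse_tbl <- accuracy(fit_cmp) %>% select(.model, RMSE)

# AICc (penalizes the extra trend parameters)
aicc_tbl <- glance(fit_cmp) %>% select(.model, AICc)

cmp <- left_join(rmse_tbl, aicc_tbl, by = ".model")
cmp
```

A lower AICc for `ANN` would support the same conclusion as the RMSE comparison.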
# Plot both forecasts
fit_both <- can_exports %>%
  model(
    ANN = ETS(Exports ~ error("A") + trend("N") + season("N")),
    AAN = ETS(Exports ~ error("A") + trend("A") + season("N"))
  )
fit_both %>% forecast(h = 5) %>% autoplot(can_exports)
Method Selection: The ANN model is the preferred choice for this dataset. The AAN model’s trend projection is misleading because it incorrectly assumes the recent historical decline will continue linearly into the future.
Conclusion: Given that the AAN model had a higher RMSE (1.6256 vs 1.6177) and fails to capture the stabilization observed in the series, the simpler ANN model is superior for these data.
# Extract RMSE
rmse_nn <- accuracy(fit_nn)$RMSE
y_hat <- fc_nn$.mean[1]
# Manual 95% Interval (using RMSE as a proxy for sigma)
lower_manual <- y_hat - 1.96 * rmse_nn
upper_manual <- y_hat + 1.96 * rmse_nn
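Using the RMSE as a proxy for \(\hat{\sigma}\) makes the manual interval slightly narrower than one based on the residual standard deviation. As a sketch, write \(e_t\) for the training residuals, \(T\) for the number of observations, and \(k\) for the number of estimated parameters, and assume (this is an assumption about how fable scales \(\hat{\sigma}^2\), not something shown in the output above) that the residual variance is divided by \(T - k\) rather than \(T\):

\[
\text{RMSE} = \sqrt{\frac{1}{T}\sum_{t=1}^{T} e_t^2},
\qquad
\hat{\sigma} = \sqrt{\frac{1}{T-k}\sum_{t=1}^{T} e_t^2}
\quad\Longrightarrow\quad
\text{RMSE} = \hat{\sigma}\sqrt{\frac{T-k}{T}} \le \hat{\sigma}.
\]

Under that assumption the two intervals share the same center \(\hat{y}\) and differ only by the factor \(\sqrt{(T-k)/T}\) in their half-widths, which is close to 1 when \(T\) is much larger than \(k\).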
[Hint: use a relatively large value of h when forecasting, so you can clearly see the differences between the various options when plotting the forecasts.]
# 1. Filter data for China
china_gdp <- global_economy %>% filter(Country == "China")
# 2. Define models
fit_china <- china_gdp %>%
  model(
    Standard = ETS(GDP ~ error("A") + trend("A") + season("N")),
    Damped = ETS(GDP ~ error("A") + trend("Ad") + season("N")),
    BoxCox = ETS(box_cox(GDP, 0.3) ~ error("A") + trend("A") + season("N"))
  )
# 3. Forecast with a large horizon (h = 20)
fc_china <- fit_china %>% forecast(h = 20)
# 4. Plot
fc_china %>% autoplot(china_gdp) + labs(title = "China GDP: ETS Model Comparison")
Analysis: China GDP Forecast Comparison
Standard Trend (A,A,N): Extrapolates the last estimated slope linearly, so GDP is forecast to rise by the same absolute amount every year. This is often too aggressive for long horizons because it assumes the slope never changes.
Damped Trend (A,Ad,N): Acts as a "stabilizer." The damping parameter \(\phi\) shrinks the slope at each step, so the forecast flattens over time and gives a more conservative long-horizon projection than the undamped trend.
Box-Cox Transformation: With \(\lambda = 0.3\), the model is fitted on a transformed scale where the variance is more stable. After back-transformation the forecasts curve upward and the prediction intervals widen as the level grows, which suits the roughly exponential growth of the series.
Intuition: Use damped trends to avoid over-optimistic projections, and Box-Cox transformations when the volatility of the series changes with its level.
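The value \(\lambda = 0.3\) above is a manual choice. One hedged alternative is to let the Guerrero feature from feasts suggest a value and pass it to box_cox(); the sketch below assumes this is an acceptable substitute for the hand-picked value, and the names `lambda` and `fit_bc` are illustrative.

```r
library(fpp3)

china_gdp <- global_economy %>% filter(Country == "China")

# Guerrero's method suggests a Box-Cox lambda that stabilizes the variance
lambda <- china_gdp %>%
  features(GDP, features = guerrero) %>%
  pull(lambda_guerrero)
lambda

# Refit the transformed model with the suggested lambda
fit_bc <- china_gdp %>%
  model(BoxCox = ETS(box_cox(GDP, lambda) ~ error("A") + trend("A") + season("N")))

# Forecasts are back-transformed to the original GDP scale automatically
fit_bc %>%
  forecast(h = 20) %>%
  autoplot(china_gdp)
```

Plotting this next to the \(\lambda = 0.3\) version shows how sensitive the long-horizon forecast is to the choice of transformation.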
# Filter data for Gas
gas_data <- aus_production %>% filter(!is.na(Gas))
# Fit models: multiplicative seasonality, with and without a damped trend
fit_gas <- gas_data %>%
  model(
    Multiplicative = ETS(Gas ~ error("M") + trend("A") + season("M")),
    Damped = ETS(Gas ~ error("M") + trend("Ad") + season("M"))
  )
# Forecast
fc_gas <- fit_gas %>% forecast(h = "5 years")
# Plot
fc_gas %>% autoplot(gas_data) + labs(title = "Gas Production Forecast")
Analysis: Gas Production Forecast
Seasonality: Multiplicative seasonality (season("M")) is necessary because the magnitude of the seasonal fluctuations in gas production increases in proportion to the overall production level as it grows over time.
Damped Trend Experiment: Applying a damped trend (trend("Ad")) moderates the long-term growth projection. A standard trend extrapolates the current slope indefinitely, whereas the damped model shrinks the slope toward zero so that the forecast level eventually plateaus.
Conclusion: Whether the damped trend improves the forecast depends on your expectations for the industry; if you anticipate that the recent historical growth will eventually slow down, the damped model provides a more realistic and conservative forecast compared to the linear trend.
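The choice between these specifications can also be examined numerically. The sketch below adds an additive-seasonality model (an extra candidate introduced here purely for comparison, under the illustrative name `Additive`) and tabulates AICc and training RMSE for all three; no particular ranking is assumed.

```r
library(fpp3)

gas_data <- aus_production %>% filter(!is.na(Gas))

fit_season <- gas_data %>%
  model(
    Additive = ETS(Gas ~ error("A") + trend("A") + season("A")),
    Multiplicative = ETS(Gas ~ error("M") + trend("A") + season("M")),
    Damped = ETS(Gas ~ error("M") + trend("Ad") + season("M"))
  )

# Lower AICc indicates a better trade-off between fit and complexity
aicc_tbl <- glance(fit_season) %>%
  select(.model, AICc) %>%
  arrange(AICc)
aicc_tbl

# Training-set RMSE for the same three models
accuracy(fit_season) %>% select(.model, RMSE)
```

If the seasonal swings really do scale with the level, the two multiplicative models would be expected to rank ahead of the additive one.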
Why is multiplicative seasonality necessary for this series? Apply Holt-Winters’ multiplicative method to the data. Experiment with making the trend damped.
Multiplicative seasonality is necessary because the seasonal peaks (Christmas/holiday spikes) grow in proportion to the level of turnover. An additive model would hold the size of the seasonal swings constant, which does not match a series whose swings widen as sales grow.
library(fpp3)
# Filter the series
retail_data <- aus_retail %>%
  filter(State == "Victoria", Industry == "Clothing retailing")
# Fit models: Holt-Winters vs. Damped
fit <- retail_data %>%
  model(
    HW = ETS(Turnover ~ error("M") + trend("A") + season("M")),
    Damped = ETS(Turnover ~ error("M") + trend("Ad") + season("M"))
  )
Compare the RMSE of the one-step forecasts from the two methods. Which do you prefer? Check that the residuals from the best method look like white noise.
# Compare RMSE
accuracy(fit) %>% select(.model, RMSE)
# Residual check for the best model
fit %>% select(HW) %>% gg_tsresiduals()
## Warning: `gg_tsresiduals()` was deprecated in feasts 0.4.2.
## ℹ Please use `ggtime::gg_tsresiduals()` instead.
## This warning is displayed once every 8 hours.
## Call `lifecycle::last_lifecycle_warnings()` to see where this warning was
## generated.
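The residual plots give a visual check only. A Ljung-Box test on the innovation residuals is one way to make the white-noise check formal; the sketch below refits the Holt-Winters model on its own, and the lag of 24 (two seasonal periods for monthly data) is an assumed choice rather than something specified in the exercise.

```r
library(fpp3)

retail_data <- aus_retail %>%
  filter(State == "Victoria", Industry == "Clothing retailing")

fit_hw <- retail_data %>%
  model(HW = ETS(Turnover ~ error("M") + trend("A") + season("M")))

# Ljung-Box test on the innovation residuals;
# a small p-value suggests the residuals are not white noise
lb <- augment(fit_hw) %>%
  features(.innov, ljung_box, lag = 24)
lb
```

A large `lb_pvalue` would be consistent with white-noise residuals, while a small one indicates remaining autocorrelation.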
Now find the test set RMSE, while training the model to the end of 2010. Can you beat the seasonal naïve approach from Exercise 7 in Section 5.11?
# Train to 2010
train <- retail_data %>% filter(Month <= yearmonth("2010 Dec"))
# Fit models and Seasonal Naïve
fit_test <- train %>%
  model(
    HW = ETS(Turnover ~ error("M") + trend("A") + season("M")),
    SNAIVE = SNAIVE(Turnover)
  )
# Forecast the first 5 years after the training period (2011-2015)
# and compare to the actual data
fc <- fit_test %>% forecast(h = "5 years")
fc %>% accuracy(retail_data)
# Filter your retail series
retail_data <- aus_retail %>%
filter(State == "Victoria", Industry == "Clothing retailing")
# Apply STL decomposition + ETS on seasonally adjusted data
# We use decomposition_model() to handle the re-seasonalization
fit_stl <- retail_data %>%
  model(
    stl_ets = decomposition_model(
      STL(box_cox(Turnover, 0.3) ~ trend(window = 21) + season(window = "periodic")),
      ETS(season_adjust ~ error("A") + trend("A") + season("N"))
    )
  )
# Forecast and compare to test data
train <- retail_data %>% filter(Month <= yearmonth("2010 Dec"))
fit_test_stl <- train %>%
  model(
    stl_ets = decomposition_model(
      STL(box_cox(Turnover, 0.3) ~ trend(window = 21) + season(window = "periodic")),
      ETS(season_adjust ~ error("A") + trend("A") + season("N"))
    )
  )
fc_stl <- fit_test_stl %>% forecast(h = "5 years")
fc_stl %>% accuracy(retail_data)
Analysis: STL Decomposition and ETS
Methodology: Applying a Box-Cox transformation followed by STL decomposition stabilizes the variance and separates the seasonal component from the trend. This allows the ETS model to focus exclusively on the seasonally adjusted series.
Performance: The model's RMSE of 72.27 can be compared directly against the previous Holt-Winters and Seasonal Naïve approaches.
Evaluation: The relatively high ACF1 value (0.775) indicates strong lag-1 autocorrelation in the errors, so information remains that the model has not captured. This suggests that while the STL+ETS approach is robust, it may not have outperformed the best previous model in this case. Further tuning of the trend window or the seasonal window may be required to reduce the autocorrelation.