Homework 5

Do exercises 8.1, 8.5, 8.6, 8.7, 8.8, and 8.9 in Hyndman. Please submit both the link to your RPubs page and the .pdf file with your code.

Problem 8.1

Consider the number of pigs slaughtered in Victoria, available in the aus_livestock dataset.

I will start by loading the dataset and selecting the relevant data for pigs in Victoria.

data("aus_livestock")

pigs_vic <- aus_livestock |>
  filter(Animal == "Pigs", State == "Victoria") 

Now I will plot the time series to visualize the data.

autoplot(pigs_vic, Count) +
  labs(title = "Number of Pigs Slaughtered in Victoria",
       y = "Number of Pigs",
       x = "Year")

  1. Use the ETS() function to estimate the equivalent model for simple exponential smoothing. Find the optimal values of α and ℓ0, and generate forecasts for the next four months.
ets_model <- pigs_vic |>
  model(ETS = ETS(Count ~ error("A") + trend("N") + season("N")))

report(ets_model)
## Series: Count 
## Model: ETS(A,N,N) 
##   Smoothing parameters:
##     alpha = 0.3221247 
## 
##   Initial states:
##      l[0]
##  100646.6
## 
##   sigma^2:  87480760
## 
##      AIC     AICc      BIC 
## 13737.10 13737.14 13750.07

The optimal value of α is 0.3221, and the initial level ℓ0 is 100646.6. The model is an ETS(A,N,N) model, which means it has additive errors, no trend, and no seasonality.
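As a quick sanity check, the SES point forecast can be reproduced by hand by iterating the level equation ℓt = α·yt + (1 − α)·ℓt−1 with the parameters reported above (a minimal sketch, reusing pigs_vic from the chunk above):

# Minimal sketch: hand-rolled SES using alpha and l[0] from report(ets_model).
alpha <- 0.3221247
level <- 100646.6                 # initial level l[0]
for (y in pigs_vic$Count) {
  level <- alpha * y + (1 - alpha) * level
}
level                             # should be ~95187, the first ETS point forecast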

Now I will generate forecasts for the next four months.

ets_forecast <- ets_model |>
  forecast(h = "4 months")

autoplot(ets_forecast, pigs_vic) +
  labs(title = "Simple Exponential Smoothing Forecasts: Pigs in Victoria",
       y = "Count") 

This will show the historical data with point forecasts and 80%/95% prediction intervals for the next four months.

  2. Compute a 95% prediction interval for the first forecast using ŷ ± 1.96s, where s is the standard deviation of the residuals. Compare your interval with the interval produced by R.

I will start by extracting the first forecast and its prediction interval from the ets_forecast object.

fc_ets_first <- ets_forecast |>
  slice(1) |>
  hilo(95) |>
  unpack_hilo(`95%`) |>
  select(Month, ets_mean = .mean, ets_lo = `95%_lower`, ets_hi = `95%_upper`)

print(fc_ets_first)
## # A tsibble: 1 x 4 [?]
##      Month ets_mean ets_lo  ets_hi
##      <mth>    <dbl>  <dbl>   <dbl>
## 1 2019 Jan   95187. 76855. 113518.

Now I also need to calculate the standard deviation of the residuals from the ETS model.

residuals_ets <- residuals(ets_model)
s_ets <- sd(residuals_ets$.resid, na.rm = TRUE)

With both values in hand, I can compute the 95% prediction interval using the formula ŷ ± 1.96s.

y_ets <- fc_ets_first$ets_mean
lower_ets <- y_ets - 1.96 * s_ets
upper_ets <- y_ets + 1.96 * s_ets
print(c(lower_ets, upper_ets))
## [1]  76871.01 113502.10

Based on the calculation, the manual 95% prediction interval for the first month’s forecast is approximately (76871.01, 113502.10). This is very close to the interval produced by R, (76855, 113518); R’s interval is slightly wider because it uses the model’s estimated standard deviation rather than the sample standard deviation of the residuals.

Problem 8.5

Data set global_economy contains the annual Exports from many countries. Select one country to analyse.

  1. Plot the Exports series and discuss the main features of the data.
data("global_economy")

us_exports <- global_economy |>
  filter(Country == "United States")

autoplot(us_exports, Exports) +
  labs(title = "Annual Exports of the United States",
       y = "Exports",
       x = "Year")

The plot shows a general upward trend in the exports of the United States over the years, with some fluctuations. There are noticeable peaks and troughs, possibly corresponding to periods of economic growth and recession.

  2. Use an ETS(A,N,N) model to forecast the series, and plot the forecasts.

Before fitting the ETS model, I first remove any missing values.

us_exports <- us_exports |>
  filter(!is.na(Exports))

fit_ses <- us_exports |>
  model(ETS(Exports ~ error("A") + trend("N") + season("N")))

report(fit_ses)
## Series: Exports 
## Model: ETS(A,N,N) 
##   Smoothing parameters:
##     alpha = 0.9998995 
## 
##   Initial states:
##     l[0]
##  4.76618
## 
##   sigma^2:  0.4075
## 
##      AIC     AICc      BIC 
## 183.2500 183.7028 189.3791

Now I will generate forecasts for the next 10 years and plot them.

fc_ses <- fit_ses |> forecast(h = 10)

autoplot(fc_ses, us_exports) +
  labs(
    title = "ETS(A,N,N) Forecasts of U.S. Exports (% of GDP)",
    y = "Exports (% of GDP)"
  )

The ETS(A,N,N) model produces a constant mean forecast for U.S. exports as a percentage of GDP. The prediction intervals widen over the forecast horizon, reflecting increasing uncertainty.
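This widening is predictable for this model class: for ETS(A,N,N) the h-step forecast variance is σ²[1 + α²(h − 1)]. A small sketch, reusing the σ² and α reported above, reproduces the growing standard deviations behind those intervals:

# Sketch: h-step forecast standard deviations for ETS(A,N,N).
sigma2 <- glance(fit_ses) |> pull(sigma2)   # residual variance from the fit
alpha  <- 0.9998995                         # from report(fit_ses)
h      <- 1:10
sqrt(sigma2 * (1 + alpha^2 * (h - 1)))      # with alpha near 1, widens roughly like sqrt(h)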

  3. Compute the RMSE values for the training data.
accuracy(fit_ses)
## # A tibble: 1 × 11
##   Country       .model     .type    ME  RMSE   MAE   MPE  MAPE  MASE RMSSE  ACF1
##   <fct>         <chr>      <chr> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
## 1 United States "ETS(Expo… Trai… 0.125 0.627 0.465  1.37  5.10 0.990 0.992 0.239

The RMSE value for the training data is 0.6270672. This indicates that, on average, the forecasts from the ETS(A,N,N) model deviate from the actual values by about 0.627 percentage points.
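The same RMSE can be computed by hand from the fitted residuals (a sketch; as_tibble() is needed so summarise() does not aggregate over the time index):

# Sketch: RMSE from the residuals, matching accuracy(fit_ses).
augment(fit_ses) |>
  as_tibble() |>
  summarise(RMSE = sqrt(mean(.resid^2, na.rm = TRUE)))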

  4. Compare the results to those from an ETS(A,A,N) model. (Remember that the trended model is using one more parameter than the simpler model.) Discuss the merits of the two forecasting methods for this data set.

I will fit an ETS(A,A,N) model (Holt’s linear trend method) to the data so I can compare its results with the simpler model.

fit_holt <- us_exports |>
  model(ETS(Exports ~ error("A") + trend("A") + season("N")))


report(fit_ses)
## Series: Exports 
## Model: ETS(A,N,N) 
##   Smoothing parameters:
##     alpha = 0.9998995 
## 
##   Initial states:
##     l[0]
##  4.76618
## 
##   sigma^2:  0.4075
## 
##      AIC     AICc      BIC 
## 183.2500 183.7028 189.3791
report(fit_holt)
## Series: Exports 
## Model: ETS(A,A,N) 
##   Smoothing parameters:
##     alpha = 0.9999 
##     beta  = 0.0001088322 
## 
##   Initial states:
##      l[0]      b[0]
##  4.669623 0.1158518
## 
##   sigma^2:  0.4067
## 
##      AIC     AICc      BIC 
## 185.0267 186.2032 195.2420

Both the simple exponential smoothing model (ETS(A,N,N)) and the additive-trend model (ETS(A,A,N)) fit the U.S. Exports (% of GDP) data almost equally well.

The smoothing parameter α ≈ 1 indicates that recent values dominate the forecasts. The trend smoothing parameter β in the trended model is nearly zero, showing that the trend component contributes little additional explanatory power. Since the AICc and BIC values are slightly lower for the simpler ETS(A,N,N), it is the preferred model.
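A compact way to put those information criteria side by side (a small sketch reusing the two fits above):

# Sketch: AIC/AICc/BIC of both models in one table.
bind_rows(
  glance(fit_ses)  |> mutate(Model = "ETS(A,N,N)"),
  glance(fit_holt) |> mutate(Model = "ETS(A,A,N)")
) |>
  select(Model, AIC, AICc, BIC)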

  5. Compare the forecasts from both methods. Which do you think is best?

In this part I will create a new dataframe that contains the forecasts from both models, along with their 95% prediction intervals. I will then plot the historical data, the forecast means, and the prediction intervals for both models on the same graph for comparison.

fc_all <- bind_rows(
  forecast(fit_ses,  h = 10) |> mutate(Model = "ETS(A,N,N)"),
  forecast(fit_holt, h = 10) |> mutate(Model = "ETS(A,A,N)")
)

fc_all_i <- fc_all |>
  hilo(95) |>
  unpack_hilo(`95%`)        

ggplot() +
  geom_line(data = us_exports, aes(Year, Exports), linewidth = 0.6) +
  geom_ribbon(
    data = fc_all_i |> filter(Model == "ETS(A,N,N)"),
    aes(x = Year, ymin = `95%_lower`, ymax = `95%_upper`, fill = "ETS(A,N,N)"),
    alpha = 0.18
  ) +
  geom_ribbon(
    data = fc_all_i |> filter(Model == "ETS(A,A,N)"),
    aes(x = Year, ymin = `95%_lower`, ymax = `95%_upper`, fill = "ETS(A,A,N)"),
    alpha = 0.18
  ) +
  geom_line(
    data = fc_all_i |> filter(Model == "ETS(A,N,N)"),
    aes(Year, .mean, colour = "ETS(A,N,N)"), linewidth = 0.9
  ) +
  geom_line(
    data = fc_all_i |> filter(Model == "ETS(A,A,N)"),
    aes(Year, .mean, colour = "ETS(A,A,N)"), linewidth = 0.9
  ) +
  scale_colour_manual(name = "Model",
                      values = c("ETS(A,N,N)" = "blue", "ETS(A,A,N)" = "red")) +
  scale_fill_manual(name = "Model",
                    values = c("ETS(A,N,N)" = "blue", "ETS(A,A,N)" = "red")) +
  labs(title = "Comparison of ETS(A,N,N) vs ETS(A,A,N) Forecasts — U.S. Exports (% of GDP)",
       y = "Exports (% of GDP)", x = "Year") +
  theme_minimal()

The forecasts from both models are quite similar: each projects that U.S. exports as a percentage of GDP will remain near current levels, with the trended model adding only a gradual increase over the next decade. Given the nearly identical forecasts and comparable accuracy, the simpler ETS(A,N,N) model is preferred.

  6. Calculate a 95% prediction interval for the first forecast for each model, using the RMSE values and assuming normal errors. Compare your intervals with those produced using R.

To compare the intervals from the two models on an equal footing, I will extract the first forecast and its prediction interval from each model and combine them into a single data frame. I will also compute each model’s training RMSE and use it to build manual 95% prediction intervals for the first forecast. Finally, I will compare the R-generated intervals with the manually calculated ones.

library(dplyr)
library(tidyr)

fc_ses_first  <- forecast(fit_ses,  h = 1) |>
  hilo(95) |> unpack_hilo(`95%`) |>
  transmute(Year = as.numeric(Year), 
            model = "ETS(A,N,N)",
            mean = .mean,
            r_lo = `95%_lower`, r_hi = `95%_upper`) |>
  tibble::as_tibble()

fc_holt_first <- forecast(fit_holt, h = 1) |>
  hilo(95) |> unpack_hilo(`95%`) |>
  transmute(Year = as.numeric(Year),
            model = "ETS(A,A,N)",
            mean = .mean,
            r_lo = `95%_lower`, r_hi = `95%_upper`) |>
  tibble::as_tibble()

rmse_ses  <- accuracy(fit_ses)  |> pull(RMSE)
rmse_holt <- accuracy(fit_holt) |> pull(RMSE)

man_ses  <- tibble(model="ETS(A,N,N)", mean = fc_ses_first$mean,
                   m_lo = mean - 1.96*rmse_ses,  m_hi = mean + 1.96*rmse_ses)
man_holt <- tibble(model="ETS(A,A,N)", mean = fc_holt_first$mean,
                   m_lo = mean - 1.96*rmse_holt, m_hi = mean + 1.96*rmse_holt)

bind_rows(fc_ses_first, fc_holt_first) |>
  left_join(bind_rows(man_ses, man_holt), by = c("model","mean")) |>
  mutate(diff_lo = m_lo - r_lo,
         diff_hi = m_hi - r_hi) |>
  print()
## # A tibble: 2 × 9
##    Year model       mean  r_lo  r_hi  m_lo  m_hi diff_lo diff_hi
##   <dbl> <chr>      <dbl> <dbl> <dbl> <dbl> <dbl>   <dbl>   <dbl>
## 1  2017 ETS(A,N,N)  11.9  10.6  13.1  10.7  13.1  0.0221 -0.0221
## 2  2017 ETS(A,A,N)  12.0  10.8  13.3  10.8  13.2  0.0446 -0.0446

The comparison shows that the manually calculated 95% prediction intervals using the RMSE values are very close to those produced by R for both models. The differences in the lower and upper bounds are minimal (they arise because R uses the model’s estimated σ rather than the training RMSE), so both approaches yield consistent results for the first forecast’s prediction interval.

Problem 8.6

Forecast the Chinese GDP from the global_economy data set using an ETS model. Experiment with the various options in the ETS() function to see how much the forecasts change with damped trend, or with a Box-Cox transformation. Try to develop an intuition of what each is doing to the forecasts.

[Hint: use a relatively large value of h when forecasting, so you can clearly see the differences between the various options when plotting the forecasts.]

data("global_economy")

china_gdp <- global_economy |>
  filter(Country == "China") |>
  filter(!is.na(GDP))

autoplot(china_gdp, GDP) +
  labs(title = "Annual GDP of China",
       y = "GDP",
       x = "Year")

The plot shows a strong upward trend in China’s GDP over the years, with especially rapid growth after 1990.

fit_ets <- china_gdp |>
  model(ETS(GDP))
report(fit_ets)
## Series: GDP 
## Model: ETS(M,A,N) 
##   Smoothing parameters:
##     alpha = 0.9998998 
##     beta  = 0.3119984 
## 
##   Initial states:
##         l[0]       b[0]
##  45713434615 3288256682
## 
##   sigma^2:  0.0108
## 
##      AIC     AICc      BIC 
## 3102.064 3103.218 3112.366

The ETS(M,A,N) model fits the data well; the high smoothing parameter (α ≈ 1) indicates that recent values heavily influence the forecasts, while the additive trend component (β ≈ 0.312) captures the overall upward movement in GDP.

ETS with a damped trend

fit_damped <- china_gdp |>
  model(ETS(GDP ~ error("A") + trend("Ad") + season("N")))
report(fit_damped)
## Series: GDP 
## Model: ETS(A,Ad,N) 
##   Smoothing parameters:
##     alpha = 0.9998747 
##     beta  = 0.5634078 
##     phi   = 0.9799999 
## 
##   Initial states:
##         l[0]       b[0]
##  50284778075 3288256683
## 
##   sigma^2:  3.959258e+22
## 
##      AIC     AICc      BIC 
## 3260.187 3261.834 3272.549
fc_damped <- fit_damped |> forecast(h = 10)

autoplot(fc_damped, china_gdp) +
  labs(title = "China GDP Forecast Damped Trend")

Forecasts still rise with a damped trend, but at a decreasing rate over time.

ETS with a Box-Cox transformation

fit_boxcox <- china_gdp |>
  model(ETS(box_cox(GDP, lambda = 0)))

report(fit_boxcox)
## Series: GDP 
## Model: ETS(A,A,N) 
## Transformation: box_cox(GDP, lambda = 0) 
##   Smoothing parameters:
##     alpha = 0.9999 
##     beta  = 0.1079782 
## 
##   Initial states:
##      l[0]       b[0]
##  24.77007 0.04320226
## 
##   sigma^2:  0.0088
## 
##       AIC      AICc       BIC 
## -33.07985 -31.92600 -22.77763
fc_boxcox <- fit_boxcox |> forecast(h = 10)

autoplot(fc_boxcox, china_gdp) +
  labs(title = "China GDP Forecast | Box Cox ")

The Box-Cox transformation with λ = 0 is simply a log transform; it stabilizes the variance and makes the trend roughly linear on the transformed scale.
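A one-line sanity check of that equivalence (sketch):

# box_cox() with lambda = 0 reduces to a log transform.
x <- c(1, 10, 100)
all.equal(box_cox(x, lambda = 0), log(x))   # TRUE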

The various ETS models highlight how assumptions about trend and variability influence the forecasts. Among them, the damped trend ETS(A,Ad,N) offers the most realistic view of China’s GDP, suggesting that while growth will likely continue, it is expected to slow gradually over time. Applying a Box-Cox transformation helps stabilize the changing variance, resulting in smoother forecasts and more reliable prediction intervals.
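Following the exercise hint, a longer horizon makes the differences between these options much easier to see. A sketch that fits the three variants in one model() call and plots them together over 30 years (the model names here are just labels for the sketch):

# Sketch: compare the ETS variants over a long horizon (h = 30).
china_fits <- china_gdp |>
  model(
    auto    = ETS(GDP),                                           # selects ETS(M,A,N)
    damped  = ETS(GDP ~ error("A") + trend("Ad") + season("N")),  # damped trend
    log_ets = ETS(box_cox(GDP, 0))                                # log scale
  )

china_fits |>
  forecast(h = 30) |>
  autoplot(china_gdp, level = NULL) +
  labs(title = "China GDP: ETS variants over a 30-year horizon")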

Problem 8.7

Find an ETS model for the Gas data from aus_production and forecast the next few years. Why is multiplicative seasonality necessary here? Experiment with making the trend damped. Does it improve the forecasts?

We will start by loading the dataset and selecting Gas production column.

data("aus_production")

gas_data <- aus_production |>
  select(Quarter, Gas)

autoplot(gas_data, Gas) +
  labs(title = "Australian Gas Production (Quarterly)",
       y = "Millions of cubic meters",
       x = "Year")

In this plot, we can see that gas production increases steadily over time and shows clear seasonal peaks every year, with the size of those peaks growing as production rises. Multiplicative seasonality is necessary because it lets the model scale the seasonal amplitude up as the level grows, which is exactly what this plot shows.
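One quick way to see that growing amplitude directly is a seasonal plot (sketch, using gg_season() from feasts):

# Sketch: seasonal plot; later years sit higher and swing wider,
# the signature of multiplicative seasonality.
gas_data |>
  gg_season(Gas) +
  labs(title = "Seasonal plot of Australian gas production",
       y = "Millions of cubic meters")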

Now I will fit an ETS model, using the automatic selection to find the best model.

fit_gas <- gas_data |> model(ETS(Gas))
report(fit_gas)
## Series: Gas 
## Model: ETS(M,A,M) 
##   Smoothing parameters:
##     alpha = 0.6528545 
##     beta  = 0.1441675 
##     gamma = 0.09784922 
## 
##   Initial states:
##      l[0]       b[0]      s[0]    s[-1]    s[-2]     s[-3]
##  5.945592 0.07062881 0.9309236 1.177883 1.074851 0.8163427
## 
##   sigma^2:  0.0032
## 
##      AIC     AICc      BIC 
## 1680.929 1681.794 1711.389

The selected model is ETS(M,A,M), which includes multiplicative errors, an additive trend, and multiplicative seasonality.

Now I will generate forecasts for the next 3 years (12 quarters) and plot them.

fc_gas <- fit_gas |> forecast(h = "3 years")

autoplot(fc_gas, gas_data) +
  labs(title = "ETS Forecasts for Australian Gas Production",
       y = "Millions of cubic meters")

The ETS(M,A,M) model produces forecasts that continue the upward trend and seasonal pattern observed in the historical data.

Now, I will fit an ETS model with a damped trend to see if it improves the forecasts.

fit_gas_damped <- gas_data |>
  model(ETS(Gas ~ error("M") + trend("Ad") + season("M")))

report(fit_gas_damped)
## Series: Gas 
## Model: ETS(M,Ad,M) 
##   Smoothing parameters:
##     alpha = 0.6489044 
##     beta  = 0.1551275 
##     gamma = 0.09369372 
##     phi   = 0.98 
## 
##   Initial states:
##      l[0]       b[0]      s[0]    s[-1]   s[-2]     s[-3]
##  5.858941 0.09944006 0.9281912 1.177903 1.07678 0.8171255
## 
##   sigma^2:  0.0033
## 
##      AIC     AICc      BIC 
## 1684.028 1685.091 1717.873
fc_gas_damped <- fit_gas_damped |> forecast(h = "3 years")

# Fixed colours are not mapped aesthetics, so scale_colour_manual() would have
# no effect here; the colours are identified in the title instead.
autoplot(gas_data, Gas) +
  autolayer(fc_gas, colour = "blue", level = NULL) +
  autolayer(fc_gas_damped, colour = "red", level = NULL) +
  labs(title = "Comparison of ETS(M,A,M) (blue) vs ETS(M,Ad,M) (red) Forecasts",
       y = "Millions of cubic meters",
       x = "Year")

The damped trend model (ETS(M,Ad,M)) produces forecasts that still show an upward trend and seasonal pattern, but the growth rate slows over time.

glance(fit_gas)
## # A tibble: 1 × 9
##   .model    sigma2 log_lik   AIC  AICc   BIC   MSE  AMSE    MAE
##   <chr>      <dbl>   <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>  <dbl>
## 1 ETS(Gas) 0.00324   -831. 1681. 1682. 1711.  21.1  32.2 0.0413
glance(fit_gas_damped)
## # A tibble: 1 × 9
##   .model                     sigma2 log_lik   AIC  AICc   BIC   MSE  AMSE    MAE
##   <chr>                       <dbl>   <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>  <dbl>
## 1 "ETS(Gas ~ error(\"M\") … 0.00329   -832. 1684. 1685. 1718.  21.1  32.0 0.0417

The damped trend model (ETS(M,Ad,M)) actually shows slightly higher AIC, AICc, and BIC values, and its training MSE is essentially identical. Damping therefore does not improve the fit for this strongly growing series, and the non-damped ETS(M,A,M) remains preferable by these criteria.

Problem 8.8

Recall your retail time series data (from Exercise 7 in Section 2.10).

  1. Why is multiplicative seasonality necessary for this series?

Multiplicative seasonality is necessary because the size of the seasonal fluctuations grows with the overall level of retail turnover; an additive seasonal component would force the seasonal swings to stay constant as sales rise.

  2. Apply Holt-Winters’ multiplicative method to the data. Experiment with making the trend damped.
coffee_retail <- aus_retail |>
  filter(Industry == "Cafes, restaurants and takeaway food services",
         State == "New South Wales")

autoplot(coffee_retail, Turnover) +
  labs(title = "Monthly Retail Turnover: Cafes, Restaurants & Takeaway (NSW)",
       y = "Turnover ($ millions)",
       x = "Year")

In this plot, we can see a clear upward trend in retail turnover over time, along with strong seasonal patterns that repeat annually.

Now I will fit both the standard Holt-Winters multiplicative model and the damped-trend version. I will then generate forecasts for the next 3 years and plot them for comparison.

fit_hw_coffee <- coffee_retail |>
  model(ETS(Turnover ~ error("M") + trend("A") + season("M")))
report(fit_hw_coffee)
## Series: Turnover 
## Model: ETS(M,A,M) 
##   Smoothing parameters:
##     alpha = 0.7729142 
##     beta  = 0.004491741 
##     gamma = 0.0001004778 
## 
##   Initial states:
##      l[0]      b[0]     s[0]     s[-1]    s[-2]    s[-3]    s[-4]   s[-5]
##  142.3738 0.4409842 0.994291 0.9301837 1.022985 1.140497 1.023531 1.01942
##     s[-6]     s[-7]     s[-8]    s[-9]    s[-10]    s[-11]
##  0.987387 0.9847622 0.9841327 0.944313 0.9872559 0.9812413
## 
##   sigma^2:  0.0015
## 
##      AIC     AICc      BIC 
## 5253.739 5255.186 5323.253
fit_hw_coffee_damped <- coffee_retail |>
  model(ETS(Turnover ~ error("M") + trend("Ad") + season("M")))
report(fit_hw_coffee_damped)
## Series: Turnover 
## Model: ETS(M,Ad,M) 
##   Smoothing parameters:
##     alpha = 0.7681617 
##     beta  = 0.009440331 
##     gamma = 0.0001011594 
##     phi   = 0.9799997 
## 
##   Initial states:
##      l[0]     b[0]      s[0]     s[-1]   s[-2]    s[-3]    s[-4]    s[-5]
##  142.2247 1.499298 0.9910721 0.9273323 1.02064 1.138143 1.022578 1.022682
##      s[-6]     s[-7]     s[-8]     s[-9]    s[-10]    s[-11]
##  0.9925157 0.9886548 0.9869216 0.9442324 0.9864236 0.9788046
## 
##   sigma^2:  0.0015
## 
##      AIC     AICc      BIC 
## 5259.854 5261.475 5333.457
fc_hw  <- forecast(fit_hw_coffee,         h = "3 years")
fc_dmp <- forecast(fit_hw_coffee_damped,  h = "3 years")

autoplot(coffee_retail, Turnover) +
  autolayer(fc_hw,  colour = "blue") +
  autolayer(fc_dmp, colour = "red") +
  labs(title = "Coffee retail: Holt Winters multiplicative vs damped",
       y = "Turnover ($ millions)", x = "Year")

The damped trend model still shows growth in coffee shop sales and keeps the same seasonal pattern, but the upward trend becomes less steep over time. In other words, sales are expected to keep increasing, just at a slower pace.

  3. Compare the RMSE of the one-step forecasts from the two methods. Which do you prefer?
accuracy(fit_hw_coffee)
## # A tibble: 1 × 12
##   State   Industry .model .type    ME  RMSE   MAE   MPE  MAPE  MASE RMSSE   ACF1
##   <chr>   <chr>    <chr>  <chr> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>  <dbl>
## 1 New So… Cafes, … "ETS(… Trai…  1.49  19.5  14.0 0.213  2.74 0.278 0.300 0.0527
accuracy(fit_hw_coffee_damped)
## # A tibble: 1 × 12
##   State   Industry .model .type    ME  RMSE   MAE   MPE  MAPE  MASE RMSSE   ACF1
##   <chr>   <chr>    <chr>  <chr> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>  <dbl>
## 1 New So… Cafes, … "ETS(… Trai…  2.22  19.4  13.9 0.298  2.76 0.278 0.299 0.0439

When comparing the one-step forecast errors, both models perform very similarly, but the damped trend model has a slightly lower RMSE (19.45 compared to 19.50). This small improvement suggests that the damped model fits the data a bit better and predicts short term changes more accurately. More importantly, its forecasts look more realistic because they assume that the strong upward trend in coffee sales will eventually slow down rather than continue indefinitely.
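The same comparison in one table (a sketch reusing the two fits):

# Sketch: training accuracy of both Holt-Winters variants side by side.
bind_rows(
  accuracy(fit_hw_coffee)        |> mutate(Model = "HW multiplicative"),
  accuracy(fit_hw_coffee_damped) |> mutate(Model = "HW damped")
) |>
  select(Model, RMSE, MAE, MAPE)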

  4. Check that the residuals from the best method look like white noise.
fit_hw_coffee_damped |> gg_tsresiduals()

augment(fit_hw_coffee_damped) |>
  features(.innov, ljung_box, lag = 24, dof = 5)
## # A tibble: 1 × 5
##   State           Industry                              .model lb_stat lb_pvalue
##   <chr>           <chr>                                 <chr>    <dbl>     <dbl>
## 1 New South Wales Cafes, restaurants and takeaway food… "ETS(…    34.9    0.0145

The residual plots for the damped trend model (ETS(M,Ad,M)) show no strong patterns or major outliers, and most residuals are centered around zero with constant variance. However, the ACF plot and the Ljung–Box test (p = 0.0145) suggest that a small amount of autocorrelation remains, particularly around the seasonal lags.

Overall, the residuals behave mostly like white noise, meaning the model fits well, but it doesn’t capture every minor fluctuation in the coffee retail data. For most practical purposes, this model is still reliable, though a slightly more complex model (for example, a seasonal ARIMA) could potentially reduce the remaining autocorrelation.
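As a quick follow-up to that suggestion (a sketch; the automatic ARIMA() search may of course settle on a different order):

# Sketch: let fable's automatic ARIMA search propose a seasonal model,
# then re-check the residual autocorrelation.
fit_arima <- coffee_retail |> model(ARIMA(Turnover))
report(fit_arima)
augment(fit_arima) |> features(.innov, ljung_box, lag = 24)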

  5. Now find the test set RMSE, while training the model to the end of 2010. Can you beat the seasonal naïve approach from Exercise 7 in Section 5.11?
train_coffee <- coffee_retail |> filter(year(Month) <= 2010)
test_coffee  <- coffee_retail |> filter(year(Month) > 2010)

fit_train <- train_coffee |>
  model(ETS(Turnover ~ error("M") + trend("Ad") + season("M")))

fc_test <- fit_train |> forecast(h = nrow(test_coffee))


accuracy(fc_test, test_coffee)
## # A tibble: 1 × 12
##   .model    State Industry .type    ME  RMSE   MAE   MPE  MAPE  MASE RMSSE  ACF1
##   <chr>     <chr> <chr>    <chr> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
## 1 "ETS(Tur… New … Cafes, … Test   174.  240.  191.  13.8  16.0   NaN   NaN 0.961
autoplot(fc_test, train_coffee) +
  autolayer(test_coffee, Turnover, colour = "red") +
  labs(title = "Coffee Retail: Damped ETS Forecasts vs Actuals (Test Set)",
       y = "Turnover ($ millions)", x = "Year")

After training the damped trend ETS model ETS(M,Ad,M) on data up to 2010 and testing it on later years, the model achieved a test RMSE of about 239.7 and a MAPE of around 16%. The forecast plot shows that the model captures the general seasonal pattern of coffee retail sales but tends to under-predict the rapid growth observed after 2010.
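One side note on the accuracy output above: MASE and RMSSE show as NaN because accuracy() was given only the test slice, and the scaled errors need the training data for their denominator. Passing the full series fixes this (sketch):

# Sketch: supplying the full series lets accuracy() compute MASE/RMSSE,
# which are scaled against the training data.
accuracy(fc_test, coffee_retail)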

fit_snaive <- train_coffee |>
  model(SNAIVE(Turnover ~ lag("year"))) 

fc_snaive <- fit_snaive |> forecast(h = nrow(test_coffee))


acc_ets <- accuracy(fc_test, test_coffee) |> mutate(Model = "ETS(M,Ad,M)")
acc_snaive <- accuracy(fc_snaive, test_coffee) |> mutate(Model = "Seasonal Naïve")


comparison <- bind_rows(
  acc_ets |> select(Model, RMSE, MAE, MAPE, ACF1),
  acc_snaive |> select(Model, RMSE, MAE, MAPE, ACF1)
)

print(comparison)
## # A tibble: 2 × 5
##   Model           RMSE   MAE  MAPE  ACF1
##   <chr>          <dbl> <dbl> <dbl> <dbl>
## 1 ETS(M,Ad,M)     240.  191.  16.0 0.961
## 2 Seasonal Naïve  286.  232.  19.5 0.965

The damped trend model ETS(M,Ad,M) clearly outperforms the seasonal naïve approach. It has a lower RMSE (239.7 vs 286.2), lower MAE, and lower MAPE, which means its forecasts are more accurate overall. Both models show similar residual autocorrelation (ACF₁ ≈ 0.96), but the ETS model captures the underlying growth and seasonal behavior of coffee retail turnover much better.

Problem 8.9

For the same retail data, try an STL decomposition applied to the Box-Cox transformed series, followed by ETS on the seasonally adjusted data. How does that compare with your best previous forecasts on the test set?

We are going to use the same training and test sets from the previous problem. The STL decomposition will help us separate the seasonal component from the trend and remainder, allowing us to model the seasonally adjusted data with ETS.

# Select the Box-Cox lambda with the Guerrero method. Using features() ensures
# the monthly seasonal period is read from the tsibble; calling guerrero()
# directly on a bare vector would use its default period of 2.
lambda <- train_coffee |>
  features(Turnover, features = guerrero) |>
  pull(lambda_guerrero)

fit_stl <- train_coffee |>
  model(
    decomposition_model(
      STL(box_cox(Turnover, lambda) ~ season(window = "periodic")),
      ETS(season_adjust ~ error("M") + trend("A") + season("N"))
    )
  )

fc_stl <- fit_stl |> forecast(h = nrow(test_coffee))
accuracy(fc_stl, test_coffee)
## # A tibble: 1 × 12
##   .model    State Industry .type    ME  RMSE   MAE   MPE  MAPE  MASE RMSSE  ACF1
##   <chr>     <chr> <chr>    <chr> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
## 1 "decompo… New … Cafes, … Test  -87.6  97.5  87.6 -8.33  8.33   NaN   NaN 0.784

Applying an STL decomposition to the Box–Cox transformed coffee retail series, followed by an ETS model on the seasonally adjusted data, produced a substantial improvement in forecast accuracy compared to the earlier damped ETS(M,Ad,M) model. The test RMSE decreased from 239.7 to 97.5, and the MAPE nearly halved from 16% to 8.3%. This indicates that the STL+ETS approach captures the underlying patterns in the data much more effectively, leading to significantly better forecasts.
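To close, a compact side-by-side of the two test-set results (a sketch reusing the forecast objects above):

# Sketch: test-set accuracy of the damped ETS vs the STL + ETS approach.
bind_rows(
  accuracy(fc_test, test_coffee) |> mutate(Model = "ETS(M,Ad,M)"),
  accuracy(fc_stl,  test_coffee) |> mutate(Model = "STL + ETS (Box-Cox)")
) |>
  select(Model, RMSE, MAE, MAPE)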