8.1: Consider the the number of pigs slaughtered in Victoria, available in the aus_livestock dataset.

a: Use the ETS() function to estimate the equivalent model for simple exponential smoothing. Find the optimal values of α and ℓ0, and generate forecasts for the next four months.

fit <- aus_livestock %>%
  filter(State == "Victoria", Animal == "Pigs") %>%
  model(ETS(Count ~ error("A") + trend("N") + season("N")))

fit %>% report()

## Series: Count 
## Model: ETS(A,N,N) 
##   Smoothing parameters:
##     alpha = 0.3221247 
## 
##   Initial states:
##      l[0]
##  100646.6
## 
##   sigma^2:  87480760
## 
##      AIC     AICc      BIC 
## 13737.10 13737.14 13750.07

fc <- fit %>%
  forecast(h = 4)

fc %>%
  autoplot(aus_livestock) +
  ggtitle("Num of Pigs Slaughtered in Victoria")

Compute a 95% prediction interval for the first forecast using y^±1.96s where s is the standard deviation of the residuals. Compare your interval with the interval produced by R.

s <- residuals(fit)$.resid  %>%
  sd()
y_hat <- fc$.mean[1]

#Computing y_hat ± 1.96s
lower_limit <- y_hat - 1.96*s
upper_limit <- 1.96*s + y_hat
cat("confidence interval [",lower_limit, ", ", upper_limit,"]")

## confidence interval [ 76871.01 ,  113502.1 ]

fc  |>
  hilo(95)  |>
  pull('95%')  |>
  head(1)

## <hilo[1]>
## [1] [76854.79, 113518.3]95

Compared to the 95% prediction interval produced by R, my first forecast using y^±1.96s where s is the standard deviation of the residuals differs slightly. As the one that is produced with R has a higher lower limit and a smaller upper limit.

8.5

Data set global_economy contains the annual Exports from many countries. Select one country to analyse.

a: Plot the Exports series and discuss the main features of the data

data("global_economy")

portugal_eco <- global_economy %>%
  filter(Country == "Portugal") %>%
  select(Year, Exports) 

autoplot(portugal_eco, Exports) +
  labs(
    title = "Portugal Exports",
    y = "Exports (% of GDP)",
    x = "Year"
  ) +
  theme(
    plot.title = element_text(hjust = 0.5, face = "bold")
  )

According to the plot of the times series, there is a clear long-term upward trend. Since the data is annual, there is no seasonality. There are clear rises and falls around major economic periods, such as the 2008 financial crisis and the COVID-19 pandemic.

b: Use an ETS(A,N,N) model to forecast the series, and plot the forecasts.

fit_ses_method <- portugal_eco %>%
  model(ETS(Exports ~ error("A") + trend("N") + season("N")))

fc_ses_method <- forecast(fit_ses_method, h = "10 years")

autoplot(fc_ses_method, portugal_eco) +
  labs(
    title = "Portugal Exports - SES Method",
    y = "Exports (% of GDP)",
    x = "Year"
  ) +
  theme(
    plot.title = element_text(hjust = 0.5, face = "bold")
  )

c: Compute the RMSE values for the training data.

ses_meth_accuracy <- accuracy(fit_ses_method)

ses_meth_accuracy$RMSE

## [1] 2.127608

D: Compare the results to those from an ETS(A,A,N) model. (Remember that the trended model is using one more parameter than the simpler model.) Discuss the merits of the two forecasting methods for this data set

From the ETS(A,N,N) (SES) model, we can see that when compared to the ETS(A,A,N) (Trend) model, it has a higher RMSE than the trend model. Interestingly, the results of the ETS SES model is the same for the other measures (ME, MAE, MPE, etc…). In this regard, the TREND model provides a more accuracy and represents Portugal’s exports more than the SES model.

fit <- portugal_eco %>%
  model(
    SES   = ETS(Exports ~ error("A") + trend("N") + season("N")),  # ETS(A,N,N)
    Trend = ETS(Exports ~ error("A") + trend("A") + season("N"))   # ETS(A,A,N)
  )

fc <- forecast(fit, h = "10 years")

autoplot(portugal_eco, Exports) +
  autolayer(fc, level = 95, aes(colour = .model)) +
  labs(
    title = "ETS(A,N,N) SES vs ETS(A,A,N) TREND Forecasts; Portugal Exports",
    y = "% of GDP",
    x = "Year"
  ) +
  theme(
    plot.title = element_text(hjust = 0.5, face = "bold")
  )

accuracy(fit)

E: Compare the forecasts from both methods. Which do you think is best?

The ETS(A,N,N) model assumes no trend or seasonality, making it unsuitable for Portugal’s export data, which shows a clear upward increase. In contrast, the ETS(A,A,N) model incorporates an additive trend component, allowing forecasts to align with recent growth patterns and better capture long-term dynamics of Portugal’s exports.

F: Calculate a 95% prediction interval for the first forecast for each model, using the RMSE values and assuming normal errors. Compare your intervals with those produced using R.

From calculating the 95% prediction interval for the first forecast for each model, we can see that the manual predictions are bounded more sightly tighter than the R predictions. Manual SES prediction (38.94278, 47.28300) V.S R Produced SES predction (38.86904, 47.35673). Manual Trend prediction (39.55219, 47.65939) V.S R Produced Trend prediction (39.40481, 47.80676).

frst_fc <- fc %>%
  as_tibble() %>%
  group_by(.model) %>%
  slice(1) %>%
  select(.model, .mean) %>%
  pivot_wider(names_from = .model, values_from = .mean)

man_pi <- tibble(
  Method = "Manual",
  Model = c("SES", "Trend"),
  Mean = c(frst_fc$SES, frst_fc$Trend),
  Lower = Mean - (1.96 * c(2.127608 , 2.068163  )),
  Upper = Mean + (1.96 * c(2.127608 , 2.068163  ))
)

r_pi <- fc %>%
  hilo(95) %>%
  unpack_hilo(`95%`) %>%
  as_tibble() %>%
  group_by(.model) %>%
  slice(1) %>%
  transmute(
    Method = "R",
    Model = .model,
    Lower = `95%_lower`,
    Mean = .mean,
    Upper = `95%_upper`
  )

bind_rows(man_pi, r_pi)

8.6

Forecast the Chinese GDP from the global_economy data set using an ETS model. Experiment with the various options in the ETS() function to see how much the forecasts change with damped trend, or with a Box-Cox transformation. Try to develop an intuition of what each is doing to the forecasts.

[Hint: use a relatively large value of h when forecasting, so you can clearly see the differences between the various options when plotting the forecasts.]

China’s GDP shows an upward trend mainly after 1990 without any seasonality. So, using additive errors and “N” for seasonality would be appropriate.

global_economy  %>%
  filter(Country == "China")  %>%
  autoplot(GDP) +
  labs(y=" GDP", title="GDP: China")

lambda <- global_economy %>%
  filter(Country == "China") %>%
  features(GDP, guerrero) %>%
  pull(lambda_guerrero)

choice_models <- list(
  Damped = ETS(GDP ~ error("A") + trend("Ad") + season("N")),
  SES = ETS(GDP ~ error("A") + trend("N")  + season("N")), 
  Trend = ETS(GDP ~ error("A") + trend("A")  + season("N")), 
Box_Cox_Trend = ETS(box_cox(GDP,lambda) ~ error("A") + trend("A") + season("N")),
  Box_Cox_Damped = ETS(box_cox(GDP,lambda) ~ error("A") + trend("Ad", phi = 0.75) + season("N"))
)

global_economy %>%
  filter(Country == "China") %>%
  model(!!!choice_models) %>%
  forecast(h = 100) %>%
  autoplot(global_economy %>% filter(Country == "China"), level = NULL) +
  facet_wrap(~ .model, scales = "free_y") +
  labs(
    title = "China GDP - ETS Variants",
    x = "Year (100 forecasted)",
    y = "GDP ($USD)"
  ) +
  theme(
    plot.title = element_text(hjust = 0.5, face = 'bold'),
    strip.text = element_text(face = "bold", size = 10)
  ) +
  guides(color = "none")

We can see that the SES model flat lines due to no trend or transformation after the last observation takes place, failing to capture the series’ upward trend. While the trend model does reflect the historical growth of China’s GDP better, but produces rapid, potentially exaggerated forecasts with a high uncertainty. This is similar to the trend and damped box-cox versions of the trend model. The damped model incorporates a damping parameter that reduces the slope over time, which allows the trend to rise before curving.

8.7

Find an ETS model for the Gas data from aus_production and forecast the next few years. Why is multiplicative seasonality necessary here? Experiment with making the trend damped. Does it improve the forecasts?

Multiplicative seasonality is necessary here it is allowing the seasonality to be proportional to the level. While additive seasonality assumes a consatnt seasional size and underestimates the growth. The multiplicative models outperfrom the additive one in all metrics (RMSE, MAE, AND MAPE decreases notably), while MPE inches closer to zero, indicating lower bias and more accurate forecasts at higher levels. Between multiplicative models, the damped version improves upon the standard: MAE drops from 2.844 to 2.815, MASE from 0.510 to 0.505, and MAPE from 5.033 to 4.110—indicating the damped model tracks the actual trajectory more closely. So multiplicative seasonality does improve the forecasts.

gas_data <- aus_production %>%
  select(Gas)

gas_fitting <- gas_data %>%
  model(
    additive = ETS(Gas ~ error("A") + trend("A") + season("A")),
    multiplicative = ETS(Gas ~ error("A") + trend("A") + season("M")),
    multiplicative_damped = ETS(Gas ~ error("A") + trend("Ad") + season("M"))
  )

gas_forecasts <- forecast(gas_fitting, h = "20 years")

gas_forecasts %>%
  filter(.model == "additive") %>%
  autoplot(gas_data, level = NULL) +
  labs(
    title = "Additive ETS Model (A,A,A)",
    subtitle = "Australian Gas Production",
    y = "Gas Production"
  ) +
  theme(plot.title = element_text(hjust = 0.5, face = "bold"))

gas_forecasts %>%
  filter(.model == "multiplicative") %>%
  autoplot(gas_data, level = NULL) +
  labs(
    title = "Multiplicative ETS Model (A,A,M)",
    subtitle = "Australian Gas Production",
    y = "Gas Production"
  ) +
  theme(plot.title = element_text(hjust = 0.5, face = "bold"))

gas_forecasts %>%
  filter(.model == "multiplicative_damped") %>%
  autoplot(gas_data, level = NULL) +
  labs(
    title = "Multiplicative Damped ETS Model (A,Ad,M)",
    subtitle = "Australian Gas Production",
    y = "Gas Production"
  ) +
  theme(plot.title = element_text(hjust = 0.5, face = "bold"))

accuracy(gas_fitting)

8.8

Recall your retail time series data (from Exercise 7 in Section 2.10).

a: Why is multiplicative seasonality necessary for this series?

Since the seasonal fluctuations expand as the series level rises, multiplicative seasonality is neccessary here to accurately model this proportional relationship. Additive seasonality assumes constant amplitude and would understate the variation during peak periods.

set.seed(12345678)
myseries <- aus_retail %>%
  filter(`Series ID` == sample(aus_retail$`Series ID`,1))
set.seed(12345678)
myseries <- aus_retail %>%
  filter(`Series ID` == sample(aus_retail$`Series ID`, 1))

# Plot the time series
autoplot(myseries, Turnover) +
  labs(
    title = "Aus Retail Turnover",
    x = "Month",
    y = "Turnover ($AUD)"
  ) +
  theme(
    plot.title = element_text(hjust = 0.5, face = 'bold'),
    plot.subtitle = element_text(hjust = 0.5, size = 10)
  )

b: Apply Holt-Winters’ multiplicative method to the data. Experiment with making the trend damped.

set.seed(1234)
myseries <- aus_retail %>%
  filter(`Series ID` == sample(aus_retail$`Series ID`, 1))

#fitting models holt winter and damped holt winters
fitted_models <- myseries %>%
  model(
    `Holt-Winters (M,A,M)` = ETS(Turnover ~ error("M") + trend("A") + season("M")),
    `Damped Holt-Winters (M,Ad,M)` = ETS(Turnover ~ error("M") + trend("Ad") + season("M"))
  )


forecasts <- fitted_models %>%
  forecast(h = 36)

forecasts %>%
  autoplot(myseries, level = FALSE) +
  labs(
    title = "Aus Retail",
    x = "Month",
    y = "Turnover ($AUD)",
    color = "Model"
  ) +
  theme_classic() +
  theme(
    plot.title = element_text(hjust = 0.5, face = "bold"),
    plot.subtitle = element_text(hjust = 0.5),
    legend.position = "top"
  )

## Warning: Plot argument `level` should be a numeric vector of levels to display.
## Setting `level = NULL` will remove the intervals from the plot.

c: Compare the RMSE of the one-step forecasts from the two methods. Which do you prefer?

The non-damped model contains the lower RSME, indicating a better short-term forecast accuracy. So I would prefer using that one over the damped one in this case.

accuracy(fitted_models)

d: Check that the residuals from the best method look like white noise.

# non-damped model
fit_hw_best <- fitted_models %>%
  select(`Holt-Winters (M,A,M)`)

fit_hw_best %>%
  gg_tsresiduals()

## Warning: `gg_tsresiduals()` was deprecated in feasts 0.4.2.
## ℹ Please use `ggtime::gg_tsresiduals()` instead.
## This warning is displayed once every 8 hours.
## Call `lifecycle::last_lifecycle_warnings()` to see where this warning was
## generated.

Over time: Residuals fluctuate randomly around zero with no trend or seasonality
ACF: Most autocorrelations fall within confidence limits
Histogram looks close to normal

Residuals behave as white noise, confirming the model adequately captures the data’s underlying structure.

e: Now find the test set RMSE, while training the model to the end of 2010. Can you beat the seasonal naïve approach from Exercise 7 in Section 5.11?

Yes, the Holt-Winters multiplicative model significantly beats the seasonal naïve approach, with a test RMSE of 3.99 compared to 9.13. The substantial improvement demonstrates the value of capturing trend and multiplicative seasonality in the data.

train <- myseries %>% filter(Month <= yearmonth("2010 Dec"))
test <- myseries %>% filter(Month > yearmonth("2010 Dec"))

results <- bind_rows(
  # Holt-Winters
  train %>%
    model(HW = ETS(Turnover ~ error("M") + trend("A") + season("M"))) %>%
    forecast(new_data = test) %>%
    accuracy(test) %>%
    mutate(Model = "Holt-Winters"),
  
  train %>%
    model(SNaive = SNAIVE(Turnover)) %>%
    forecast(new_data = test) %>%
    accuracy(test) %>%
    mutate(Model = "Seasonal Naïve")
) %>%
  select(Model, RMSE, MAE, MAPE)

print(results)

## # A tibble: 2 × 4
##   Model           RMSE   MAE  MAPE
##   <chr>          <dbl> <dbl> <dbl>
## 1 Holt-Winters    3.99  3.23  6.31
## 2 Seasonal Naïve  9.13  7.58 14.4

8.9

For the same retail data, try an STL decomposition applied to the Box-Cox transformed series, followed by ETS on the seasonally adjusted data. How does that compare with your best previous forecasts on the test set?

Among the models tested, Holt-Winters Multiplicative is the most accurate (RMSE: 3.99). While STL+ETS (5.31) improves a great deal over Seasonal Naïve (9.13), but it does not beat Holt-Winters.

#optimal lambda for Box-Cox transformation
lambda <- train %>%
  features(Turnover, guerrero) %>%
  pull(lambda_guerrero)

#stl and ets decomposition model
stl_ets_fitting <- train %>%
  model(
    `STL+ETS` = decomposition_model(
      STL(box_cox(Turnover, lambda) ~ trend(window = 7) + season(window = "periodic"),
          robust = TRUE),
      ETS(season_adjust ~ error("A") + trend("A") + season("N"))
    )
  )

fc_stl_ets <- stl_ets_fitting %>%
  forecast(new_data = test, bias_adjust = TRUE)

## Warning: There was 1 warning in `mutate()`.
## ℹ In argument: `STL+ETS = (function (object, ...) ...`.
## Caused by warning:
## ! The `bias_adjust` argument of `forecast()` is deprecated as of fabletools
##   0.2.0.
## ℹ Please use the `point_forecast` argument instead.
## ℹ The deprecated feature was likely used in the fabletools package.
##   Please report the issue at <https://github.com/tidyverts/fabletools/issues>.

fc_stl_ets %>%
  autoplot(myseries, level = NULL) +
  labs(
    title = "STL Decomposition + ETS on Seasonally Adjusted Data",
    subtitle = paste("Box-Cox transformation (λ =", round(lambda, 3), ")"),
    x = "Month",
    y = "Turnover ($AUD)"
  ) +
  theme_minimal() +
  theme(
    plot.title = element_text(hjust = 0.5, face = "bold", size = 14),
    plot.subtitle = element_text(hjust = 0.5, size = 11)
  )

stl_ets_acc <- accuracy(fc_stl_ets, test)
print(stl_ets_acc)

## # A tibble: 1 × 12
##   .model  State   Industry .type    ME  RMSE   MAE   MPE  MAPE  MASE RMSSE  ACF1
##   <chr>   <chr>   <chr>    <chr> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
## 1 STL+ETS Tasman… Cafes, … Test  -4.47  5.31  4.55 -9.09  9.26   NaN   NaN 0.714

DATA624

Jackie Yee

2026-03-08

8.1: Consider the the number of pigs slaughtered in Victoria, available in the aus_livestock dataset.

8.5

8.6

8.7

8.8

8.9