624-HW5

8.1

Consider the the number of pigs slaughtered in Victoria, available in the aus_livestock dataset.

a. Use the ETS() function to estimate the equivalent model for simple exponential smoothing. Find the optimal values of α and ℓ_0, and generate forecasts for the next four months.

b. Compute a 95% prediction interval for the first forecast using ^y±1.96s where s is the standard deviation of the residuals. Compare your interval with the interval produced by R.

Here I calculate the optimal alpha and initial alpha values: 0.32 and 100646.6. This means the most recent data is weighted at 32%.

# Load the dataset from fpp3
data("aus_livestock")

# Filter the data for pigs in Victoria
pigs <- aus_livestock %>%
  filter(Animal == "Pigs", State == "Victoria") %>%
  select(Month, Count)

pigs_ts <- ts(pigs$Count, start = c(1976, 7), frequency = 12)

fit <- ets(pigs_ts, model = "ANN")

alpha <- fit$par["alpha"]
l0 <- fit$states[1, "l"]

cat("Optimal alpha:", alpha, "\n")

## Optimal alpha: 0.3221247

cat("Initial level l0:", l0, "\n")

## Initial level l0: 100646.6

Here are the 4 month forecasts:

     Point Forecast    Lo 80    Hi 80    Lo 95    Hi 95

Jan 2023 95186.56 83200.06 107173.1 76854.79 113518.3 Feb 2023 95186.56 82593.52 107779.6 75927.17 114445.9 Mar 2023 95186.56 82014.88 108358.2 75042.22 115330.9 Apr 2023 95186.56 81460.61 108912.5 74194.54 116178.6

forecasts <- forecast(fit, h = 4)
print(forecasts)

##          Point Forecast    Lo 80    Hi 80    Lo 95    Hi 95
## Jan 2023       95186.56 83200.06 107173.1 76854.79 113518.3
## Feb 2023       95186.56 82593.52 107779.6 75927.17 114445.9
## Mar 2023       95186.56 82014.88 108358.2 75042.22 115330.9
## Apr 2023       95186.56 81460.61 108912.5 74194.54 116178.6

I got a 95% prediction interval of 76871.01 , 113502.1.

s <- sd(residuals(fit))
cat("Standard deviation of residuals:", s, "\n")

## Standard deviation of residuals: 9344.666

y_hat <- forecasts$mean[1]

lower_limit <- y_hat - 1.96 * s
upper_limit <- y_hat + 1.96 * s

cat("95% Prediction Interval: [", lower_limit, ",", upper_limit, "]\n")

## 95% Prediction Interval: [ 76871.01 , 113502.1 ]

The prediction interval calculated by R is: 76854.79 , 113518.3.

# R's prediction interval
r_lower_limit <- forecasts$lower[1, "95%"]
r_upper_limit <- forecasts$upper[1, "95%"]

cat("R's 95% Prediction Interval: [", r_lower_limit, ",", r_upper_limit, "]\n")

## R's 95% Prediction Interval: [ 76854.79 , 113518.3 ]

The two calculations are about the same, very similar results. The prediction interval is quite large and this means the predictions will be uncertain.

8.5

Data set global_economy contains the annual Exports from many countries. Select one country to analyse.

a. Plot the Exports series and discuss the main features of the data.

b. Use an ETS(A,N,N) model to forecast the series, and plot the forecasts.

c. Compute the RMSE values for the training data.

d. Compare the results to those from an ETS(A,A,N) model. (Remember that the trended model is using one more parameter than the simpler model.) Discuss the merits of the two forecasting methods for this data set.

e. Compare the forecasts from both methods. Which do you think is best?

f. Calculate a 95% prediction interval for the first forecast for each model, using the RMSE values and assuming normal errors. Compare your intervals with those produced using R.

help("global_economy")

This data contains yearly exports as a percent of GDP. I chose Australia.

data("global_economy")

australia_exports <- global_economy %>%
  filter(Country == "Australia") %>%
  select(Year, Exports)

head(australia_exports)

## # A tsibble: 6 x 2 [1Y]
##    Year Exports
##   <dbl>   <dbl>
## 1  1960    13.0
## 2  1961    12.4
## 3  1962    13.9
## 4  1963    13.0
## 5  1964    14.9
## 6  1965    13.2

autoplot(australia_exports, Exports) +
  labs(title = "Australian Exports (1960-2017)",
       y = "Exports (US$ billions)",
       x = "Year")

The forecast using ETS ANN is a plateau.

ets_ann_model <- australia_exports %>%
  model(ETS(Exports ~ error("A") + trend("N") + season("N")))

ets_ann_forecasts <- ets_ann_model %>%
  forecast(h = 10)  # Forecasting the next 10 years

autoplot(australia_exports, Exports) +
  autolayer(ets_ann_forecasts, level = NULL) +
  labs(title = "ETS(A,N,N) Model Forecasts for Australian Exports",
       y = "Exports (US$ billions)",
       x = "Year")

The RMSE is the sqrt of the mean of the squared residuals. The value I get for the ANN model is: 1.146794

ets_ann_residuals <- residuals(ets_ann_model)$.resid
ets_ann_rmse <- sqrt(mean(ets_ann_residuals^2, na.rm = TRUE))

cat("RMSE for ETS(A,N,N) model:", ets_ann_rmse, "\n")

## RMSE for ETS(A,N,N) model: 1.146794

The ETS AAN forecast shows us an upward trend.

# Fit an ETS(A,A,N) model
ets_aan_model <- australia_exports %>%
  model(ETS(Exports ~ error("A") + trend("A") + season("N")))

# Generate forecasts
ets_aan_forecasts <- ets_aan_model %>%
  forecast(h = 10)

# Plot the forecasts
autoplot(australia_exports, Exports) +
  autolayer(ets_aan_forecasts, level = NULL) +
  labs(title = "ETS(A,A,N) Model Forecasts for Australian Exports",
       y = "Exports (US$ billions)",
       x = "Year")

The RMSE value for the AAN model is: 1.116727. The lower RMSE indicates that the AAN model is better fitted to the data.

ets_aan_residuals <- residuals(ets_aan_model)$.resid
ets_aan_rmse <- sqrt(mean(ets_aan_residuals^2, na.rm = TRUE))

cat("RMSE for ETS(A,N,N) model:", ets_aan_rmse, "\n")

## RMSE for ETS(A,N,N) model: 1.116727

The forecast for the AAN model seems to be the better one. There is clearly an upward trend if you consider seasonality.

ETS(A,N,N) 95% Prediction Interval for first forecast: [ 18.35944 , 22.85488 ]

# First forecast value
y_hat_ann <- ets_ann_forecasts %>% filter(Year == max(Year)) %>% pull(.mean)

# Prediction interval
lower_ann <- y_hat_ann - 1.96 * ets_ann_rmse
upper_ann <- y_hat_ann + 1.96 * ets_ann_rmse

cat("ETS(A,N,N) 95% Prediction Interval for first forecast: [", lower_ann, ",", upper_ann, "]\n")

## ETS(A,N,N) 95% Prediction Interval for first forecast: [ 18.35944 , 22.85488 ]

R’s ETS(A,N,N) 95% Prediction Interval for first forecast: [ 18.3197 , 22.89462 ]

# Extract R's prediction intervals
#r_lower_ann1 <- ets_ann_forecasts %>% slice(1) %>% pull(PI_lower)
#r_upper_ann1 <- ets_ann_forecasts %>% slice(1) %>% pull(PI_upper)

#cat("R's ETS(A,N,N) 95% Prediction Interval for first forecast: [", r_lower_ann1, ",", r_upper_ann1, "]\n")

ETS(A,A,N) 95% Prediction Interval for first forecast: [ 19.8843 , 24.26187 ]

# First forecast value
y_hat_aan <- ets_aan_forecasts %>% filter(Year == max(Year)) %>% pull(.mean)

# Prediction interval
lower_aan <- y_hat_aan - 1.96 * ets_aan_rmse
upper_aan <- y_hat_aan + 1.96 * ets_aan_rmse

cat("ETS(A,A,N) 95% Prediction Interval for first forecast: [", lower_aan, ",", upper_aan, "]\n")

## ETS(A,A,N) 95% Prediction Interval for first forecast: [ 19.8843 , 24.26187 ]

R’s ETS(A,A,N) 95% Prediction Interval for first forecast: [ 18.30981 , 25.83636 ]

# R's prediction intervals
#r_lower_aan1 <- ets_aan_forecasts %>% filter(Year == max(Year)) %>% pull(PI_lower)
#r_upper_aan1 <- ets_aan_forecasts %>% filter(Year == max(Year)) %>% pull(PI_upper)

#cat("R's ETS(A,A,N) 95% Prediction Interval for first forecast: [", r_lower_aan1, ",", r_upper_aan1, "]\n")

8.6

Forecast the Chinese GDP from the global_economy data set using an ETS model.

Experiment with the various options in the ETS() function to see how much the forecasts change with damped trend, or with a Box-Cox transformation. Try to develop an intuition of what each is doing to the forecasts.

data("global_economy")
# Filter data for China and select Year and GDP
china_gdp <- global_economy %>%
  filter(Country == "China") %>%
  select(Year, GDP)

# Check the first few rows
head(china_gdp)

## # A tsibble: 6 x 2 [1Y]
##    Year          GDP
##   <dbl>        <dbl>
## 1  1960 59716467625.
## 2  1961 50056868958.
## 3  1962 47209359006.
## 4  1963 50706799903.
## 5  1964 59708343489.
## 6  1965 70436266147.

# Plot the GDP time series
autoplot(china_gdp, GDP) +
  labs(title = "Chinese GDP (1960-2019)",
       y = "GDP (Billion USD)",
       x = "Year") +
  theme_minimal()

horizon <- 30
# Fit ETS(A, N, N) model
ets_ann_model <- china_gdp %>%
  model(ETS_AN_NN = ETS(GDP ~ error("A") + trend("N") + season("N")))

# Summary of the model
report(ets_ann_model)

## Series: GDP 
## Model: ETS(A,N,N) 
##   Smoothing parameters:
##     alpha = 0.9999 
## 
##   Initial states:
##         l[0]
##  67537060018
## 
##   sigma^2:  1.79001e+23
## 
##      AIC     AICc      BIC 
## 3344.888 3345.332 3351.069

# Generate forecasts
forecast_ann <- ets_ann_model %>%
  forecast(h = horizon) %>%
  mutate(PI = hilo(GDP, level = 95)) %>%
  unpack_hilo(PI)

# Plot the forecast
autoplot(china_gdp, GDP) +
  autolayer(forecast_ann, .mean, colour = "blue") +
  autolayer(forecast_ann, lower, colour = "lightblue", linetype = "dashed") +
  autolayer(forecast_ann, upper, colour = "lightblue", linetype = "dashed") +
  labs(title = "Forecast of Chinese GDP using ETS(A, N, N)",
       y = "GDP (Billion USD)",
       x = "Year") +
  theme_minimal() +
  scale_colour_manual(values = c("blue", "lightblue")) +
  guides(colour = guide_legend(title = "Model"))

# Fit ETS(A, A(damped), N) model
ets_aad_damped_model <- china_gdp %>%
  model(ETS = ETS(GDP ~ error("A") + trend("Ad") + season("N")))

# Summary of the model
report(ets_aad_damped_model)

## Series: GDP 
## Model: ETS(A,Ad,N) 
##   Smoothing parameters:
##     alpha = 0.9998747 
##     beta  = 0.5634078 
##     phi   = 0.9799999 
## 
##   Initial states:
##         l[0]       b[0]
##  50284778075 3288256683
## 
##   sigma^2:  3.959258e+22
## 
##      AIC     AICc      BIC 
## 3260.187 3261.834 3272.549

# Generate forecasts
forecast_aad_damped <- ets_aad_damped_model %>%
  forecast(h = horizon) %>%
  mutate(PI = hilo(GDP, level = 95)) %>%
  unpack_hilo(PI)

# Plot the forecast
autoplot(china_gdp, GDP) +
  autolayer(forecast_aad_damped, .mean, colour = "red") +
  autolayer(forecast_aad_damped, lower, colour = "pink", linetype = "dashed") +
  autolayer(forecast_aad_damped, upper, colour = "pink", linetype = "dashed") +
  labs(title = "Forecast of Chinese GDP using ETS(A, A(damped), N)",
       y = "GDP (Billion USD)",
       x = "Year") +
  theme_minimal() +
  scale_colour_manual(values = c("red", "pink")) +
  guides(colour = guide_legend(title = "Model"))

8.7

Find an ETS model for the Gas data from aus_production and forecast the next few years. Why is multiplicative seasonality necessary here? Experiment with making the trend damped. Does it improve the forecasts?

data("aus_production")

head(aus_production)

## # A tsibble: 6 x 7 [1Q]
##   Quarter  Beer Tobacco Bricks Cement Electricity   Gas
##     <qtr> <dbl>   <dbl>  <dbl>  <dbl>       <dbl> <dbl>
## 1 1956 Q1   284    5225    189    465        3923     5
## 2 1956 Q2   213    5178    204    532        4436     6
## 3 1956 Q3   227    5297    208    561        4806     7
## 4 1956 Q4   308    5681    197    570        4418     6
## 5 1957 Q1   262    5577    187    529        4339     5
## 6 1957 Q2   228    5651    214    604        4811     7

gas_ts <- aus_production %>% select(Quarter, Gas)

gas_ts %>%
  autoplot(Gas) +
  labs(title = "Australian Gas Production",
       y = "Gas Production (PJ)",
       x = "Year") +
  theme_minimal()

ets_model <- gas_ts %>%
  model(ETS(Gas ~ error("A") + trend("A") + season("M")))

report(ets_model)

## Series: Gas 
## Model: ETS(A,A,M) 
##   Smoothing parameters:
##     alpha = 0.6132245 
##     beta  = 0.007863848 
##     gamma = 0.2244134 
## 
##   Initial states:
##      l[0]      b[0]      s[0]    s[-1]    s[-2]     s[-3]
##  3.620487 0.6098927 0.9796171 1.147619 1.075495 0.7972686
## 
##   sigma^2:  18.2237
## 
##      AIC     AICc      BIC 
## 1816.462 1817.328 1846.923

forecast_ets <- ets_model %>%
  forecast(h = "2 years")

forecast_ets %>%
  autoplot(gas_ts) +
  labs(title = "ETS Forecast for Australian Gas Production",
       y = "Gas Production (PJ)",
       x = "Year") +
  theme_minimal()

# Fit ETS model with damped trend
ets_damped <- gas_ts %>%
  model(ETS(Gas ~ error("A") + trend("Ad") + season("M")))

# View the model summary
report(ets_damped)

## Series: Gas 
## Model: ETS(A,Ad,M) 
##   Smoothing parameters:
##     alpha = 0.6100606 
##     beta  = 0.02485618 
##     gamma = 0.2225287 
##     phi   = 0.98 
## 
##   Initial states:
##      l[0]        b[0]      s[0]    s[-1]    s[-2]     s[-3]
##  5.643513 -0.06179413 0.9254594 1.060244 1.124238 0.8900577
## 
##   sigma^2:  18.5452
## 
##      AIC     AICc      BIC 
## 1821.235 1822.297 1855.080

# Forecast using the damped trend model
forecast_damped <- ets_damped %>%
  forecast(h = "2 years")

# Plot the final forecast
forecast_damped %>%
  autoplot(gas_ts) +
  labs(title = "ETS Forecast with Damped Trend for Australian Gas Production",
       y = "Gas Production (PJ)",
       x = "Year") +
  theme_minimal()

The non-dampened model is a little bit better with a lower RMSE and lower AIC.

# Compare AIC
glance(ets_model) %>% select(.model, AIC)

## # A tibble: 1 × 2
##   .model                                                     AIC
##   <chr>                                                    <dbl>
## 1 "ETS(Gas ~ error(\"A\") + trend(\"A\") + season(\"M\"))" 1816.

glance(ets_damped) %>% select(.model, AIC)

## # A tibble: 1 × 2
##   .model                                                      AIC
##   <chr>                                                     <dbl>
## 1 "ETS(Gas ~ error(\"A\") + trend(\"Ad\") + season(\"M\"))" 1821.

# Compare forecast accuracy on the training set
accuracy(ets_model) %>% select(.model, RMSE, MAE, MAPE)

## # A tibble: 1 × 4
##   .model                                                    RMSE   MAE  MAPE
##   <chr>                                                    <dbl> <dbl> <dbl>
## 1 "ETS(Gas ~ error(\"A\") + trend(\"A\") + season(\"M\"))"  4.19  2.84  5.03

accuracy(ets_damped) %>% select(.model, RMSE, MAE, MAPE)

## # A tibble: 1 × 4
##   .model                                                     RMSE   MAE  MAPE
##   <chr>                                                     <dbl> <dbl> <dbl>
## 1 "ETS(Gas ~ error(\"A\") + trend(\"Ad\") + season(\"M\"))"  4.22  2.81  4.11

8.8

Recall your retail time series data (from Exercise 7 in Section 2.10).

set.seed(12345678)

myseries <- aus_retail |>

#### filter(Series ID == sample(aus_retail$Series ID,1))

a. Why is multiplicative seasonality necessary for this series?

b. Apply Holt-Winters’ multiplicative method to the data. Experiment with making the trend damped.

c. Compare the RMSE of the one-step forecasts from the two methods. Which do you prefer?

d. Check that the residuals from the best method look like white noise.

e. Now find the test set RMSE, while training the model to the end of 2010. Can you beat the seasonal naïve approach from Exercise 7 in Section 5.11?

Based on the plot below, it looks like the seasonal variation is increasing as the level increases- the seasonality is multiplicative. We need to consider this change when we forecast.

# Set seed for reproducibility
set.seed(12345678)

# Select a random retail series
myseries <- aus_retail |>
  filter(`Series ID` == sample(aus_retail$`Series ID`, 1))

# View the selected series
print(myseries)

## # A tsibble: 369 x 5 [1M]
## # Key:       State, Industry [1]
##    State              Industry                     `Series ID`    Month Turnover
##    <chr>              <chr>                        <chr>          <mth>    <dbl>
##  1 Northern Territory Clothing, footwear and pers… A3349767W   1988 Apr      2.3
##  2 Northern Territory Clothing, footwear and pers… A3349767W   1988 May      2.9
##  3 Northern Territory Clothing, footwear and pers… A3349767W   1988 Jun      2.6
##  4 Northern Territory Clothing, footwear and pers… A3349767W   1988 Jul      2.8
##  5 Northern Territory Clothing, footwear and pers… A3349767W   1988 Aug      2.9
##  6 Northern Territory Clothing, footwear and pers… A3349767W   1988 Sep      3  
##  7 Northern Territory Clothing, footwear and pers… A3349767W   1988 Oct      3.1
##  8 Northern Territory Clothing, footwear and pers… A3349767W   1988 Nov      3  
##  9 Northern Territory Clothing, footwear and pers… A3349767W   1988 Dec      4.2
## 10 Northern Territory Clothing, footwear and pers… A3349767W   1989 Jan      2.7
## # ℹ 359 more rows

# Plot the selected retail series
myseries %>%
  autoplot(Turnover) +
  labs(title = "Selected Australian Retail Series",
       y = "Retail Volume",
       x = "Year") +
  theme_minimal()

Modeling using holt-winters + with damped

# Fit ETS model with multiplicative seasonality and additive trend
hw_model <- myseries %>%
  model(
    HW = ETS(Turnover ~ error("A") + trend("A") + season("M"))
  )

# View the model summary
report(hw_model)

## Series: Turnover 
## Model: ETS(A,A,M) 
##   Smoothing parameters:
##     alpha = 0.5496504 
##     beta  = 0.000100498 
##     gamma = 0.2016824 
## 
##   Initial states:
##      l[0]       b[0]      s[0]     s[-1]     s[-2]    s[-3]     s[-4]    s[-5]
##  2.352811 0.03340494 0.8126332 0.7982999 0.7748942 1.322646 0.9919646 1.059222
##     s[-6]    s[-7]    s[-8]    s[-9]    s[-10]    s[-11]
##  1.004043 1.137424 1.208147 1.022139 0.9978448 0.8707408
## 
##   sigma^2:  0.3757
## 
##      AIC     AICc      BIC 
## 1837.503 1839.247 1903.987

# Fit ETS model with multiplicative seasonality and damped trend
hw_damped <- myseries %>%
  model(
    HW_Damped = ETS(Turnover ~ error("A") + trend("Ad") + season("M"))
  )

# View the model summary
report(hw_damped)

## Series: Turnover 
## Model: ETS(A,Ad,M) 
##   Smoothing parameters:
##     alpha = 0.5562002 
##     beta  = 0.0001072746 
##     gamma = 0.2072605 
##     phi   = 0.8441225 
## 
##   Initial states:
##      l[0]        b[0]      s[0]     s[-1]     s[-2]    s[-3]     s[-4]    s[-5]
##  2.639056 -0.05159711 0.8027897 0.8055812 0.8195859 1.307415 0.9718913 1.063301
##     s[-6]    s[-7]    s[-8]    s[-9]    s[-10]    s[-11]
##  1.026827 1.095723 1.216473 1.050848 0.9988038 0.8407616
## 
##   sigma^2:  0.3805
## 
##      AIC     AICc      BIC 
## 1843.129 1845.084 1913.524

1 HW 0.600 2 HW_Damped 0.602

The un-damped model has a slightly lower RMSE, making it technically the better model.

# Extract RMSE for both models
rmse_comparison <- myseries %>%
  model(
    HW = ETS(Turnover ~ error("A") + trend("A") + season("M")),
    HW_Damped = ETS(Turnover ~ error("A") + trend("Ad") + season("M"))
  ) %>%
  accuracy() %>%
  select(.model, RMSE)

print(rmse_comparison)

## # A tibble: 2 × 2
##   .model     RMSE
##   <chr>     <dbl>
## 1 HW        0.600
## 2 HW_Damped 0.602

The plots look good, the biggest indicator being the histogram- it shows a fairly normal distribution of the residuals.

# Residual diagnostics for HW_Damped model
hw_model %>%
  gg_tsresiduals()