Chapter 8: Exponential Smoothing

Author

Griffin Lessinger

Introduction

Chapter 8 from the text (Forecasting: Principles and Practice) is all about exponential smoothing as a cross between naive forecasting and using the unweighted mean for forecasts. We will look at two problems from the text relating to exponential smoothing and optimizing the given models for forecasts.

1. Pigs from aus_livestock

The data from aus_livestock is a time series dataset that tracks the estimated counts of various livestock animals within different Australian regions from 1972 to 2018, on a monthly basis. We will consider only the subset of aus_livestock that tracks pigs within Victoria:

Code
# Filter big dataset for subset of pigs in victoria
pigs <- aus_livestock |>
  filter(Animal == "Pigs",
         State == "Victoria")

# Show subset preview
pigs[1:10, ]
# A tsibble: 10 x 4 [1M]
# Key:       Animal, State [1]
      Month Animal State     Count
      <mth> <fct>  <fct>     <dbl>
 1 1972 Jul Pigs   Victoria  94200
 2 1972 Aug Pigs   Victoria 105700
 3 1972 Sep Pigs   Victoria  96500
 4 1972 Oct Pigs   Victoria 117100
 5 1972 Nov Pigs   Victoria 104600
 6 1972 Dec Pigs   Victoria 100500
 7 1973 Jan Pigs   Victoria  94700
 8 1973 Feb Pigs   Victoria  93900
 9 1973 Mar Pigs   Victoria  93200
10 1973 Apr Pigs   Victoria  78000
Code
# Show plot of series, for fun
pigs |> autoplot(Count) + labs(title = "Victorian pig counts over time")

(a). Using ETS(), estimate an equivalent model for simple exponential smoothing, generate four monthly forecasts

Code
# Use ETS() to find model
pigs.fit <- pigs |>
  model(ETS(Count ~ error("A") + trend("N") + season("N")))

# Make model report to get parameters
report(pigs.fit)
Series: Count 
Model: ETS(A,N,N) 
  Smoothing parameters:
    alpha = 0.3221247 

  Initial states:
     l[0]
 100646.6

  sigma^2:  87480760

     AIC     AICc      BIC 
13737.10 13737.14 13750.07 

For a simple exponential smoothing model, we get \(\alpha \approx 0.322\), \(l_0 \approx 100646.6\).

Code
# Generate forecasts
pigs.preds <- pigs.fit |>
  forecast(h = 48)

# Plot forecasts
pigs.preds |>
  autoplot(pigs) +
  geom_line(
    data = augment(pigs.fit),
    aes(y = .fitted),
    col = "orange2"
  ) +
  labs(
    title = "Victorian pig counts over time",
    subtitle = "Simple exponential smoothing model"
  )

The forecasts are “flat”, which is expected (as explained in the text). This is because the forecasts rely on no overall trend or seasonality, but instead, a weighted average of previous terms (with weighting parameters given above).

(b). Compute a 95% prediction interval for the first forecast using \(\hat{y} \pm 1.96s\) where \(s\) is the standard deviation of the residuals

The residuals of the model are given by:

Code
resids <- augment(pigs.fit)$.resid

We can easily compute the standard deviation of the residuals, then the confidence interval, like so:

Code
# Standard deviation of population
stdev <- function(x) {
  # Compute mean
  vmean <- sum(x)/length(x)
  
  # Compute sum of squared deviations
  sqdev <- sum((x - vmean)**2)
  
  # Divide by length of x, take square root
  return (sqrt(sqdev/length(x)))
}

# Standard deviation of residuals
s <- stdev(resids)

# Using y-hat as first predicted value of 95186.56
yhat <- 95186.56
ci <- c(yhat - 1.96*s, yhat + 1.96*s)
print(
  data.frame(
    Intervals = c(paste0("95% CI (mine): ", "(", trunc(ci[1]), ", ", trunc(ci[2]), ")"),
           "95% CI (by R): (76854, 113518)")
  )
)
                       Intervals
1 95% CI (mine): (76887, 113485)
2 95% CI (by R): (76854, 113518)

Pretty close! The slight difference is easily explainable by rounding errors. I used 1.96 as the z-score, R probably used a slightly more precise value.

2. Data from global_economy

The data from global_economy is a collection of time series that tracks economic metrics for many countries, yearly, from 1960 to 2017. We’ll examine the subset that focuses on exports (as % of GDP) from Mexico.

(a). Plot the Exports series

Here is both a sample of the data and a plot (from 1960 to 2017):

Code
# Filter big dataset
mex <- global_economy |>
  filter(Country == "Mexico") |>
  select(Year, Exports)

# Show sample of data
mex[1:10, ]
# A tsibble: 10 x 2 [1Y]
    Year Exports
   <dbl>   <dbl>
 1  1960    8.51
 2  1961    8.41
 3  1962    8.57
 4  1963    8.32
 5  1964    7.63
 6  1965    7.63
 7  1966    7.47
 8  1967    6.87
 9  1968    7.04
10  1969    7.55
Code
# Show plot of data
mex |>
  autoplot(Exports) +
  labs(title = "Annual exports from Mexico", y = "% of GDP")

We can see a general steep increase from 1960 onward, only dipping from seemingly random fluctuation and a harsh decline in the late 1980s - early 1990s. Total increase in % of GDP is about 30% across all years. Note, the data does not appear seasonal, but there is a strong trend! We will have to take that into account when using the exponential smoothing models.

(b). Use an ETS(A,N,N) model to forecast the series, and plot the forecasts

Code
# Train ETS(A,N,N) model
mex.fit.ANN <- mex |>
  model(ETS(Exports ~ error("A") + trend("N") + season("N")))

# Generate model report
report(mex.fit.ANN)
Series: Exports 
Model: ETS(A,N,N) 
  Smoothing parameters:
    alpha = 0.9998997 

  Initial states:
     l[0]
 8.503378

  sigma^2:  4.8073

     AIC     AICc      BIC 
330.5385 330.9829 336.7198 
Code
# Generate forecasts
mex.preds.ANN <- mex.fit.ANN |>
  forecast(h = 5)

# Plot forecasts
mex.preds.ANN |>
  autoplot(mex) +
  labs(
    title = "Annual exports from Mexico",
    subtitle = "Simple exponential smoothing model",
    y = "% of GDP"
  )

That \(\alpha \approx 0.999\) value is extremely high, suggesting that this model would really like to just be a naive model. This makes sense; the naive model gives the highest possible prediction value (as the series ends at the highest \(y\) value present in the data), and there is clearly a positive trend in the data that we are not allowing the model to use. So in doing the best that it can, in this case, it picks the naive model.

(c). Compute the RMSE values for the training data

Code
# Get residuals from model
resids <- augment(mex.fit.ANN)$.resid

# Square residuals
sqresids <- resids**2

# Get sqrt of mean of squared residuals
rmse <- sqrt(mean(sqresids))

# Show
print(paste0("RMSE: ", trunc(100*rmse)/100))
[1] "RMSE: 2.15"

(d + e). Compare the results to those from an ETS(A,A,N) model

Code
mex.fit.AAN <- mex |>
  model(ETS(Exports ~ error("A") + trend("A") + season("N")))

# Generate model report
report(mex.fit.AAN)
Series: Exports 
Model: ETS(A,A,N) 
  Smoothing parameters:
    alpha = 0.9998999 
    beta  = 0.0001000498 

  Initial states:
     l[0]      b[0]
 8.449858 0.5158926

  sigma^2:  4.7096

     AIC     AICc      BIC 
331.2385 332.3923 341.5407 
Code
# Generate forecasts
mex.preds.AAN <- mex.fit.AAN |>
  forecast(h = 5)

# Plot forecasts
mex.preds.AAN |>
  autoplot(mex) +
  labs(
    title = "Annual exports from Mexico",
    subtitle = "ETS(A,A,N) model",
    y = "% of GDP"
  )

Code
# Get residuals from model
resids <- augment(mex.fit.AAN)$.resid

# Square residuals
sqresids <- resids**2

# Get sqrt of mean of squared residuals
rmse <- sqrt(mean(sqresids))

# Show
print(paste0("RMSE: ", trunc(100*rmse)/100))
[1] "RMSE: 2.09"

This second model captures the trend of the data much better, as we are actually including a trend component this time. The second model uses Holt’s method which adds a second trending parameter to the recursive relation defining the level equation. As such, it is notably more complicated and less interpretable than the simpler ETS(A,N,N) model.

However, it does a better job overall, with a lower RMSE and actual trend-capturing-ability where there clearly is a trend. We can see the trend preserved in the plotted forecasts of the ETS(A,N,N) model, it looks almost like the trend-based RW() model that we have seen in earlier chapters.

(f). Calculate a 95% prediction interval for the first forecast for each model, using the found RMSE values

Code
# Get RMSE values from before
rmse.AAN <- 2.15
rmse.ANN <- 2.09

# Use standard deviation ~ RMSE (as errors are assumed normal, plus mean of 0)
ci.AAN <- c(38.38224 - 1.96*rmse.AAN, 38.38224 + 1.96*rmse.AAN)
ci.ANN <- c(37.86634 - 1.96*rmse.ANN, 37.86634 + 1.96*rmse.ANN)

print(
  data.frame(
    Intervals = c(
      paste0("95% CI of ETS(A,A,N) (mine): ", "(", trunc(10*ci.AAN[1])/10, ", ", trunc(10*ci.AAN[2])/10, ")"),
      "95% CI of ETS(A,A,N) (by R): (34.1, 42.6)",
      paste0("95% CI of ETS(A,N,N) (mine): ", "(", trunc(10*ci.ANN[1])/10, ", ", trunc(10*ci.ANN[2])/10, ")"),
      "95% CI of ETS(A,N,N) (by R): (33.6, 42.2)"
      )
  )
)
                                  Intervals
1 95% CI of ETS(A,A,N) (mine): (34.1, 42.5)
2 95% CI of ETS(A,A,N) (by R): (34.1, 42.6)
3 95% CI of ETS(A,N,N) (mine): (33.7, 41.9)
4 95% CI of ETS(A,N,N) (by R): (33.6, 42.2)

Quite close again!