Exponential Smoothing

1

Consider the the number of pigs slaughtered in Victoria, available in the aus_livestock dataset.

Use the ETS() function to estimate the equivalent model for simple exponential smoothing. Find the optimal values of \(\alpha_0\) and \(l_0\), and generate forecasts for the next four months.

\(\alpha = 0.322\) \(l_0 = 100646.6\)

# dividing count by 1000
vic_pig <- aus_livestock %>%
  filter(
    Animal == 'Pigs',
    State == 'Victoria'
  ) %>%
  mutate(Count = Count / 1000)

fit <-  vic_pig %>%
  model(ETS(Count ~ error('A') + trend('N') + season('N')))

fc <- fit %>%
  forecast(h = 4) 

report(fit)

## Series: Count 
## Model: ETS(A,N,N) 
##   Smoothing parameters:
##     alpha = 0.3219175 
## 
##   Initial states:
##      l[0]
##  100.4949
## 
##   sigma^2:  87.4807
## 
##      AIC     AICc      BIC 
## 6028.040 6028.084 6041.013

Compute a 95% prediction interval for the first forecast using \(\hat y \pm 1.96s\) where \(s\) is the standard deviation of the residuals. Compare your interval with the interval produced by R.

The lower bound of residuals is -14.22 and upper bound 14.63 in thousands. This supports the confidence interval produced by R.

x <- residuals(fit)

quantile(x$.resid, .05)

##        5% 
## -14.22564

quantile(x$.resid, .95)

##      95% 
## 14.63187

fc %>% 
  autoplot(subset(vic_pig, Animal == 'Pigs' & State == 'Victoria')) +
  geom_line(aes(y = .fitted), col = '#D55E00',
              data = augment(fit))

5

Data set global_economy contains the annual Exports from many countries. Select one country to analyse.

Plot the Exports series and discuss the main features of the data.

The trend, seasonal, and cyclical patterns are not strong.

argentina_economy <- global_economy %>%
  filter(Country == 'Argentina') 

argentina_economy %>% 
  autoplot(Exports)

Use an ETS(A,N,N) model to forecast the series, and plot the forecasts.

fit <- argentina_economy %>%
  model(ETS(Exports ~ error('A') + trend('N') + season('N'))) 

fc <- fit %>%
  forecast(h = 5)

fc %>%
  autoplot(argentina_economy) +
  geom_line(aes(y = .fitted), col="#D55E00",
            data = augment(fit)) +
  labs(y="% of GDP", title="Exports: Argentina") +
  guides(colour = "none")

Compute the RMSE values for the training data.

(rmse1 <- argentina_economy %>%
  stretch_tsibble(.init = 5) %>%
  model(ETS(Exports ~ error('A') + trend('N') + season('N'))) %>%
  forecast(h = 1) %>%
  accuracy(argentina_economy) %>%
  select(RMSE))

## Warning: The future dataset is incomplete, incomplete out-of-sample data will be treated as missing. 
## 1 observation is missing at 2018

## # A tibble: 1 x 1
##    RMSE
##   <dbl>
## 1  2.99

Compare the results to those from an ETS(A,A,N) model. (Remember that the trended model is using one more parameter than the simpler model.) Discuss the merits of the two forecasting methods for this data set.

fit2 <- argentina_economy %>%
  model(ETS(Exports ~ error('A') + trend('A') + season('N'))) 

fc2 <- fit2 %>%
  forecast(h = 5)

fc2 %>%
  autoplot(argentina_economy) +
  geom_line(aes(y = .fitted), col="#D55E00",
            data = augment(fit2)) +
  labs(y="% of GDP", title="Exports: Argentina") +
  guides(colour = "none")

Compare the forecasts from both methods. Which do you think is best?

This change does not have an impact on the forecast as there is no apparant trend pattern.

Calculate a 95% prediction interval for the first

exports <- fit %>%
  augment() %>%
  select(Exports)

quantile(exports$Exports, 0.05)

##       5% 
## 5.595889

quantile(exports$Exports, 0.95)

##      95% 
## 23.33614

forecast for each model, using the RMSE values and assuming normal errors. Compare your intervals with those produced using R.

6

Forecast the Chinese GDP from the global_economy data set using an ETS model. Experiment with the various options in the ETS() function to see how much the forecasts change with damped trend, or with a Box-Cox transformation. Try to develop an intuition of what each is doing to the forecasts.

[Hint: use a relatively large value of h when forecasting, so you can clearly see the differences between the various options when plotting the forecasts.]

# China GDP in Billions
china_economy <- global_economy %>%
  select(GDP) %>%
  filter(Country == 'China') %>%
  mutate(GDP = GDP / 1e9)
  
# lambda for boxcox transformation
lambda_gdp <- china_economy %>%
  features(GDP, features = guerrero) %>%
  pull(lambda_guerrero)

# models
china_economy %>%
  model(
    Holt = ETS(GDP ~ error('A') + trend('A') + season('N')),
    Damped = ETS(GDP ~ error('A') + trend('Ad') + season('N')),
    boxcox = ETS(box_cox(GDP, lambda_gdp) ~ error('A') + trend('A') + season('N'))
  ) %>%
  forecast(h = 5)%>%
  autoplot(china_economy)

7

Find an ETS model for the Gas data from aus_production and forecast the next few years. Why is multiplicative seasonality necessary here? Experiment with making the trend damped. Does it improve the forecasts?

There is an apparant seasonal pattern therefore a multiplicative approach is necessary. The damped trend and the non damped trend have a very similary forecast.

# using ETS framwork to create model 
aus_production %>%
  model(ETS(Gas)) %>%
  report()

## Series: Gas 
## Model: ETS(M,A,M) 
##   Smoothing parameters:
##     alpha = 0.6528545 
##     beta  = 0.1441675 
##     gamma = 0.09784922 
## 
##   Initial states:
##      l[0]       b[0]      s[0]    s[-1]    s[-2]     s[-3]
##  5.945592 0.07062881 0.9309236 1.177883 1.074851 0.8163427
## 
##   sigma^2:  0.0032
## 
##      AIC     AICc      BIC 
## 1680.929 1681.794 1711.389

fit <- aus_production %>%
  model(
    MAM = ETS(Gas ~ error('M') + trend('A') + season('M')),
    MAdM = ETS(Gas ~ error('M') + trend('Ad') + season('M'))
  )

fit %>%
  forecast(h = 12) %>%
  autoplot(aus_production, level= NULL)

8

Recall your retail time series data (from Exercise 8 in Section 2.10).

Why is multiplicative seasonality necessary for this series?

There is an apparant seasonal pattern therefore a multiplicative approach is necessary. The damped trend and the non damped trend have a very similary forecast.

set.seed(614)
myseries <- aus_retail %>%
  filter(`Series ID` == sample(aus_retail$`Series ID`,1))

myseries %>%
  autoplot()

## Plot variable not specified, automatically selected `.vars = Turnover`

myseries %>% 
  model(ETS(Turnover)) %>% 
  report()

## Series: Turnover 
## Model: ETS(M,A,M) 
##   Smoothing parameters:
##     alpha = 0.3727079 
##     beta  = 0.003201077 
##     gamma = 0.1329782 
## 
##   Initial states:
##      l[0]      b[0]      s[0]     s[-1]    s[-2]    s[-3]    s[-4]    s[-5]
##  11.78114 0.2100095 0.9765103 0.9372128 0.971017 1.167349 1.021502 1.018298
##      s[-6]     s[-7]    s[-8]     s[-9]   s[-10]    s[-11]
##  0.9568703 0.9733818 1.006006 0.9811837 0.990874 0.9997952
## 
##   sigma^2:  0.0013
## 
##      AIC     AICc      BIC 
## 3448.720 3450.166 3518.233

Apply Holt-Winters’ multiplicative method to the data. Experiment with making the trend damped.

fit <- myseries %>%
  model(
    MAdM = ETS(Turnover ~ error('M') + trend('Ad') + season('M'))
  )

fc <- fit %>% forecast(h = 12) 
fc %>%
  autoplot(myseries)

Compare the RMSE of the one-step forecasts from the two methods. Which do you prefer?

The damped trend performed better in this case.

fc1 <- myseries %>%
  filter(year(Month) < 2018) %>%
  model(
    MAM = ETS(Turnover ~ error('M') + trend('A') + season('M'))
  ) %>%
  forecast(h = 12) 
  
fc1 %>%
  accuracy(myseries) %>%
  select(RMSE)

## # A tibble: 1 x 1
##    RMSE
##   <dbl>
## 1  5.14

fc2 <- myseries %>%
  filter(year(Month) < 2018) %>%
  model(
    MAdM = ETS(Turnover ~ error('M') + trend('Ad') + season('M'))
  ) %>%
  forecast(h = 1) 

fc2 %>%
  accuracy(myseries) %>%
  select(RMSE)

## Warning: 1 error encountered
## [1] subscript out of bounds

## # A tibble: 1 x 1
##    RMSE
##   <dbl>
## 1  4.30

Check that the residuals from the best method look like white noise.

myseries %>% 
  model(damped = ETS(Turnover ~ error('M') + trend('Ad') + season('M'))) %>%
  gg_tsresiduals()

Now find the test set RMSE, while training the model to the end of 2010. Can you beat the seasonal naïve approach from Exercise 7 in Section 5.11?

The new Seasonal Multiplicative model has a superior RMSE than the NAIVE model approach.

myseries_train <- myseries %>%
  filter(year(Month) < 2011)

fit1 <- myseries_train %>%
  model(ETS = ETS(Turnover)) 

fc1 <- fit1 %>%
  forecast(new_data = anti_join(myseries, myseries_train)) %>%
  accuracy(myseries)

## Joining, by = c("State", "Industry", "Series ID", "Month", "Turnover")

fc1$RMSE

## [1] 25.19869

fit2 <- myseries_train %>%
  model(SNAIVE(Turnover)) 

fc2 <- fit2 %>%
  forecast(new_data = anti_join(myseries, myseries_train)) %>%
  accuracy(myseries)

## Joining, by = c("State", "Industry", "Series ID", "Month", "Turnover")

fc2$RMSE

## [1] 44.88057

9

For the same retail data, try an STL decomposition applied to the Box-Cox transformed series, followed by ETS on the seasonally adjusted data. How does that compare with your best previous forecasts on the test set?

lambda_turnover <- myseries_train %>%
  features(Turnover, features = guerrero) %>%
  pull(lambda_guerrero)

myseries_train2 <- myseries_train %>%
  mutate(Turnover = box_cox(myseries_train$Turnover, lambda_turnover))

dcmp <- myseries_train2 %>%
  model(STL = STL(Turnover))

temp <- components(dcmp) %>%
  select(season_adjust) %>%
  rename('Turnover' = 'season_adjust')

fit <- components(dcmp) %>%
  model(ETS = ETS(season_adjust ~ error('M') + trend('A') + season('M')))
#     
# fit %>%
#   forecast(new_data = anti_join(myseries2, myseries_train)) %>%
#   accuracy(myseries)

DATA624 - HW5

Diego Correa

10/10/2021

Exponential Smoothing

1

5

6

7

8

9