suppressWarnings({
library(fpp3)
library(dplyr)
library(tidyr)
library(fable)
library(fabletools)
})
## Registered S3 method overwritten by 'tsibble':
## method from
## as_tibble.grouped_df dplyr
## ── Attaching packages ──────────────────────────────────────────── fpp3 1.0.0 ──
## ✔ tibble 3.2.1 ✔ tsibble 1.1.5
## ✔ dplyr 1.1.2 ✔ tsibbledata 0.4.1
## ✔ tidyr 1.3.0 ✔ feasts 0.3.2
## ✔ lubridate 1.9.2 ✔ fable 0.3.4
## ✔ ggplot2 3.5.1 ✔ fabletools 0.4.2
## ── Conflicts ───────────────────────────────────────────────── fpp3_conflicts ──
## ✖ lubridate::date() masks base::date()
## ✖ dplyr::filter() masks stats::filter()
## ✖ tsibble::intersect() masks base::intersect()
## ✖ tsibble::interval() masks lubridate::interval()
## ✖ dplyr::lag() masks stats::lag()
## ✖ tsibble::setdiff() masks base::setdiff()
## ✖ tsibble::union() masks base::union()
Consider the number of pigs slaughtered in Victoria, available in the aus_livestock dataset.
victoria_pigs <- aus_livestock %>%
filter(Animal == "Pigs") %>%
filter(State == "Victoria")
Use the ETS() function to estimate the equivalent model for simple exponential smoothing. Find the optimal values of \(\alpha\) and \(\ell_0\), and generate forecasts for the next four months.
victorian_pigs_fit <- victoria_pigs %>%
model(ETS(Count ~ error("A") + trend("N") + season("N")))
tidy(victorian_pigs_fit)
## # A tibble: 2 × 5
## Animal State .model term estimate
## <fct> <fct> <chr> <chr> <dbl>
## 1 Pigs Victoria "ETS(Count ~ error(\"A\") + trend(\"N\") + sea… alpha 3.22e-1
## 2 Pigs Victoria "ETS(Count ~ error(\"A\") + trend(\"N\") + sea… l[0] 1.01e+5
victorian_pigs_forecast <- victorian_pigs_fit %>%
forecast(h = 4)
victorian_pigs_forecast %>%
autoplot(victoria_pigs) +
geom_line(aes(y = .fitted), col="#D55E00",
data = augment(victorian_pigs_fit)) +
labs(y="Count", title="Forecast: Pigs Slaughtered in Victoria") +
guides(colour = "none")
Compute a 95% prediction interval for the first forecast using \(\hat{y} \pm 1.96 s\), where \(s\) is the standard deviation of the residuals. Compare your interval with the interval produced by R.
victorian_pigs_forecast
## # A fable: 4 x 6 [1M]
## # Key: Animal, State, .model [1]
## Animal State .model Month Count .mean
## <fct> <fct> <chr> <mth> <dist> <dbl>
## 1 Pigs Victoria "ETS(Count ~ error(\"A\") +… 2019 Jan N(95187, 8.7e+07) 95187.
## 2 Pigs Victoria "ETS(Count ~ error(\"A\") +… 2019 Feb N(95187, 9.7e+07) 95187.
## 3 Pigs Victoria "ETS(Count ~ error(\"A\") +… 2019 Mar N(95187, 1.1e+08) 95187.
## 4 Pigs Victoria "ETS(Count ~ error(\"A\") +… 2019 Apr N(95187, 1.1e+08) 95187.
victorian_pigs_fit %>%
augment() %>%
pull(.resid) %>%
sd()
## [1] 9344.666
\[ \hat{y} \pm 1.96 s\\ 95187 \pm 1.96(9344.666)\\ (76871.4546, 113502.545) \]
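As a quick check, the same interval can be computed in R from the objects defined above; a small sketch, assuming nothing beyond the fitted model and forecast already created:
# First forecast mean and the residual standard deviation
yhat <- victorian_pigs_forecast %>% pull(.mean) %>% first()
s <- victorian_pigs_fit %>% augment() %>% pull(.resid) %>% sd()

# 95% interval using yhat +/- 1.96 * s
yhat + c(lower = -1, upper = 1) * 1.96 * s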
hilo(victorian_pigs_forecast)
## # A tsibble: 4 x 8 [1M]
## # Key: Animal, State, .model [1]
## Animal State .model Month Count .mean `80%`
## <fct> <fct> <chr> <mth> <dist> <dbl> <hilo>
## 1 Pigs Victor… "ETS(… 2019 Jan N(95187, 8.7e+07) 95187. [83200.06, 107173.1]80
## 2 Pigs Victor… "ETS(… 2019 Feb N(95187, 9.7e+07) 95187. [82593.52, 107779.6]80
## 3 Pigs Victor… "ETS(… 2019 Mar N(95187, 1.1e+08) 95187. [82014.88, 108358.2]80
## 4 Pigs Victor… "ETS(… 2019 Apr N(95187, 1.1e+08) 95187. [81460.61, 108912.5]80
## # ℹ 1 more variable: `95%` <hilo>
The 95% prediction interval I calculated for the first forecast is reasonably close to the one produced by R. The small difference arises because R bases its interval on the model's estimated variance rather than on the sample standard deviation of the residuals.
Data set global_economy contains the annual Exports from many countries. Select one country to analyse.
irish_economy <- global_economy %>%
filter(Country == "Ireland")
Plot the Exports series and discuss the main features of the data.
irish_economy %>%
autoplot(Exports) +
labs(y="Exports (% of GDP)", title="Irish Exports, 1960 to 2017")
There is a strong upward trend, and since the data are annual there is no seasonal component.
Use an ETS(A,N,N) model to forecast the series, and plot the forecasts.
irish_ann_fit <- irish_economy %>%
model(ETS(Exports ~ error("A") + trend("N") + season("N")))
tidy(irish_ann_fit)
## # A tibble: 2 × 4
## Country .model term estimate
## <fct> <chr> <chr> <dbl>
## 1 Ireland "ETS(Exports ~ error(\"A\") + trend(\"N\") + season(\"… alpha 1.00
## 2 Ireland "ETS(Exports ~ error(\"A\") + trend(\"N\") + season(\"… l[0] 29.5
irish_ann_forecast <- irish_ann_fit %>%
forecast(h = 4)
irish_ann_forecast %>%
autoplot(irish_economy) +
geom_line(aes(y = .fitted), col="#D55E00",
data = augment(irish_ann_fit)) +
labs(y="Exports (% of GDP)", title="Forecast: Irish Exports (ETS(A,N,N) Model)") +
guides(colour = "none")
Compute the RMSE values for the training data.
accuracy(irish_ann_fit)
## # A tibble: 1 × 11
## Country .model .type ME RMSE MAE MPE MAPE MASE RMSSE ACF1
## <fct> <chr> <chr> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
## 1 Ireland "ETS(Exports ~ … Trai… 1.56 4.03 2.94 2.25 4.64 0.983 0.991 0.287
The RMSE is 4.02662.
Compare the results to those from an ETS(A,A,N) model. (Remember that the trended model is using one more parameter than the simpler model.) Discuss the merits of the two forecasting methods for this data set.
irish_aan_fit <- irish_economy %>%
model(ETS(Exports ~ error("A") + trend("A") + season("N")))
tidy(irish_aan_fit)
## # A tibble: 4 × 4
## Country .model term estimate
## <fct> <chr> <chr> <dbl>
## 1 Ireland "ETS(Exports ~ error(\"A\") + trend(\"A\") + season(\"… alpha 9.99e-1
## 2 Ireland "ETS(Exports ~ error(\"A\") + trend(\"A\") + season(\"… beta 5.92e-4
## 3 Ireland "ETS(Exports ~ error(\"A\") + trend(\"A\") + season(\"… l[0] 3.04e+1
## 4 Ireland "ETS(Exports ~ error(\"A\") + trend(\"A\") + season(\"… b[0] 1.20e+0
irish_aan_forecast <- irish_aan_fit %>%
forecast(h = 4)
irish_aan_forecast %>%
autoplot(irish_economy) +
geom_line(aes(y = .fitted), col="#D55E00",
data = augment(irish_aan_fit)) +
labs(y="Exports (% of GDP)", title="Forecast: Irish Exports (ETS(A,A,N) Model)") +
guides(colour = "none")
Compare the forecasts from both methods. Which do you think is best?
report(irish_ann_fit)
## Series: Exports
## Model: ETS(A,N,N)
## Smoothing parameters:
## alpha = 0.9998999
##
## Initial states:
## l[0]
## 29.52361
##
## sigma^2: 16.7927
##
## AIC AICc BIC
## 403.0853 403.5297 409.2666
report(irish_aan_fit)
## Series: Exports
## Model: ETS(A,A,N)
## Smoothing parameters:
## alpha = 0.9985085
## beta = 0.0005916509
##
## Initial states:
## l[0] b[0]
## 30.40579 1.202241
##
## sigma^2: 15.0077
##
## AIC AICc BIC
## 398.4577 399.6116 408.7599
I think the A,A,N forecast is better than the A,N,N one, because the trend is a prominent feature of this data, and only the A,A,N forecast captures and continues that trend. This is supported by the A,A,N model having the lower AICc (399.6 versus 403.5).
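The comparison is easier to read when both specifications are estimated in a single mable and summarised with glance(); a quick sketch (the model labels ANN and AAN are my own):
irish_economy %>%
  model(
    ANN = ETS(Exports ~ error("A") + trend("N") + season("N")),
    AAN = ETS(Exports ~ error("A") + trend("A") + season("N"))
  ) %>%
  glance() %>%
  select(.model, AIC, AICc, BIC)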
Calculate a 95% prediction interval for the first forecast for each model, using the RMSE values and assuming normal errors. Compare your intervals with those produced using R.
ANN Forecast:
irish_ann_forecast
## # A fable: 4 x 5 [1Y]
## # Key: Country, .model [1]
## Country .model Year Exports .mean
## <fct> <chr> <dbl> <dist> <dbl>
## 1 Ireland "ETS(Exports ~ error(\"A\") + trend(\"N\") + s… 2018 N(120, 17) 120.
## 2 Ireland "ETS(Exports ~ error(\"A\") + trend(\"N\") + s… 2019 N(120, 34) 120.
## 3 Ireland "ETS(Exports ~ error(\"A\") + trend(\"N\") + s… 2020 N(120, 50) 120.
## 4 Ireland "ETS(Exports ~ error(\"A\") + trend(\"N\") + s… 2021 N(120, 67) 120.
\[ \hat{y} \pm 1.96 s\\ 120 \pm 1.96(4.02662)\\ (112.1078, 127.8922) \]
hilo(irish_ann_forecast)
## # A tsibble: 4 x 7 [1Y]
## # Key: Country, .model [1]
## Country .model Year Exports .mean `80%`
## <fct> <chr> <dbl> <dist> <dbl> <hilo>
## 1 Ireland "ETS(Exports ~ error(\"… 2018 N(120, 17) 120. [114.7610, 125.2643]80
## 2 Ireland "ETS(Exports ~ error(\"… 2019 N(120, 34) 120. [112.5861, 127.4393]80
## 3 Ireland "ETS(Exports ~ error(\"… 2020 N(120, 50) 120. [110.9171, 129.1082]80
## 4 Ireland "ETS(Exports ~ error(\"… 2021 N(120, 67) 120. [109.5102, 130.5152]80
## # ℹ 1 more variable: `95%` <hilo>
AAN Forecast:
irish_aan_forecast
## # A fable: 4 x 5 [1Y]
## # Key: Country, .model [1]
## Country .model Year Exports .mean
## <fct> <chr> <dbl> <dist> <dbl>
## 1 Ireland "ETS(Exports ~ error(\"A\") + trend(\"A\") + s… 2018 N(121, 15) 121.
## 2 Ireland "ETS(Exports ~ error(\"A\") + trend(\"A\") + s… 2019 N(122, 30) 122.
## 3 Ireland "ETS(Exports ~ error(\"A\") + trend(\"A\") + s… 2020 N(124, 45) 124.
## 4 Ireland "ETS(Exports ~ error(\"A\") + trend(\"A\") + s… 2021 N(125, 60) 125.
accuracy(irish_aan_fit)
## # A tibble: 1 × 11
## Country .model .type ME RMSE MAE MPE MAPE MASE RMSSE ACF1
## <fct> <chr> <chr> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
## 1 Ireland "ETS(Exports ~… Trai… 0.345 3.74 2.80 -0.107 4.55 0.936 0.920 0.286
\[ \hat{y} \pm 1.96 s\\ 121 \pm 1.96(3.738005)\\ (113.6735, 128.3265) \]
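As with the pigs series, the hand calculation can be verified in R by reusing the point forecast and the training RMSE computed above; a small sketch:
# First AAN forecast mean and the training RMSE
yhat_aan <- irish_aan_forecast %>% pull(.mean) %>% first()
rmse_aan <- accuracy(irish_aan_fit) %>% pull(RMSE)

# 95% interval using yhat +/- 1.96 * RMSE
yhat_aan + c(lower = -1, upper = 1) * 1.96 * rmse_aan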
hilo(irish_aan_forecast)
## # A tsibble: 4 x 7 [1Y]
## # Key: Country, .model [1]
## Country .model Year Exports .mean `80%`
## <fct> <chr> <dbl> <dist> <dbl> <hilo>
## 1 Ireland "ETS(Exports ~ error(\"… 2018 N(121, 15) 121. [116.2661, 126.1955]80
## 2 Ireland "ETS(Exports ~ error(\"… 2019 N(122, 30) 122. [115.4269, 129.4628]80
## 3 Ireland "ETS(Exports ~ error(\"… 2020 N(124, 45) 124. [115.0633, 132.2546]80
## 4 Ireland "ETS(Exports ~ error(\"… 2021 N(125, 60) 125. [114.9459, 134.8001]80
## # ℹ 1 more variable: `95%` <hilo>
Both prediction intervals I calculated are reasonably close to those produced by R.
Forecast the Chinese GDP from the global_economy data set using an ETS model. Experiment with the various options in the ETS() function to see how much the forecasts change with damped trend, or with a Box-Cox transformation. Try to develop an intuition of what each is doing to the forecasts.
[Hint: use a relatively large value of h when forecasting, so you can clearly see the differences between the various options when plotting the forecasts.]
chinese_economy <- global_economy %>%
filter(Country == "China") %>%
mutate(gdp_bil = GDP/1e9)
chinese_economy %>%
autoplot(gdp_bil) +
labs(y = "GDP (Billions of USD)", title = "Chinese GDP, 1960 to 2017")
chinese_economy_fits <- chinese_economy %>%
model(no_trend = ETS(GDP ~ error("A") + trend("N") + season("N")),
additive_trend = ETS(GDP ~ error("A") + trend("A") + season("N")),
additive_damped = ETS(GDP ~ error("A") + trend("Ad") + season("N")))
tidy(chinese_economy_fits)
## # A tibble: 11 × 4
## Country .model term estimate
## <fct> <chr> <chr> <dbl>
## 1 China no_trend alpha 1.00e+ 0
## 2 China no_trend l[0] 6.75e+10
## 3 China additive_trend alpha 1.00e+ 0
## 4 China additive_trend beta 5.52e- 1
## 5 China additive_trend l[0] 5.03e+10
## 6 China additive_trend b[0] 3.29e+ 9
## 7 China additive_damped alpha 1.00e+ 0
## 8 China additive_damped beta 5.63e- 1
## 9 China additive_damped phi 9.80e- 1
## 10 China additive_damped l[0] 5.03e+10
## 11 China additive_damped b[0] 3.29e+ 9
chinese_economy_forecasts <- chinese_economy_fits %>%
forecast(h = 10)
chinese_economy_forecasts %>%
autoplot(chinese_economy, level = NULL) +
labs(y="GDP (Billions of USD)", title="Forecast: Chinese GDP (Various ETS Models)")
The model without a trend simply projects the last fitted level forward, so its forecasts are flat. The damped trend model is similar to the additive trend model, but the effect of the trend is progressively dampened over the forecast horizon, resulting in lower forecasts.
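To see the damping more clearly, the damping parameter can be fixed at a smaller value instead of being estimated; a sketch (phi = 0.8 is an arbitrary value chosen purely for illustration):
chinese_economy %>%
  model(
    damped_estimated = ETS(GDP ~ error("A") + trend("Ad") + season("N")),
    damped_fixed_phi = ETS(GDP ~ error("A") + trend("Ad", phi = 0.8) + season("N"))
  ) %>%
  forecast(h = 10) %>%
  autoplot(chinese_economy, level = NULL) +
  labs(y = "GDP (USD)", title = "Forecast: Chinese GDP (Estimated vs Fixed Damping)")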
lambda_1 <- chinese_economy |>
features(GDP, features = guerrero) |>
pull(lambda_guerrero)
chinese_economy_transformed <- chinese_economy %>%
mutate(GDP_transformed = box_cox(GDP, lambda_1))
chinese_economy_box_cox_ets_fit <- chinese_economy_transformed %>%
model(no_trend = ETS(GDP_transformed ~ error("A") + trend("N") + season("N")),
additive_trend = ETS(GDP_transformed ~ error("A") + trend("A") + season("N")),
additive_damped = ETS(GDP_transformed ~ error("A") + trend("Ad") + season("N")))
chinese_economy_box_cox_ets_forecasts <- chinese_economy_box_cox_ets_fit %>%
forecast(h = 10)
chinese_economy_box_cox_ets_forecasts %>%
autoplot(chinese_economy_transformed, level = NULL) +
labs(y="GDP (Billions of USD)", title="Forecast: Chinese GDP (Box Cox Transformation) (Various ETS Models)")
The forecasts from the various ETS specifications are closer to one another on the Box-Cox transformed data than on the original scale, because the Box-Cox transformation stabilizes the variance, which grows with the level of this series. Note that because the models were fitted to the pre-transformed column, these forecasts are on the Box-Cox scale rather than the original GDP scale.
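A more idiomatic fpp3 approach, I believe, is to put the Box-Cox transformation inside the model formula, so that fable back-transforms the forecasts onto the original GDP scale automatically; a sketch reusing lambda_1 from above:
chinese_economy %>%
  model(box_cox_damped = ETS(box_cox(GDP, lambda_1) ~ error("A") + trend("Ad") + season("N"))) %>%
  forecast(h = 10) %>%
  autoplot(chinese_economy, level = NULL) +
  labs(y = "GDP (USD)", title = "Forecast: Chinese GDP (Box-Cox Transformed, Damped Trend)")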
Find an ETS model for the Gas data from aus_production and forecast the next few years. Why is multiplicative seasonality necessary here? Experiment with making the trend damped. Does it improve the forecasts?
aus_production %>%
autoplot(Gas) +
labs(y= "Gas Production (Petajoules)", title= "Australian Gas Production, 1956-2010")
aus_production_fits <- aus_production %>%
model(additive_trend = ETS(Gas ~ error("A") + trend("A") + season("M")),
additive_damped = ETS(Gas ~ error("A") + trend("Ad") + season("M")))
tidy(aus_production_fits)
## # A tibble: 19 × 3
## .model term estimate
## <chr> <chr> <dbl>
## 1 additive_trend alpha 0.613
## 2 additive_trend beta 0.00786
## 3 additive_trend gamma 0.224
## 4 additive_trend l[0] 3.62
## 5 additive_trend b[0] 0.610
## 6 additive_trend s[0] 0.980
## 7 additive_trend s[-1] 1.15
## 8 additive_trend s[-2] 1.08
## 9 additive_trend s[-3] 0.797
## 10 additive_damped alpha 0.610
## 11 additive_damped beta 0.0249
## 12 additive_damped gamma 0.223
## 13 additive_damped phi 0.980
## 14 additive_damped l[0] 5.64
## 15 additive_damped b[0] -0.0618
## 16 additive_damped s[0] 0.925
## 17 additive_damped s[-1] 1.06
## 18 additive_damped s[-2] 1.12
## 19 additive_damped s[-3] 0.890
aus_production_forecasts <- aus_production_fits %>%
forecast(h = 4)
aus_production_forecasts %>%
autoplot(aus_production, level = NULL) +
labs(y="Gas Production (Petajoules)", title="Forecast: Australian Gas Production (Various ETS Models)")
report(aus_production_fits)
## Warning in report.mdl_df(aus_production_fits): Model reporting is only
## supported for individual models, so a glance will be shown. To see the report
## for a specific model, use `select()` and `filter()` to identify a single model.
## # A tibble: 2 × 9
## .model sigma2 log_lik AIC AICc BIC MSE AMSE MAE
## <chr> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
## 1 additive_trend 18.2 -899. 1816. 1817. 1847. 17.6 25.1 2.84
## 2 additive_damped 18.5 -901. 1821. 1822. 1855. 17.8 25.9 2.81
Multiplicative seasonality is necessary because the size of the seasonal fluctuations grows with the level of the series rather than staying constant, so an additive seasonal component could not capture it. Making the trend damped does not appear to improve the forecasts, as the damped model has a higher AICc (1822 versus 1817).
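One way to back this up is to fit an additive-seasonal counterpart and compare the information criteria (both models keep additive errors, so the AICc values are comparable); a quick sketch with my own model labels:
aus_production %>%
  model(
    additive_season       = ETS(Gas ~ error("A") + trend("A") + season("A")),
    multiplicative_season = ETS(Gas ~ error("A") + trend("A") + season("M"))
  ) %>%
  glance() %>%
  select(.model, AICc, BIC)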
Recall your retail time series data (from Exercise 7 in Section 2.10).
set.seed(1989)
my_retail_series <- aus_retail %>%
filter(`Series ID` == sample(aus_retail$`Series ID`,1))
autoplot(my_retail_series, Turnover) +
labs(y= "Turnover (Millions of AUD)", title = "Turnover at Australian Retail Stores")
Why is multiplicative seasonality necessary for this series?
Multiplicative seasonality is necessary because the seasonal component appears to vary over time. It is proportional to the level, rather than constant.
Apply Holt-Winters’ multiplicative method to the data. Experiment with making the trend damped.
retail_fit <- my_retail_series %>%
filter(Month <= yearmonth('2018 Nov')) %>%
model(holt_winters_multiplicative = ETS(Turnover ~ error("M") + trend("A") + season("M")),
holt_winters_multiplicative_damped = ETS(Turnover ~ error("M") + trend("Ad") + season("M")))
tidy(retail_fit)
## # A tibble: 35 × 5
## State Industry .model term estimate
## <chr> <chr> <chr> <chr> <dbl>
## 1 Northern Territory Food retailing holt_winters_multiplicative alpha 0.373
## 2 Northern Territory Food retailing holt_winters_multiplicative beta 0.000158
## 3 Northern Territory Food retailing holt_winters_multiplicative gamma 0.202
## 4 Northern Territory Food retailing holt_winters_multiplicative l[0] 26.1
## 5 Northern Territory Food retailing holt_winters_multiplicative b[0] 0.222
## 6 Northern Territory Food retailing holt_winters_multiplicative s[0] 0.985
## 7 Northern Territory Food retailing holt_winters_multiplicative s[-1] 0.919
## 8 Northern Territory Food retailing holt_winters_multiplicative s[-2] 0.928
## 9 Northern Territory Food retailing holt_winters_multiplicative s[-3] 1.02
## 10 Northern Territory Food retailing holt_winters_multiplicative s[-4] 0.997
## # ℹ 25 more rows
retail_forecasts <- retail_fit %>%
forecast(h = 10)
retail_forecasts %>%
autoplot(my_retail_series, level = NULL) +
labs(y="Turnover", title="Forecast: Australian Retail Turnover (Holt-Winters' Multiplicative Methods)")
Compare the RMSE of the one-step forecasts from the two methods. Which do you prefer?
retail_forecasts_one_step <- retail_fit %>%
forecast(h = 1)
retail_forecasts_one_step %>%
accuracy(my_retail_series)
## Warning: 1 error encountered
## [1] subscript out of bounds
##
## 1 error encountered
## [1] subscript out of bounds
## # A tibble: 2 × 12
## .model State Industry .type ME RMSE MAE MPE MAPE MASE RMSSE
## <chr> <chr> <chr> <chr> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
## 1 holt_winte… Nort… Food re… Test -0.133 0.133 0.133 -0.104 0.104 0.0319 0.0246
## 2 holt_winte… Nort… Food re… Test -2.85 2.85 2.85 -2.23 2.23 0.685 0.528
## # ℹ 1 more variable: ACF1 <dbl>
The Holt-Winters' multiplicative method without the damped trend is preferable for the one-step forecast because it has the lower RMSE.
Check that the residuals from the best method look like white noise.
aug_retail <- my_retail_series |>
model(ETS(Turnover ~ error("M") + trend("A") + season("M"))) |>
augment()
autoplot(aug_retail, .innov) +
labs(y = "$AUD",
title = "Residuals from the Holt-Winters Multiplicative Method")
The innovation residuals fluctuate around zero with no obvious pattern, so they do look like white noise.
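A fuller check is to look at the residual ACF and a Ljung-Box test as well; a sketch using the same specification (lag = 24 follows the common rule of thumb of twice the seasonal period for monthly data):
best_fit <- my_retail_series %>%
  model(ETS(Turnover ~ error("M") + trend("A") + season("M")))

# Time plot, histogram, and ACF of the innovation residuals
best_fit %>% gg_tsresiduals()

# Ljung-Box test on the innovation residuals
best_fit %>%
  augment() %>%
  features(.innov, ljung_box, lag = 24)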
Now find the test set RMSE, while training the model to the end of 2010. Can you beat the seasonal naïve approach from Exercise 7 in Section 5.11?
my_retail_series_train <- my_retail_series |>
filter(year(Month) < 2011)
retail_train_set_fit <- my_retail_series_train |>
model(ETS(Turnover ~ error("M") + trend("A") + season("M")))
fc <- retail_train_set_fit |>
forecast(new_data = anti_join(my_retail_series, my_retail_series_train))
## Joining with `by = join_by(State, Industry, `Series ID`, Month, Turnover)`
fc |> autoplot(my_retail_series)
retail_train_set_fit |> accuracy()
## # A tibble: 1 × 12
## State Industry .model .type ME RMSE MAE MPE MAPE MASE RMSSE ACF1
## <chr> <chr> <chr> <chr> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
## 1 Nort… Food re… "ETS(… Trai… 0.0390 1.78 1.37 -0.0933 2.47 0.342 0.356 0.199
fc |> accuracy(my_retail_series)
## # A tibble: 1 × 12
## .model State Industry .type ME RMSE MAE MPE MAPE MASE RMSSE ACF1
## <chr> <chr> <chr> <chr> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
## 1 "ETS(Tur… Nort… Food re… Test 12.1 13.6 12.1 10.1 10.1 3.04 2.72 0.899
Yes, the Holt-Winters' multiplicative method is an improvement on the SNAIVE model from Exercise 7 in Section 5.11, since its RMSEs are lower; a direct recomputation on the same training/test split is sketched below.
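The seasonal naive benchmark is not reproduced above, so this refits it on the same training period and scores it on the same test set for a like-for-like comparison; a quick sketch:
snaive_fit <- my_retail_series_train %>%
  model(SNAIVE(Turnover))

snaive_fit %>%
  forecast(new_data = anti_join(my_retail_series, my_retail_series_train)) %>%
  accuracy(my_retail_series) %>%
  select(.model, RMSE, MAE, MAPE)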
For the same retail data, try an STL decomposition applied to the Box-Cox transformed series, followed by ETS on the seasonally adjusted data. How does that compare with your best previous forecasts on the test set?
lambda_2 <- my_retail_series |>
features(Turnover, features = guerrero) |>
pull(lambda_guerrero)
my_retail_series_transformed <- my_retail_series %>%
mutate(Turnover_transformed = box_cox(Turnover, lambda_2)) %>%
select(Month, Turnover_transformed)
print(my_retail_series_transformed)
## # A tsibble: 369 x 2 [1M]
## Month Turnover_transformed
## <mth> <dbl>
## 1 1988 Apr 2.93
## 2 1988 May 2.97
## 3 1988 Jun 3.00
## 4 1988 Jul 3.01
## 5 1988 Aug 3.01
## 6 1988 Sep 3.05
## 7 1988 Oct 3.02
## 8 1988 Nov 2.99
## 9 1988 Dec 3.09
## 10 1989 Jan 2.97
## # ℹ 359 more rows
dcmp <- my_retail_series_transformed %>%
model(STL(Turnover_transformed))
dcmp_components <- dcmp %>%
components() %>%
as_tsibble()
print(dcmp_components)
## # A tsibble: 369 x 7 [1M]
## # Key: .model [1]
## .model Month Turnover_transformed trend season_year remainder
## <chr> <mth> <dbl> <dbl> <dbl> <dbl>
## 1 STL(Turnover_trans… 1988 Apr 2.93 2.96 -0.0334 0.00728
## 2 STL(Turnover_trans… 1988 May 2.97 2.97 -0.000435 0.00104
## 3 STL(Turnover_trans… 1988 Jun 3.00 2.97 0.0116 0.0113
## 4 STL(Turnover_trans… 1988 Jul 3.01 2.98 0.0493 -0.0195
## 5 STL(Turnover_trans… 1988 Aug 3.01 2.99 0.0530 -0.0306
## 6 STL(Turnover_trans… 1988 Sep 3.05 3.00 0.0115 0.0371
## 7 STL(Turnover_trans… 1988 Oct 3.02 3.01 0.0248 -0.0146
## 8 STL(Turnover_trans… 1988 Nov 2.99 3.01 -0.00960 -0.0138
## 9 STL(Turnover_trans… 1988 Dec 3.09 3.02 0.0420 0.0245
## 10 STL(Turnover_trans… 1989 Jan 2.97 3.03 -0.0600 0.000672
## # ℹ 359 more rows
## # ℹ 1 more variable: season_adjust <dbl>
dcmp_components <- dcmp_components %>%
select(Month, season_adjust)
stl_train <- dcmp_components |>
filter(year(Month) < 2011)
Since the seasonally adjusted data shows an overall upward trend that flattens, and even dips slightly, in recent years, I will apply an ETS model with a damped additive trend; a quick plot of the seasonally adjusted series (sketched below) supports this choice.
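A small sketch of that plot, using the seasonally adjusted component computed above:
dcmp_components %>%
  autoplot(season_adjust) +
  labs(y = "Box-Cox(Turnover)",
       title = "Seasonally Adjusted Turnover (Box-Cox Scale)")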
ets_dcmp_fit <- stl_train %>%
model(ETS(season_adjust ~ error("A") + trend("Ad") + season("N")))
new_data <- anti_join(dcmp_components, stl_train)
## Joining with `by = join_by(Month, season_adjust)`
print(head(new_data))
## # A tsibble: 6 x 2 [1M]
## Month season_adjust
## <mth> <dbl>
## 1 2011 Jan 3.96
## 2 2011 Feb 3.95
## 3 2011 Mar 3.96
## 4 2011 Apr 3.97
## 5 2011 May 3.96
## 6 2011 Jun 3.96
ets_dcmp_forecast <- ets_dcmp_fit |>
forecast(new_data = new_data)
ets_dcmp_forecast %>%
accuracy(dcmp_components)
## # A tibble: 1 × 10
## .model .type ME RMSE MAE MPE MAPE MASE RMSSE ACF1
## <chr> <chr> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
## 1 "ETS(season_adjust ~ er… Test 0.232 0.248 0.232 5.63 5.63 4.08 3.70 0.957
The test-set RMSE of approximately 0.2479 is far lower than any encountered previously, but it is measured on the Box-Cox transformed, seasonally adjusted scale, so it is not directly comparable to the earlier test-set RMSE on the original Turnover scale. For a fair comparison, the forecasts would need to be re-seasonalized and back-transformed, as sketched below.
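A sketch of one way to do that, combining the STL decomposition (with the Box-Cox transformation) and the damped ETS model on the seasonally adjusted component via decomposition_model(), which I believe forecasts the seasonal component with SNAIVE() by default and returns forecasts on the original Turnover scale; it reuses lambda_2 and the earlier training split, and the model label stl_ets is my own:
stl_ets_fit <- my_retail_series_train %>%
  model(
    stl_ets = decomposition_model(
      STL(box_cox(Turnover, lambda_2)),
      ETS(season_adjust ~ error("A") + trend("Ad") + season("N"))
    )
  )

stl_ets_fit %>%
  forecast(new_data = anti_join(my_retail_series, my_retail_series_train)) %>%
  accuracy(my_retail_series) %>%
  select(.model, RMSE, MAE, MAPE)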