8.1

Consider the number of pigs slaughtered in Victoria, available in the aus_livestock dataset.

  1. Use the ETS() function to estimate the equivalent model for simple exponential smoothing. Find the optimal values of \(\alpha\) and \(\ell_0\), and generate forecasts for the next four months.
fit <- aus_livestock %>%
  filter(State == "Victoria",
         Animal == "Pigs") %>%
  model(ETS(Count ~ error("A") + trend("N") + season("N"))) 

fc <- fit %>%
  forecast(h = 4)

fc %>%
  autoplot(aus_livestock) +
  ggtitle("Number of Pigs Slaughtered in Victoria")

report(fit)
## Series: Count 
## Model: ETS(A,N,N) 
##   Smoothing parameters:
##     alpha = 0.3221247 
## 
##   Initial states:
##      l[0]
##  100646.6
## 
##   sigma^2:  87480760
## 
##      AIC     AICc      BIC 
## 13737.10 13737.14 13750.07

The optimal values are \(\alpha = 0.3221\) and \(\ell_0 = 100646.6\).
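
These values can also be pulled out programmatically; a small sketch using tidy() on the fitted mable:

# one row per estimated parameter: alpha and the initial level l[0]
tidy(fit)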

These are the forecasts for the next four months:

fc
## # A fable: 4 x 6 [1M]
## # Key:     Animal, State, .model [1]
##   Animal State   .model                         Month               Count  .mean
##   <fct>  <fct>   <chr>                          <mth>              <dist>  <dbl>
## 1 Pigs   Victor~ "ETS(Count ~ error(\"A\") ~ 2019 Jan  N(95187, 87480760) 95187.
## 2 Pigs   Victor~ "ETS(Count ~ error(\"A\") ~ 2019 Feb  N(95187, 96558142) 95187.
## 3 Pigs   Victor~ "ETS(Count ~ error(\"A\") ~ 2019 Mar N(95187, 105635524) 95187.
## 4 Pigs   Victor~ "ETS(Count ~ error(\"A\") ~ 2019 Apr N(95187, 114712906) 95187.
  2. Compute a 95% prediction interval for the first forecast using \(\hat{y} ± 1.96s\) where \(s\) is the standard deviation of the residuals. Compare your interval with the interval produced by R.
# alternative estimates of s:
# accuracy(fit)$RMSE
# augment(fit) %>% pull(.resid) %>% sd()

s <- residuals(fit)$.resid %>% sd()
hat_y <- fc$.mean[1]

# manual 95% prediction interval: y-hat +/- 1.96 * s
c(lower = hat_y - 1.96 * s, upper = hat_y + 1.96 * s)

# 95% prediction interval produced by R
fc %>% hilo(95) %>% pull('95%') %>% head(1)
## <hilo[1]>
## [1] [76854.79, 113518.3]95

The manually computed 95% prediction interval is (76871.01, 113502.10).

The manually computed interval is slightly narrower than the one produced by R. The difference comes from how the standard deviation is estimated: R uses the model's estimate of \(\sigma\) (the square root of the sigma^2 shown in the report above), which is slightly larger than the sample standard deviation of the residuals.
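
The two estimates of \(s\) can be compared directly; a quick sketch, using glance() to pull the model's \(\sigma^2\):

# model-based estimate used by R for the forecast distribution
sqrt(glance(fit)$sigma2)

# sample standard deviation of the residuals, used in the manual interval
sd(residuals(fit)$.resid)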

8.5

Data set global_economy contains the annual Exports from many countries. Select one country to analyse.

  1. Plot the Exports series and discuss the main features of the data.
global_economy %>%
  filter(Country == "Turkey") %>%
  autoplot(Exports) +
  labs(y="% of GDP", title="Exports: Turkey")

There is an increasing trend with no seasonality. There is a rapid increase around the early 1980s and small declines in the early and late 1990s; the early-1990s dip can be attributed to a recession. In the last few years the series fluctuates around roughly the same value.

  2. Use an ETS(A,N,N) model to forecast the series, and plot the forecasts.
fit_tk <- global_economy %>%
  filter(Country == "Turkey") %>%
  model(ETS(Exports ~ error("A") + trend("N") + season("N"))) 

fc_tk <- fit_tk %>%
  forecast(h = 5)

fc_tk %>%
  autoplot(global_economy) +
  labs(y="% of GDP", title="Exports: Turkey", subtitle = "ETS(A,N,N)")

  3. Compute the RMSE values for the training data.
stk <- accuracy(fit_tk)$RMSE
stk
## [1] 2.183255
  4. Compare the results to those from an ETS(A,A,N) model. (Remember that the trended model is using one more parameter than the simpler model.) Discuss the merits of the two forecasting methods for this data set.
fit_tk_2 <- global_economy %>%
  filter(Country == "Turkey") %>%
  model(ETS(Exports ~ error("A") + trend("A") + season("N")))

fc_tk_2 <- fit_tk_2 %>%
  forecast(h = 5)

fc_tk_2 %>%
  autoplot(global_economy) +
  labs(y="% of GDP", title="Exports: Turkey", subtitle = "ETS(A,A,N)")

stk2 <- accuracy(fit_tk_2)$RMSE
stk2
## [1] 2.144857

The training RMSE is lower for the ETS(A,A,N) model, which suggests it fits the data slightly better, even though it uses extra parameters.
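
Because the trended model estimates extra parameters, it is also worth comparing information criteria, which penalise that added complexity; a small sketch using glance() on the two fitted models:

glance(fit_tk) %>% select(.model, AIC, AICc, BIC)
glance(fit_tk_2) %>% select(.model, AIC, AICc, BIC)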

  5. Compare the forecasts from both methods. Which do you think is best?

I believe that the ETS(A,A,N) method is better since it captures the increasing trend in the data and has a lower RMSE. Meanwhile, the ETS(A,N,N) method suggests that the exports will stay the same.

  6. Calculate a 95% prediction interval for the first forecast for each model, using the RMSE values and assuming normal errors. Compare your intervals with those produced using R.

The 95% prediction interval for the ETS(A,N,N) model:

# alternative estimates of s:
# accuracy(fit_tk)$RMSE
# augment(fit_tk) %>% pull(.resid) %>% sd()

s_tk <- residuals(fit_tk)$.resid %>% sd()
hat_y_tk <- fc_tk$.mean[1]

# manual 95% prediction intervals: using the residual standard deviation, then the RMSE (stk)
c(hat_y_tk - 1.96 * s_tk, hat_y_tk + 1.96 * s_tk)
c(hat_y_tk - 1.96 * stk, hat_y_tk + 1.96 * stk)

# 95% prediction interval produced by R
fc_tk %>% hilo(95) %>% pull('95%') %>% head(1)
## <hilo[1]>
## [1] [20.14458, 28.85427]95

The interval computed by R using hilo() is slightly wider than either manually computed interval. Using the RMSE as \(s\) gives a slightly wider interval than using the standard deviation of the residuals.

The 95% prediction interval for the ETS(A,A,N) model:

# alternative estimates of s:
# accuracy(fit_tk_2)$RMSE
# augment(fit_tk_2) %>% pull(.resid) %>% sd()

s_tk_2 <- residuals(fit_tk_2)$.resid %>% sd()
hat_y_tk_2 <- fc_tk_2$.mean[1]

# manual 95% prediction intervals: using the residual standard deviation, then the RMSE (stk2)
c(hat_y_tk_2 - 1.96 * s_tk_2, hat_y_tk_2 + 1.96 * s_tk_2)
c(hat_y_tk_2 - 1.96 * stk2, hat_y_tk_2 + 1.96 * stk2)

# 95% prediction interval produced by R
fc_tk_2 %>% hilo(95) %>% pull('95%') %>% head(1)
## <hilo[1]>
## [1] [20.37251, 29.08602]95

Again, the interval computed by R using hilo() is slightly wider than the manually computed intervals. For this model, using the RMSE as \(s\) gives a slightly narrower interval than using the standard deviation of the residuals.

8.6

Forecast the Chinese GDP from the global_economy data set using an ETS model. Experiment with the various options in the ETS() function to see how much the forecasts change with damped trend, or with a Box-Cox transformation. Try to develop an intuition of what each is doing to the forecasts.

[Hint: use a relatively large value of h when forecasting, so you can clearly see the differences between the various options when plotting the forecasts.]

The Chinese GDP series has a strong increasing trend and, being annual, no seasonality, so all models below use additive errors and season("N").
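
A quick plot of the raw series makes the strong, roughly exponential growth visible (a minimal check):

global_economy %>%
  filter(Country == "China") %>%
  autoplot(GDP) +
  labs(title = "GDP: China")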

It was interesting to play with \(\phi\) in the damped trend models, as it changes how aggressive the forecasts are. With \(\phi = 0.9\) the forecasts were larger; to make them more conservative I chose \(\phi = 0.8\), since GDP takes very large values. The log and Box-Cox transformations without any damping over-forecast considerably as h gets larger. Damped log and Holt's method at \(\phi = 0.8\) seem to have similar effects, and damped Holt's method without any transformation sits roughly midway between Holt's method and simple exponential smoothing.

lambda <- global_economy %>%
  filter(Country == "China") %>%
  features(GDP, features = guerrero) %>%
  pull(lambda_guerrero)

fit_ch <- global_economy %>%
  filter(Country == "China") %>%
  model(`Simple` = ETS(GDP ~ error("A") + trend("N") + season("N")),
        `Holt's method` = ETS(GDP ~ error("A") + trend("A") + season("N")),
        `Damped Holt's method` = ETS(GDP ~ error("A") + trend("Ad", phi = 0.8) + season("N")),
        `Box-Cox` = ETS(box_cox(GDP,lambda) ~ error("A") + trend("A") + season("N")),
        `Box-Cox Damped` = ETS(box_cox(GDP,lambda) ~ error("A") + trend("Ad", phi = 0.8) + season("N")),
        `Log` = ETS(log(GDP) ~ error("A") + trend("A") + season("N")),
        `Log Damped` = ETS(log(GDP) ~ error("A") + trend("Ad", phi = 0.8) + season("N"))
        ) 

fc_ch <- fit_ch %>%
  forecast(h = 15)

fc_ch %>%
  autoplot(global_economy, level = NULL) +
  labs(title="GDP: China") +
  guides(colour = guide_legend(title = "Forecast"))
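
As a cross-check on the hand-picked \(\phi\) values, ETS() will estimate the damping parameter itself when it is left unspecified; a minimal sketch:

# report() shows the estimated alpha, beta, phi and the initial states
global_economy %>%
  filter(Country == "China") %>%
  model(ETS(GDP ~ error("A") + trend("Ad") + season("N"))) %>%
  report()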

8.7

Find an ETS model for the Gas data from aus_production and forecast the next few years. Why is multiplicative seasonality necessary here? Experiment with making the trend damped. Does it improve the forecasts?

Multiplicative seasonality is necessary here because the variation of the seasonal pattern appears to be proportional to the level of the time series. With an increasing trend, the amplitude of the seasonal pattern increases as well.
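
A quick plot of the series shows the seasonal swings growing with the overall level (a minimal check):

aus_production %>%
  autoplot(Gas) +
  labs(title = "Australian Gas Production")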

fit_gas <- aus_production %>%
  model(additive = ETS(Gas ~ error("A") + trend("A") + season("A")),
        multiplicative = ETS(Gas ~ error("M") + trend("A") + season("M")),
        `damped multiplicative` = ETS(Gas ~ error("M") + trend("Ad", phi = 0.9) + season("M"))) 

aus_production %>%
  model(additive = ETS(Gas ~ error("A") + trend("A") + season("A")),
        multiplicative = ETS(Gas ~ error("M") + trend("A") + season("M"))) %>%
  forecast(h = 20) %>%
  autoplot(aus_production, level = NULL) +
  labs(title = "Australian Gas Production",
       subtitle = "Additive vs. Multiplicative Seasonality") +
  guides(colour = guide_legend(title = "Forecast"))

aus_production %>%
  model(multiplicative = ETS(Gas ~ error("M") + trend("A") + season("M")),
        `damped multiplicative` = ETS(Gas ~ error("M") + trend("Ad", phi = 0.9) + season("M"))
        ) %>%
  forecast(h = 20) %>%
  autoplot(aus_production, level = NULL) +
  labs(title = "Australian Gas Production",
       subtitle = "Multiplicative vs. Damped Multiplicative") +
  guides(colour = guide_legend(title = "Forecast"))

report(fit_gas)
## # A tibble: 3 x 9
##   .model                  sigma2 log_lik   AIC  AICc   BIC   MSE  AMSE    MAE
##   <chr>                    <dbl>   <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>  <dbl>
## 1 additive              23.6       -927. 1872. 1873. 1903.  22.7  29.7 3.35  
## 2 multiplicative         0.00324   -831. 1681. 1682. 1711.  21.1  32.2 0.0413
## 3 damped multiplicative  0.00340   -835. 1688. 1689. 1719.  21.0  32.4 0.0424
accuracy(fit_gas)
## # A tibble: 3 x 10
##   .model           .type        ME  RMSE   MAE    MPE  MAPE  MASE RMSSE     ACF1
##   <chr>            <chr>     <dbl> <dbl> <dbl>  <dbl> <dbl> <dbl> <dbl>    <dbl>
## 1 additive         Train~  0.00525  4.76  3.35 -4.69  10.9  0.600 0.628  0.0772 
## 2 multiplicative   Train~ -0.115    4.60  3.02  0.199  4.08 0.542 0.606 -0.0131 
## 3 damped multipli~ Train~  0.255    4.58  3.04  0.655  4.15 0.545 0.604 -0.00451

Although the training RMSE improved slightly with the damped multiplicative method, its AIC, AICc, and BIC are slightly worse than those of the plain multiplicative model. Whether damping improves the forecasts therefore depends on whether you judge by the training RMSE or by the information criteria.

Since all three models estimate the same number of parameters (\(\phi\) was fixed at 0.9 rather than estimated), comparing the training RMSE is reasonable; on that basis, damping the trend improves the fit slightly.
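
As a further comparison, ETS() can be left to select the error, trend and seasonal components automatically (it minimises the AICc); a minimal sketch:

aus_production %>%
  model(auto = ETS(Gas)) %>%
  report()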

8.8

Recall your retail time series data (from Exercise 8 in Section 2.10).

  1. Why is multiplicative seasonality necessary for this series?

Multiplicative seasonality is necessary here because the variation of the seasonal pattern appears to be proportional to the level of the time series. Since the trend is increasing over time, the amplitude of the seasonality increases with time as well.

  2. Apply Holt-Winters’ multiplicative method to the data. Experiment with making the trend damped.
set.seed(1234)
myseries <- aus_retail %>%
  filter(`Series ID` == sample(aus_retail$`Series ID`,1)) 

fit_my <- myseries %>%
  model(multiplicative = ETS(Turnover ~ error("M") + trend("A") + season("M")),
        `damped multiplicative` = ETS(Turnover ~ error("M") + trend("Ad") + season("M"))) 

fit_my %>%
  forecast(h=36) %>%
  autoplot(myseries, level = NULL) +
  labs(title="Australian Retail Turnover") +
  guides(colour = guide_legend(title = "Forecast"))

The damped method produces a flatter trend, while the non-damped multiplicative method keeps increasing and therefore produces higher forecasts.

  3. Compare the RMSE of the one-step forecasts from the two methods. Which do you prefer?
accuracy(fit_my) %>%
  select(.model, RMSE)
## # A tibble: 2 x 2
##   .model                 RMSE
##   <chr>                 <dbl>
## 1 multiplicative         1.34
## 2 damped multiplicative  1.36

The RMSE is slightly lower for the non-damped method, so I prefer the non-damped Holt-Winters’ multiplicative method.

  4. Check that the residuals from the best method look like white noise.
myseries %>%
  model(multiplicative = ETS(Turnover ~ error("M") + trend("A") + season("M"))) %>%
  gg_tsresiduals() +
  ggtitle("Multiplicative Method")

#Box-Pierce test, ℓ=2m for seasonal data, m=12
myseries %>%
  model(multiplicative = ETS(Turnover ~ error("M") + trend("A") + season("M"))) %>%
  augment() %>% 
  features(.innov, box_pierce, lag = 24, dof = 0)
## # A tibble: 1 x 6
##   State   Industry                     .model     bp_stat bp_pvalue .name_repair
##   <chr>   <chr>                        <chr>        <dbl>     <dbl> <chr>       
## 1 Tasman~ Cafes, restaurants and take~ multiplic~    27.2     0.296 minimal
#Ljung-Box test
myseries %>%
  model(multiplicative = ETS(Turnover ~ error("M") + trend("A") + season("M"))) %>%
  augment() %>% 
  features(.innov, ljung_box, lag = 24, dof = 0)
## # A tibble: 1 x 6
##   State   Industry                     .model     lb_stat lb_pvalue .name_repair
##   <chr>   <chr>                        <chr>        <dbl>     <dbl> <chr>       
## 1 Tasman~ Cafes, restaurants and take~ multiplic~    28.0     0.260 minimal

For both tests the p-values are well above 0.05, so the residuals are indistinguishable from a white noise series; they do look like white noise.

  5. Now find the test set RMSE, while training the model to the end of 2010. Can you beat the seasonal naïve approach from Exercise 7 in Section 5.11?
myseries_train <- myseries %>%
  filter(year(Month) < 2011)

fit_train <- myseries_train %>%
  model(multi = ETS(Turnover ~ error("M") + trend("A") + season("M")),
        snaive = SNAIVE(Turnover))

#producing forecasts
fc <- fit_train %>%
  forecast(new_data = anti_join(myseries, myseries_train))

fc %>% autoplot(myseries, level = NULL)

accuracy(fit_train) %>%
  select(.type, .model, RMSE)
## # A tibble: 2 x 3
##   .type    .model  RMSE
##   <chr>    <chr>  <dbl>
## 1 Training multi   1.18
## 2 Training snaive  2.90
fc %>% accuracy(myseries)  %>%
  select(.type, .model, RMSE)
## # A tibble: 2 x 3
##   .type .model  RMSE
##   <chr> <chr>  <dbl>
## 1 Test  multi   3.99
## 2 Test  snaive  9.13

The multiplicative ETS method forecasts the data better: its test RMSE is substantially lower than that of the seasonal naïve approach, so it beats the benchmark.
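
Other test-set accuracy measures point the same way; a quick check on the same forecasts (MASE scales the errors against a naïve benchmark fitted on the training data):

fc %>%
  accuracy(myseries) %>%
  select(.model, RMSE, MAE, MAPE, MASE)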

8.9

For the same retail data, try an STL decomposition applied to the Box-Cox transformed series, followed by ETS on the seasonally adjusted data. How does that compare with your best previous forecasts on the test set?

# This code is wrong (kept for reference, not evaluated): the Box-Cox/STL adjustment is applied
# only to the training data, which produced an unusually large test RMSE; see the discussion at
# the end of this section.

lambda <- myseries_train %>%
  features(Turnover, features = guerrero) %>%
  pull(lambda_guerrero)

#stl decomp applied to the box cox transformed data
myseries_train %>%
  model(
    STL(box_cox(Turnover,lambda) ~ season(window = "periodic"), robust = TRUE)) %>%
  components() %>%
  autoplot() +
  ggtitle("STL with Box-Cox")

#computed the seasonally adjusted data , stl decomp applied to the box cox transformed data
dcmp <- myseries_train %>%
  model(STL_box = STL(box_cox(Turnover,lambda) ~ season(window = "periodic"), robust = TRUE)) %>%
  components()

#replacing turnover with the seasonally adjusted data
myseries_train$Turnover <- dcmp$season_adjust

#modeling on the seasonally adjusted data
fit <- myseries_train %>%
  model(ETS(Turnover ~ error("M") + trend("A") + season("M")))

#checking the residuals
fit %>% gg_tsresiduals()  +
  ggtitle("Residual Plots for Australian Retail Turnover")

#produce forecasts for test data
fc <- fit %>%
  forecast(new_data = anti_join(myseries, myseries_train))

fit %>% accuracy() %>%
  select(.model, .type, RMSE)

fc %>% accuracy(myseries) %>%
  select(.model, .type, RMSE)
lambda <- myseries %>%
  features(Turnover, features = guerrero) %>%
  pull(lambda_guerrero)

#stl decomp applied to the box cox transformed data
myseries %>%
  model(STL(box_cox(Turnover,lambda) ~ season(window = "periodic"), robust = TRUE)) %>%
  components() %>%
  autoplot() +
  ggtitle("STL with Box-Cox")


#computed the seasonally adjusted data , stl decomp applied to the box cox transformed data
dcmp <- myseries %>%
  model(STL_box = STL(box_cox(Turnover,lambda) ~ season(window = "periodic"), robust = TRUE)) %>%
  components()

#replacing turnover with the seasonally adjusted data
myseries$Turnover_sa <- dcmp$season_adjust

myseries_train <- myseries %>%
  filter(year(Month) < 2011)

#modeling on the seasonally adjusted data
fit <- myseries_train %>%
  model(ETS(Turnover_sa ~ error("M") + trend("A") + season("M")))

#checking the residuals
fit %>% gg_tsresiduals()  +
  ggtitle("Residual Plots for Australian Retail Turnover")

#Box-Pierce test, ℓ=2m for seasonal data, m=12
myseries %>%
  model(multiplicative = ETS(Turnover_sa ~ error("M") + trend("A") + season("M"))) %>%
  augment() %>% 
  features(.innov, box_pierce, lag = 24, dof = 0)
## # A tibble: 1 x 6
##   State   Industry                     .model     bp_stat bp_pvalue .name_repair
##   <chr>   <chr>                        <chr>        <dbl>     <dbl> <chr>       
## 1 Tasman~ Cafes, restaurants and take~ multiplic~    48.5   0.00220 minimal
#Ljung-Box test
myseries %>%
  model(multiplicative = ETS(Turnover_sa ~ error("M") + trend("A") + season("M"))) %>%
  augment() %>% 
  features(.innov, ljung_box, lag = 24, dof = 0)
## # A tibble: 1 x 6
##   State   Industry                     .model     lb_stat lb_pvalue .name_repair
##   <chr>   <chr>                        <chr>        <dbl>     <dbl> <chr>       
## 1 Tasman~ Cafes, restaurants and take~ multiplic~    49.7   0.00152 minimal
#produce forecasts for test data
fc <- fit %>%
  forecast(new_data = anti_join(myseries, myseries_train))

fit %>% accuracy() %>%
  select(.model, .type, RMSE)
## # A tibble: 1 x 3
##   .model                                                           .type    RMSE
##   <chr>                                                            <chr>   <dbl>
## 1 "ETS(Turnover_sa ~ error(\"M\") + trend(\"A\") + season(\"M\"))" Traini~ 0.122
fc %>% accuracy(myseries) %>%
  select(.model, .type, RMSE)
## # A tibble: 1 x 3
##   .model                                                           .type  RMSE
##   <chr>                                                            <chr> <dbl>
## 1 "ETS(Turnover_sa ~ error(\"M\") + trend(\"A\") + season(\"M\"))" Test  0.288

At first I applied the Box-Cox transformation and STL decomposition only to the training data, but that produced an unusually large RMSE on the test data (with a smaller RMSE on the training data); the erroneous code is kept in the chunk above. I was also unsure whether to do the STL decomposition before or after splitting the data, but chose to do the transformations before splitting, as the resulting forecasts made more sense.

I applied STL decomposition to the Box-Cox transformed data and then fitted an ETS model to the seasonally adjusted series. With this model the residuals are not white noise. The test RMSE is much smaller (0.2875567 versus 3.987378 for the previous model), although this comparison should be treated with caution because the new RMSE is measured on the Box-Cox transformed, seasonally adjusted scale rather than on the original Turnover scale.
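
An alternative worth noting: fabletools provides decomposition_model(), which fits models to the STL components and recombines their forecasts, so the test RMSE would be measured on the original Turnover scale and be directly comparable with the earlier models. A sketch only, assuming decomposition_model() back-transforms the Box-Cox response when recombining (the unmodelled seasonal component defaults to SNAIVE()):

# uses the lambda and myseries_train defined above
dcmp_spec <- decomposition_model(
  STL(box_cox(Turnover, lambda) ~ season(window = "periodic"), robust = TRUE),
  ETS(season_adjust ~ error("A") + trend("A") + season("N"))
)

myseries_train %>%
  model(stl_ets = dcmp_spec) %>%
  forecast(new_data = anti_join(myseries, myseries_train)) %>%
  accuracy(myseries) %>%
  select(.model, .type, RMSE)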