DATA624: Homework 5

library(fpp3)
## -- Attaching packages ---------------------------------------------- fpp3 0.5 --
## v tibble      3.1.6     v tsibble     1.1.3
## v dplyr       1.0.7     v tsibbledata 0.4.1
## v tidyr       1.1.4     v feasts      0.3.0
## v lubridate   1.8.0     v fable       0.3.2
## v ggplot2     3.3.5     v fabletools  0.3.2
## -- Conflicts ------------------------------------------------- fpp3_conflicts --
## x lubridate::date()    masks base::date()
## x dplyr::filter()      masks stats::filter()
## x tsibble::intersect() masks base::intersect()
## x tsibble::interval()  masks lubridate::interval()
## x dplyr::lag()         masks stats::lag()
## x tsibble::setdiff()   masks base::setdiff()
## x tsibble::union()     masks base::union()

Task

Do exercises 8.1, 8.5, 8.6, 8.7, 8.8, 8.9 in Hyndman. Please submit both the link to your Rpubs and the .rmd file.

Exercises

8.1

Consider the the number of pigs slaughtered in Victoria, available in the aus_livestock dataset.

  1. Use the ETS() function to estimate the equivalent model for simple exponential smoothing. Find the optimal values of alpha and level 0, and generate forecasts for the next four months.

  2. Compute a 95% prediction interval for the first forecast using y +/- 1.96s where s is the standard deviation of the residuals. Compare your interval with the interval produced by R.

Part A

The alpha is 0.322 and the level 0 is 100647. The 4 month forecast is below.

victoria.pigs <- aus_livestock %>%
  filter(State == "Victoria", Animal == "Pigs")

fit <- victoria.pigs %>%
  model(M1 = ETS(Count ~ error("A") + trend("N") + season("N")))


fc <- fit %>%
  forecast(h = 4)
fc %>%
autoplot(victoria.pigs)

fit %>% coef()
## # A tibble: 2 x 5
##   Animal State    .model term    estimate
##   <fct>  <fct>    <chr>  <chr>      <dbl>
## 1 Pigs   Victoria M1     alpha      0.322
## 2 Pigs   Victoria M1     l[0]  100647.

Part B

  • Calculated Prediction Interval: [76854.452, 113518.663]
  • Predicted Prediction Interval:[76854.7889, 113518.32597]
report(fit)
## Series: Count 
## Model: ETS(A,N,N) 
##   Smoothing parameters:
##     alpha = 0.3221247 
## 
##   Initial states:
##      l[0]
##  100646.6
## 
##   sigma^2:  87480760
## 
##      AIC     AICc      BIC 
## 13737.10 13737.14 13750.07
sd <- sqrt(87480760)

upper <- round(fc$.mean[1] + 1.96 * sd, 3)
lower <- round(fc$.mean[1] - 1.96 * sd, 3)

model.ci <- fc %>% hilo(95) %>% pull('95%') %>% head(1)
paste0("Calculated Prediction Interval: [", lower, ", ", upper, "]")
## [1] "Calculated Prediction Interval: [76854.452, 113518.663]"
paste0("Predicted Prediction Interval:", model.ci)
## [1] "Predicted Prediction Interval:[76854.7888896402, 113518.325972343]95"

8.5

Data set global_economy contains the annual Exports from many countries. Select one country to analyse.

  1. Plot the Exports series and discuss the main features of the data.

  2. Use an ETS(A,N,N) model to forecast the series, and plot the forecasts.

  3. Compute the RMSE values for the training data.

  4. Compare the results to those from an ETS(A,A,N) model. (Remember that the trended model is using one more parameter than the simpler model.) Discuss the merits of the two forecasting methods for this data set.

  5. Compare the forecasts from both methods. Which do you think is best?

  6. Calculate a 95% prediction interval for the first forecast for each model, using the RMSE values and assuming normal errors. Compare your intervals with those produced using R.

Part A

The US exports have an upward trend from 1960 to 2016. Notable drops in exports during the 1980s and late 2000s, reflecting a recession occurred.

us.econ <- global_economy %>%
  filter(Country == "United States")

us.econ <- us.econ[-58,]

autoplot(us.econ, Exports)

Part B

fit <- us.econ %>%
  model(PartB = ETS(Exports ~ error("A") + trend("N") + season("N")))


fc <- fit %>%
  forecast(h = 4)


fc %>%
autoplot(us.econ) +
  ggtitle("ETS(A, N, N) Forecast")

Part C

The RMSE is 0.627 for the model.

accuracy(fit)
## # A tibble: 1 x 11
##   Country       .model .type       ME  RMSE   MAE   MPE  MAPE  MASE RMSSE  ACF1
##   <fct>         <chr>  <chr>    <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
## 1 United States PartB  Training 0.125 0.627 0.465  1.37  5.10 0.990 0.992 0.239

Part D

The RMS for the ETS(A, A, N) is slightly better than that of the ETS(A, N, N) model. Generally, simple exponential smoothing (ETS(A, N, N)) is a better fitting model when the data has no clear trend or seasonal pattern. The Holt’s Linear Trend method (ETS(A, A, N)) contains another optimizable equation to account for trends in the data. Therefore the ETS(A, A, N) model is preferred.

fit2 <- us.econ %>%
  model(PartD = ETS(Exports ~ error("A") + trend("A") + season("N")))


fc2 <- fit2 %>%
  forecast(h = 4)


fc2 %>%
autoplot(us.econ) +
  ggtitle("ETS(A, A, N) Forecast")

rbind(data.frame(accuracy(fit)), data.frame(accuracy(fit2)))
##         Country .model    .type         ME      RMSE       MAE         MPE
## 1 United States  PartB Training 0.12500385 0.6270672 0.4650723  1.37073229
## 2 United States  PartD Training 0.01075622 0.6149566 0.4659441 -0.05696615
##       MAPE      MASE     RMSSE      ACF1
## 1 5.096552 0.9900993 0.9921329 0.2387874
## 2 5.188477 0.9919554 0.9729717 0.2375941

Part E

As mentioned above, the model created in part D accounts for the trend in the data. US exports appear to be trending upward, therefore I would select the ETS(A, A, N) model for this timeseries.

fit3 <- us.econ %>%
  model(PartB = ETS(Exports ~ error("A") + trend("N") + season("N")),
        PartD = ETS(Exports ~ error("A") + trend("A") + season("N")))


fc3 <- fit3 %>%
  forecast(h = 4)


fc3 %>%
autoplot(us.econ) +
  ggtitle("Part B and Part D Forecast Comparison")

Part F

The calculated prediction intervals for both models are very similar to the one produced in R.

  • Part B: Calculated Prediction Interval: [10.639, 13.142]
  • Part B: Predicted Prediction Interval:[10.6395079134527, 13.1418592559602]
  • Part D: Calculated Prediction Interval: [10.757, 13.257]
  • Part D: Predicted Prediction Interval:[10.7566652294566, 13.2565616708691]
glance(fit3)
## # A tibble: 2 x 10
##   Country       .model sigma2 log_lik   AIC  AICc   BIC   MSE  AMSE   MAE
##   <fct>         <chr>   <dbl>   <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
## 1 United States PartB   0.408   -88.6  183.  184.  189. 0.393 0.989 0.465
## 2 United States PartD   0.407   -87.5  185.  186.  195. 0.378 0.903 0.466
sd <- sqrt(0.4075120)

upper <- round(fc$.mean[1] + 1.96 * sd, 3)
lower <- round(fc$.mean[1] - 1.96 * sd, 3)

model.ci <- fc %>% hilo(95) %>% pull('95%') %>% head(1)
sd2 <- sqrt(0.4067128)

upper2 <- round(fc2$.mean[1] + 1.96 * sd2, 3)
lower2 <- round(fc2$.mean[1] - 1.96 * sd2, 3)

model.ci2 <- fc2 %>% hilo(95) %>% pull('95%') %>% head(1)
paste0("Part B: Calculated Prediction Interval: [", lower, ", ", upper, "]")
## [1] "Part B: Calculated Prediction Interval: [10.639, 13.142]"
paste0("Part B: Predicted Prediction Interval:", model.ci)
## [1] "Part B: Predicted Prediction Interval:[10.6395079134527, 13.1418592559602]95"
paste0("Part D: Calculated Prediction Interval: [", lower2, ", ", upper2, "]")
## [1] "Part D: Calculated Prediction Interval: [10.757, 13.257]"
paste0("Part D: Predicted Prediction Interval:", model.ci2)
## [1] "Part D: Predicted Prediction Interval:[10.7566652294566, 13.2565616708691]95"

8.6

Forecast the Chinese GDP from the global_economy data set using an ETS model. Experiment with the various options in the ETS() function to see how much the forecasts change with damped trend, or with a Box-Cox transformation. Try to develop an intuition of what each is doing to the forecasts.

[Hint: use a relatively large value of h when forecasting, so you can clearly see the differences between the various options when plotting the forecasts.]

  • Alpha effects the level of the forecast. Low alpha means more weight is applied to older observations. High alpha means that more weight is given to newer observations
  • Beta effects the slope of the forecast. Low beta means more weight is applied to older observations. High beta means that more weight is given to newer observations
  • Phi effects the severity of the damping effect. Low phi caues the damping effect to occur sooner. High phi pushes the damping effect further into the future.
  • Boxcox seems to provide an exponential forecast that is much higher than the other methods.
alpha = 0.1
beta = 0.1
phi = 0.1



china.data <- global_economy %>%
  filter(Country == "China")

lambda <- china.data %>%
  features(GDP, features = guerrero) %>%
  pull(lambda_guerrero)



fit1 <- china.data %>%
  model(`Simple Exponential Smoothing` = ETS(GDP ~ error("A") + trend("N", alpha = alpha) + season("N")),
        `Holts method` = ETS(GDP ~ error("A") + trend("A", alpha = alpha, beta = beta) + season("N")),
        `Damped Holt's method` = ETS(GDP ~ error("A") + trend("Ad", alpha = alpha, beta = beta, phi = phi) + season("N")),
        `Box Cox` = ETS(box_cox(GDP,lambda) ~ error("A") + trend("A", alpha = alpha, beta = beta) + season("N")),
        `Damped Box Cox` = ETS(box_cox(GDP,lambda) ~ error("A") + trend("Ad", alpha = alpha, beta = beta, phi = phi) + season("N"))) %>%
  forecast(h = 10) %>%
  autoplot(china.data) +
  ggtitle("China Data")


fit1

alpha = 0.5
beta = 0.5
phi = 0.5



china.data <- global_economy %>%
  filter(Country == "China")

lambda <- china.data %>%
  features(GDP, features = guerrero) %>%
  pull(lambda_guerrero)



fit2 <- china.data %>%
  model(`Simple Exponential Smoothing` = ETS(GDP ~ error("A") + trend("N", alpha = alpha) + season("N")),
        `Holts method` = ETS(GDP ~ error("A") + trend("A", alpha = alpha, beta = beta) + season("N")),
        `Damped Holt's method` = ETS(GDP ~ error("A") + trend("Ad", alpha = alpha, beta = beta, phi = phi) + season("N")),
        `Box Cox` = ETS(box_cox(GDP,lambda) ~ error("A") + trend("A", alpha = alpha, beta = beta) + season("N")),
        `Damped Box Cox` = ETS(box_cox(GDP,lambda) ~ error("A") + trend("Ad", alpha = alpha, beta = beta, phi = phi) + season("N"))) %>%
  forecast(h = 10) %>%
  autoplot(china.data) +
  ggtitle("China Data")


fit2

alpha = 0.9
beta = 0.9
phi = 0.9



china.data <- global_economy %>%
  filter(Country == "China")

lambda <- china.data %>%
  features(GDP, features = guerrero) %>%
  pull(lambda_guerrero)


fit3 <- china.data %>%
  model(`Simple Exponential Smoothing` = ETS(GDP ~ error("A") + trend("N", alpha = alpha) + season("N")),
        `Holts method` = ETS(GDP ~ error("A") + trend("A", alpha = alpha, beta = beta) + season("N")),
        `Damped Holt's method` = ETS(GDP ~ error("A") + trend("Ad", alpha = alpha, beta = beta, phi = phi) + season("N")),
        `Box Cox` = ETS(box_cox(GDP,lambda) ~ error("A") + trend("A", alpha = alpha, beta = beta) + season("N")),
        `Damped Box Cox` = ETS(box_cox(GDP,lambda) ~ error("A") + trend("Ad", alpha = alpha, beta = beta, phi = phi) + season("N"))) %>%
  forecast(h = 10) %>%
  autoplot(china.data) +
  ggtitle("China Data")


fit3

8.7

Find an ETS model for the Gas data from aus_production and forecast the next few years. Why is multiplicative seasonality necessary here? Experiment with making the trend damped. Does it improve the forecasts?

Multiplicative seasonality is necessary because the seasonality increases as over time. Additive would be preferred if the seasonality variability was fairly constant quarter after quarter. Both the Multiplicative and Damped Multiplicative models performed almost identical based on the plot and accuracy metrics. I would say that the damped effec does not improve the forecasts within the next few years.

aus_production <- aus_production


fit <- aus_production %>%
  model(Multiplicative = ETS(Gas ~ error("M") + trend("A") + season("M")),
        `Damped Multiplicative` = ETS(Gas ~ error("M") + trend("Ad") + season("M")))


fc <- fit %>%
  forecast(h = 16)
fc %>%
autoplot(aus_production)

data.frame(accuracy(fit))
##                  .model    .type           ME     RMSE      MAE       MPE
## 1        Multiplicative Training -0.114886465 4.595113 3.021727 0.1987614
## 2 Damped Multiplicative Training -0.004385562 4.591840 3.031478 0.3258961
##       MAPE      MASE     RMSSE        ACF1
## 1 4.082508 0.5420365 0.6059856 -0.01306290
## 2 4.104484 0.5437856 0.6055540 -0.02170345

8.8

Recall your retail time series data (from Exercise 8 in Section 2.10).

  1. Why is multiplicative seasonality necessary for this series?

  2. Apply Holt-Winters’ multiplicative method to the data. Experiment with making the trend damped.

  3. Compare the RMSE of the one-step forecasts from the two methods. Which do you prefer?

  4. Check that the residuals from the best method look like white noise.

  5. Now find the test set RMSE, while training the model to the end of 2010. Can you beat the seasonal naïve approach from Exercise 7 in Section 5.11?

Part A

The variation in the seasonality changes over time, therefore a multiplicative seasonality component is preferred.

set.seed(15)

myseries <- aus_retail %>%
  filter(`Series ID` == sample(aus_retail$`Series ID`,1))

autoplot(myseries, Turnover)

Part B

The damped method appears to provide consistently lower projections than its non-damped counterpart model.

fit <- myseries %>%
  model(`Holt Winters Multiplicative` = ETS(Turnover ~ error("M") + trend("A") + season("M")),
        `Damped Multiplicative` = ETS(Turnover ~ error("M") + trend("Ad") + season("M")))


fc <- fit %>%
  forecast(h = 20)

fc %>%
  autoplot(myseries)

Part C

The Damped Multiplicative model has a lowest RMSE, therefore it fits the training data the best. I would prefer using the Damped Multiplicative model as it appears to be a better fit for the timeseries.

accuracy(fit)
## # A tibble: 2 x 12
##   State  Industry .model .type    ME  RMSE   MAE    MPE  MAPE  MASE RMSSE   ACF1
##   <chr>  <chr>    <chr>  <chr> <dbl> <dbl> <dbl>  <dbl> <dbl> <dbl> <dbl>  <dbl>
## 1 New S~ Furnitu~ Holt ~ Trai~ 0.759  11.3  7.96 0.0164  4.15 0.524 0.534 0.190 
## 2 New S~ Furnitu~ Dampe~ Trai~ 0.949  11.1  7.83 0.204   4.08 0.515 0.524 0.0261

Part D

The innovation residuals plot shows many spikes in the data, meaning that the residual variance fluctuates. Many lags are outside the blue interval, pointing toward autocorrelation. The histogram appears to be near normal wih a handful of outliers on each side of the mean. The statistical tests run below both reject the null hypothesis (pvalue < 0.05), indicating the possibility of non-zero autocorrelation within the first 24 lags. The residuals appear to not b white noise, indicating that there is most likely a more superior model to use to forecast this data.

fit <- myseries %>%
  model(`Damped Multiplicative` = ETS(Turnover ~ error("M") + trend("Ad") + season("M")))

fit %>%
  gg_tsresiduals()

myseries %>% features(Turnover, box_pierce, lag = 24)
## # A tibble: 1 x 4
##   State           Industry                                     bp_stat bp_pvalue
##   <chr>           <chr>                                          <dbl>     <dbl>
## 1 New South Wales Furniture, floor coverings, houseware and t~   7963.         0
myseries %>% features(Turnover, ljung_box, lag = 24)
## # A tibble: 1 x 4
##   State           Industry                                     lb_stat lb_pvalue
##   <chr>           <chr>                                          <dbl>     <dbl>
## 1 New South Wales Furniture, floor coverings, houseware and t~   8219.         0

Part E

Both exponential smoothing approaches produced a better forecast than the naive approach. The Damped Multiplicative model was only a slightly better fit than the naive method, whereas the Holts-Winters Multiplicative model performed the best by far on the test data.

train.data <- myseries %>%
  filter(year(Month) <= 2010)

test.data <- myseries %>%
  filter(year(Month) > 2010)

fit <- train.data %>%
  model(`Holt Winters Multiplicative` = ETS(Turnover ~ error("M") + trend("A") + season("M")),
        `Damped Multiplicative` = ETS(Turnover ~ error("M") + trend("Ad") + season("M")),
        `Seasonal Naive` = SNAIVE(Turnover))

fc <- fit %>%
  forecast(test.data)


fc %>%
  autoplot(myseries, level = NULL)

fc %>% accuracy(test.data)
## # A tibble: 3 x 12
##   .model  State  Industry  .type    ME  RMSE   MAE   MPE  MAPE  MASE RMSSE  ACF1
##   <chr>   <chr>  <chr>     <chr> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
## 1 Damped~ New S~ Furnitur~ Test   57.5  71.4  58.5 14.4  14.7    NaN   NaN 0.895
## 2 Holt W~ New S~ Furnitur~ Test   12.4  24.3  19.8  2.78  5.21   NaN   NaN 0.686
## 3 Season~ New S~ Furnitur~ Test   66.9  80.9  67.5 16.9  17.1    NaN   NaN 0.902

8.9

For the same retail data, try an STL decomposition applied to the Box-Cox transformed series, followed by ETS on the seasonally adjusted data. How does that compare with your best previous forecasts on the test set?

I was unable to successfully compare the test set results from the STL model to the other models, instead I will compare the training set results. The STL decomposition provided the best RMSE out of all of the models. The both boxcox transformation models were more effective than the non-transformed model. A combination of model complexity and data manipulation is most likely to produce an optimal forecasting result. It should be noted that boxcox transformation effects the readability of the model results, making it less practical in certain situations. It is interesting to view the forecast vs the actual timeseries for the test data. The best fit given the train data is the simple boxcox though the true best model is the complex boxcox model.

lambda <- train.data %>%
  features(Turnover, features = guerrero) %>%
  pull(lambda_guerrero)

stl.model <- train.data %>%
model(`STL Boxcox` = STL(box_cox(Turnover, lambda)))

fit <- train.data %>%
  model(`Simple ETS Boxcox` =ETS(box_cox(Turnover, lambda)),
        `Complex ETS Boxcox` =ETS(box_cox(Turnover, lambda) ~ error("M") + trend("A") + season("M")),
        `Non Transformed Holt Winters Multiplicative` = ETS(Turnover ~ error("M") + trend("A") + season("M")))
stl.model %>%
  components() %>%
  autoplot()

fc <- fit %>%
  forecast(test.data)


fc %>%
  autoplot(myseries, level = NULL)

rbind(accuracy(stl.model), accuracy(fit))
## # A tibble: 4 x 12
##   State  Industry    .model  .type      ME  RMSE   MAE     MPE  MAPE  MASE RMSSE
##   <chr>  <chr>       <chr>   <chr>   <dbl> <dbl> <dbl>   <dbl> <dbl> <dbl> <dbl>
## 1 New S~ Furniture,~ STL Bo~ Trai~  0.103   6.88  4.92 -0.148   3.17 0.370 0.374
## 2 New S~ Furniture,~ Simple~ Trai~  1.41    9.71  6.54  0.659   4.16 0.491 0.529
## 3 New S~ Furniture,~ Comple~ Trai~  0.253  10.3   7.13 -0.0143  4.56 0.536 0.563
## 4 New S~ Furniture,~ Non Tr~ Trai~ -0.0511 10.2   7.02 -0.378   4.52 0.528 0.556
## # ... with 1 more variable: ACF1 <dbl>

Error received when trying to get test data forecasting for stl model.

fc <- stl.model %>%
  forecast(test.data)
## Error in `mutate_cols()`:
## ! Problem with `mutate()` column `STL Boxcox`.
## i `STL Boxcox = (function (object, ...) ...`.
## x no applicable method for 'forecast' applied to an object of class "stl_decomposition"
## Caused by error in `UseMethod()`:
## ! no applicable method for 'forecast' applied to an object of class "stl_decomposition"