DATA624 Homework 5 Exponential Smoothing

Assignment: Exercises 7.1, 7.5, 7.6, 7.7, 7.8 and 7.9 from the HA textbook

library(fpp2)

## Loading required package: ggplot2

## Loading required package: forecast

## Loading required package: fma

## Loading required package: expsmooth

7.1 Consider the pigs series — the number of pigs slaughtered in Victoria each month.

a. Use the ses() function in R to find the optimal values of α and ℓ0, and generate forecasts for the next four months.

The optimal value of α is 0.2971 and ℓ0 is 77260.0561. The forecasts for the following four months are also displayed below.

help(pigs)
fc <- ses(pigs, h=4)
summary(fc)

## 
## Forecast method: Simple exponential smoothing
## 
## Model Information:
## Simple exponential smoothing 
## 
## Call:
##  ses(y = pigs, h = 4) 
## 
##   Smoothing parameters:
##     alpha = 0.2971 
## 
##   Initial states:
##     l = 77260.0561 
## 
##   sigma:  10308.58
## 
##      AIC     AICc      BIC 
## 4462.955 4463.086 4472.665 
## 
## Error measures:
##                    ME    RMSE      MAE       MPE     MAPE      MASE
## Training set 385.8721 10253.6 7961.383 -0.922652 9.274016 0.7966249
##                    ACF1
## Training set 0.01282239
## 
## Forecasts:
##          Point Forecast    Lo 80    Hi 80    Lo 95    Hi 95
## Sep 1995       98816.41 85605.43 112027.4 78611.97 119020.8
## Oct 1995       98816.41 85034.52 112598.3 77738.83 119894.0
## Nov 1995       98816.41 84486.34 113146.5 76900.46 120732.4
## Dec 1995       98816.41 83958.37 113674.4 76092.99 121539.8
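To see what ses() is estimating, the level recursion ℓ_t = α·y_t + (1 − α)·ℓ_{t−1} can be run by hand. The following is a minimal base-R sketch, not the forecast package's implementation: the helper name ses_manual and the toy series are hypothetical, and α and ℓ0 are supplied by hand rather than optimised as ses() does.

```r
# Minimal SES sketch (hypothetical helper; alpha and l0 are given, not estimated).
ses_manual <- function(y, alpha, l0, h = 4) {
  level <- l0
  fitted <- numeric(length(y))
  for (t in seq_along(y)) {
    fitted[t] <- level                            # one-step forecast of y[t]
    level <- alpha * y[t] + (1 - alpha) * level   # update the level with the new observation
  }
  # SES forecasts are flat: every horizon gets the final level
  list(fitted = fitted, level = level, mean = rep(level, h))
}

y_toy <- c(10, 12, 11, 13, 12)   # toy data, not the pigs series
fit <- ses_manual(y_toy, alpha = 0.5, l0 = 10)
fit$mean
```

The flat vector in fit$mean mirrors the identical point forecasts for Sep to Dec 1995 in the output above.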

Compute a 95% prediction interval for the first forecast using ŷ ± 1.96s, where s is the standard deviation of the residuals. Compare your interval with the interval produced by R.

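The manual interval (78679.97 to 118952.8) is very close to, but slightly narrower than, the interval produced by R (78611.97 to 119020.8). R builds its interval from the model's estimated sigma (10308.58 in the summary above), which is slightly larger than the sample standard deviation of the residuals used here.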

s <- sd(fc$residuals)
lower <- fc$mean[1] - 1.96*s
upper <- fc$mean[1] + 1.96*s
lower

## [1] 78679.97

upper

## [1] 118952.8
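As a rough check on where R's interval comes from, the arithmetic can be repeated with the numbers printed by summary(fc). This sketch rests on two assumptions that are not stated in the output: that the series has 188 monthly observations, and that sigma is the RMSE rescaled for two estimated parameters (α and ℓ0).

```r
# Values copied from the summary(fc) output above.
point <- 98816.41    # first point forecast (Sep 1995)
sigma <- 10308.58    # sigma reported by the model
rmse  <- 10253.6     # training RMSE
n_obs <- 188         # assumption: number of monthly observations
n_par <- 2           # assumption: alpha and the initial level

z <- qnorm(0.975)    # normal quantile, approximately 1.96
interval_r <- c(lower = point - z * sigma, upper = point + z * sigma)
interval_r

# Assumed degrees-of-freedom link between RMSE and sigma
sigma_check <- rmse * sqrt(n_obs / (n_obs - n_par))
sigma_check
```

If the assumptions hold, interval_r should agree closely with the Lo 95 and Hi 95 columns for Sep 1995, and sigma_check with the reported sigma.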

7.5 Data set books contains the daily sales of paperback and hardcover books at the same store. The task is to forecast the next four days’ sales for paperback and hardcover books.

Plot the series and discuss the main features of the data.

There is an upward trend in the sales of both paperback and hardcover books at this store. There are times when there is an inverse relationship between these two types of sales, but for the most part they move in the same direction.

autoplot(books)

Use the ses() function to forecast each series, and plot the forecasts.

Because SES has no trend or seasonal component, its forecast function is flat: within each series the point forecast is the same on all four days.

paper.ses <- ses(books[,"Paperback"], h=4) 
hard.ses <- ses(books[,"Hardcover"], h=4)
autoplot(books) + autolayer(paper.ses, series = "Paperback", PI=FALSE) + autolayer(hard.ses, series = "Hardcover", PI=FALSE)

Compute the RMSE values for the training data in each case.

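The training RMSE is lower for hardcover (31.93) than for paperback (33.64), so SES fits the hardcover series slightly better in absolute terms. RMSE is scale-dependent, however, so a comparison across two different series is only a rough guide.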

accuracy(paper.ses)

##                    ME     RMSE     MAE       MPE     MAPE      MASE
## Training set 7.175981 33.63769 27.8431 0.4736071 15.57784 0.7021303
##                    ACF1
## Training set -0.2117522

accuracy(hard.ses)

##                    ME     RMSE      MAE      MPE     MAPE      MASE
## Training set 9.166735 31.93101 26.77319 2.636189 13.39487 0.7987887
##                    ACF1
## Training set -0.1417763
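The RMSE and MAE columns reported by accuracy() are simple functions of the one-step training residuals. A small sketch with made-up numbers (not the books data) shows the calculation; for the fitted objects above, the same formula would presumably be applied to paper.ses$residuals and hard.ses$residuals.

```r
# Toy one-step fits (hypothetical values, for illustration only).
actual <- c(200, 180, 210, 190)
fitted <- c(190, 190, 200, 196)

e <- actual - fitted            # one-step residuals
rmse <- sqrt(mean(e^2))         # root mean squared error
mae  <- mean(abs(e))            # mean absolute error
c(RMSE = rmse, MAE = mae)
```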

7.6 We will continue with the daily sales of paperback and hardcover books in data set books.

Apply Holt’s linear method to the paperback and hardback series and compute four-day forecasts in each case.

Unlike the flat SES forecasts, Holt's linear method produces forecasts that slope upward, because it adds a trend component to the level.

paper.holt <- holt(books[,"Paperback"], h=4) 
hard.holt <- holt(books[,"Hardcover"], h=4)
autoplot(books) + autolayer(paper.holt, series = "Paperback", PI=FALSE) + autolayer(hard.holt, series = "Hardcover", PI=FALSE)
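The level and trend updates behind holt() can be sketched in a few lines of base R. This is an illustrative helper with a hypothetical name (holt_manual); the smoothing parameters and initial states are supplied by hand here, whereas holt() estimates them from the data.

```r
# Minimal sketch of Holt's linear method (parameters supplied, not estimated).
holt_manual <- function(y, alpha, beta, l0, b0, h = 4) {
  level <- l0
  trend <- b0
  for (t in seq_along(y)) {
    prev_level <- level
    level <- alpha * y[t] + (1 - alpha) * (prev_level + trend)   # level update
    trend <- beta * (level - prev_level) + (1 - beta) * trend    # trend update
  }
  level + seq_len(h) * trend    # h-step forecasts: final level plus h times final trend
}

y_toy <- c(12, 14, 16, 18)      # toy series rising by 2 per step
fc_toy <- holt_manual(y_toy, alpha = 0.5, beta = 0.5, l0 = 10, b0 = 2)
fc_toy
```

Because the toy series rises by exactly 2 per step, the forecasts continue the same straight line, which is the behaviour that separates Holt's method from SES.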

Compare the RMSE measures of Holt’s method for the two series to those of simple exponential smoothing in the previous question. (Remember that Holt’s method is using one more parameter than SES.) Discuss the merits of the two forecasting methods for these data sets.

Holt's method lowers the training RMSE for both series relative to SES: from 33.64 to 31.14 for paperback and from 31.93 to 27.19 for hardcover. Some improvement is expected because Holt's method has an extra smoothing parameter, but it also captures the upward trend, which was the most apparent feature of both series when we first plotted them.

accuracy(paper.holt)

##                     ME     RMSE      MAE       MPE     MAPE      MASE
## Training set -3.717178 31.13692 26.18083 -5.508526 15.58354 0.6602122
##                    ACF1
## Training set -0.1750792

accuracy(hard.holt)

##                      ME     RMSE      MAE       MPE    MAPE      MASE
## Training set -0.1357882 27.19358 23.15557 -2.114792 12.1626 0.6908555
##                     ACF1
## Training set -0.03245186

Compare the forecasts for the two series using both methods. Which do you think is best?

I think the Holt method is best for both of these series as it accounts for the trend and results in lower RMSE values.

Calculate a 95% prediction interval for the first forecast for each series, using the RMSE values and assuming normal errors. Compare your intervals with those produced using ses and holt.

For the hardcover series, the interval calculated from the RMSE (196.87 to 303.47) is narrower than the one produced by holt (192.92 to 307.43), which in turn is narrower than the one produced by ses (174.78 to 304.34).

hard.s <- sqrt(hard.holt$model$mse)
hard.lower <- hard.holt$mean[1] - 1.96*hard.s
hard.upper <- hard.holt$mean[1] + 1.96*hard.s
hard.lower

## [1] 196.8745

hard.upper

## [1] 303.4733

hard.ses

##    Point Forecast    Lo 80    Hi 80    Lo 95    Hi 95
## 31       239.5601 197.2026 281.9176 174.7799 304.3403
## 32       239.5601 194.9788 284.1414 171.3788 307.7414
## 33       239.5601 192.8607 286.2595 168.1396 310.9806
## 34       239.5601 190.8347 288.2855 165.0410 314.0792

hard.holt

##    Point Forecast    Lo 80    Hi 80    Lo 95    Hi 95
## 31       250.1739 212.7390 287.6087 192.9222 307.4256
## 32       253.4765 216.0416 290.9113 196.2248 310.7282
## 33       256.7791 219.3442 294.2140 199.5274 314.0308
## 34       260.0817 222.6468 297.5166 202.8300 317.3334

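The same ordering holds for the paperback series: the calculated interval (148.44 to 270.50) is narrower than the holt interval (143.91 to 275.02), which is narrower than the ses interval (138.87 to 275.35).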

paper.s <- sqrt(paper.holt$model$mse)
paper.lower <- paper.holt$mean[1] - 1.96*paper.s
paper.upper <- paper.holt$mean[1] + 1.96*paper.s
paper.lower

## [1] 148.4384

paper.upper

## [1] 270.4951

paper.ses

##    Point Forecast    Lo 80    Hi 80    Lo 95    Hi 95
## 31       207.1097 162.4882 251.7311 138.8670 275.3523
## 32       207.1097 161.8589 252.3604 137.9046 276.3147
## 33       207.1097 161.2382 252.9811 136.9554 277.2639
## 34       207.1097 160.6259 253.5935 136.0188 278.2005

paper.holt

##    Point Forecast    Lo 80    Hi 80    Lo 95    Hi 95
## 31       209.4668 166.6035 252.3301 143.9130 275.0205
## 32       210.7177 167.8544 253.5811 145.1640 276.2715
## 33       211.9687 169.1054 254.8320 146.4149 277.5225
## 34       213.2197 170.3564 256.0830 147.6659 278.7735
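The width comparison can be made explicit by subtracting the bounds printed above. The numbers in this sketch are copied from the day-31 rows of the output; the row labels are just illustrative names.

```r
# 95% bounds for the first forecast (day 31), copied from the printed output.
bounds <- rbind(
  hard_manual  = c(196.8745, 303.4733),
  hard_holt    = c(192.9222, 307.4256),
  hard_ses     = c(174.7799, 304.3403),
  paper_manual = c(148.4384, 270.4951),
  paper_holt   = c(143.9130, 275.0205),
  paper_ses    = c(138.8670, 275.3523)
)
colnames(bounds) <- c("lower", "upper")

widths <- bounds[, "upper"] - bounds[, "lower"]   # interval width per method
round(widths, 2)
```

Note that the manual intervals are centred on the Holt point forecasts, so only their widths, not their positions, are directly comparable with the ses intervals.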

7.7 For this exercise use data set eggs, the price of a dozen eggs in the United States from 1900–1993. Experiment with the various options in the holt() function to see how much the forecasts change with damped trend, or with a Box-Cox transformation. Try to develop an intuition of what each argument is doing to the forecasts.

[Hint: use h=100 when calling holt() so you can clearly see the differences between the various options when plotting the forecasts.]
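One way to build intuition for the damped argument is to compare the two forecast paths directly. In the damped method the h-step forecast is the final level plus (φ + φ² + … + φ^h) times the final trend, so it flattens out instead of continuing in a straight line. The sketch below uses made-up values for the level, trend and φ; they are not estimates from the eggs series.

```r
# Sketch: linear versus damped trend extrapolation (toy level, trend and phi).
damped_path <- function(level, trend, phi, h) {
  level + cumsum(phi^seq_len(h)) * trend   # level + (phi + ... + phi^h) * trend
}

level <- 100
trend <- -2
h <- 100

linear <- level + seq_len(h) * trend                   # undamped: straight line
damped <- damped_path(level, trend, phi = 0.9, h = h)  # damped: levels off

c(linear_h100 = linear[h],
  damped_h100 = damped[h],
  damped_limit = level + trend * 0.9 / (1 - 0.9))
```

With a declining trend and a long horizon, the undamped path can cross zero, while the damped path converges to a finite limit. This is why h=100 makes the difference between the options easy to see.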

Which model gives the best RMSE?

After plotting the forecasts from Holt's method, Holt's method with a damped trend, and Holt's method with a Box-Cox transformation, the Box-Cox forecast looks, at least visually, the most plausible. The RMSE comparison agrees: the Box-Cox model has the lowest training RMSE (26.39), followed by the damped model (26.54) and plain Holt (26.58), although the differences are small.

eggs.holt <- holt(eggs, h=100)
eggs.holt.damped <- holt(eggs, damped = TRUE, h=100)
eggs.holt.boxcox <- holt(eggs, lambda = BoxCox.lambda(eggs), h=100)
autoplot(eggs) + autolayer(eggs.holt, series = "Holt", PI = FALSE) + autolayer(eggs.holt.damped, series = "Holt Damped", PI = FALSE) + autolayer(eggs.holt.boxcox, series = "Holt Box-Cox", PI = FALSE)

accuracy(eggs.holt)

##                      ME     RMSE      MAE       MPE     MAPE      MASE
## Training set 0.04499087 26.58219 19.18491 -1.142201 9.653791 0.9463626
##                    ACF1
## Training set 0.01348202

accuracy(eggs.holt.damped)

##                     ME     RMSE     MAE       MPE     MAPE      MASE
## Training set -2.891496 26.54019 19.2795 -2.907633 10.01894 0.9510287
##                      ACF1
## Training set -0.003195358

accuracy(eggs.holt.boxcox)

##                     ME     RMSE      MAE       MPE     MAPE      MASE
## Training set 0.7736844 26.39376 18.96387 -1.072416 9.620095 0.9354593
##                    ACF1
## Training set 0.03887152
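For reference, the Box-Cox transformation selected through the lambda argument can be written out in base R. This is a simplified sketch of the standard formula for positive data with hypothetical helper names (box_cox, inv_box_cox); it is not the forecast package's own BoxCox() code.

```r
# Box-Cox transform and its inverse for positive data (simplified sketch).
box_cox <- function(y, lambda) {
  if (lambda == 0) log(y) else (y^lambda - 1) / lambda
}
inv_box_cox <- function(w, lambda) {
  if (lambda == 0) exp(w) else (lambda * w + 1)^(1 / lambda)
}

y_toy <- c(50, 100, 200, 400)          # toy values, not the eggs series
w <- box_cox(y_toy, lambda = 0.5)      # model is fitted on this scale
back <- inv_box_cox(w, lambda = 0.5)   # forecasts are mapped back to the original scale
all.equal(back, y_toy)
```

The model is fitted on the transformed scale and the forecasts are back-transformed, which is why the Box-Cox forecast path bends rather than following a straight line.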

7.8 Recall your retail time series data (from Exercise 3 in Section 2.10).

First, we retrieve the retail series that we previously used and plot it.

retaildata <- readxl::read_excel("retail.xlsx", skip=1) #The second argument (skip=1) is required because the Excel sheet has two header rows.

myts <- ts(retaildata[,"A3349415T"],
  frequency=12, start=c(1982,4))

autoplot(myts) + ggtitle("Australian Retail")

Why is multiplicative seasonality necessary for this series?

Multiplicative seasonality is necessary for this series because the size of the seasonal variation grows with the level of the series, which rises over time.
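A toy example can make the distinction concrete. In the sketch below (simulated values, not the retail data), the same seasonal pattern is combined with a rising level in two ways: added on, and multiplied in. Only the multiplicative version shows a seasonal swing that grows with the level.

```r
# Toy comparison of additive and multiplicative seasonality (simulated data).
level  <- rep(c(100, 200, 400), each = 12)           # level doubles each "year"
season <- rep(sin(2 * pi * (1:12) / 12), times = 3)  # one seasonal cycle per year
year   <- rep(1:3, each = 12)

additive       <- level + 20 * season                # constant seasonal swing
multiplicative <- level * (1 + 0.2 * season)         # swing proportional to level

# Peak-to-trough range within each year
swing <- function(x) tapply(x, year, function(v) max(v) - min(v))
rbind(additive = swing(additive), multiplicative = swing(multiplicative))
```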

Apply Holt-Winters’ multiplicative method to the data. Experiment with making the trend damped.

The damped method produces a forecast that is lower than the one with the non-damped method.

myts.hw <- hw(myts, seasonal = "multiplicative")
myts.hw.damped <- hw(myts, seasonal = "multiplicative", damped=TRUE)
autoplot(myts) + autolayer(myts.hw, series = "Holt Multiplicative", PI = FALSE) + autolayer(myts.hw.damped, series = "Holt Multiplicative Damped", PI = FALSE)

Compare the RMSE of the one-step forecasts from the two methods. Which do you prefer?

The two RMSE values are nearly identical (4.772 without damping and 4.766 with damping), but I prefer the damped method, because I would expect the growth in these sales to slow rather than continue at the same rate.

accuracy(myts.hw)

##                     ME     RMSE      MAE        MPE     MAPE      MASE
## Training set -0.151369 4.772332 3.419357 -0.6941477 5.761196 0.6176838
##                   ACF1
## Training set 0.2402272

accuracy(myts.hw.damped)

##                     ME     RMSE      MAE        MPE     MAPE     MASE
## Training set 0.2893388 4.765953 3.363391 0.09551623 5.817612 0.607574
##                   ACF1
## Training set 0.1913501

Check that the residuals from the best method look like white noise.

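These residuals do not look like white noise: the Ljung-Box test below gives a p-value below 2.2e-16, so the hypothesis of no autocorrelation is rejected.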

checkresiduals(myts.hw.damped)

## 
##  Ljung-Box test
## 
## data:  Residuals from Damped Holt-Winters' multiplicative method
## Q* = 200.39, df = 7, p-value < 2.2e-16
## 
## Model df: 17.   Total lags used: 24
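The Ljung-Box test reported by checkresiduals() is also available in base R as Box.test(). The sketch below applies it to simulated series rather than to the retail residuals, to show how the statistic behaves with and without autocorrelation. In the output above, the degrees of freedom are the number of lags minus the model degrees of freedom (24 − 17 = 7), which corresponds to the fitdf argument.

```r
set.seed(624)                      # arbitrary seed, for reproducibility only
noise <- rnorm(200)                # simulated white noise
walk  <- cumsum(noise)             # random walk: strongly autocorrelated

lb_noise <- Box.test(noise, lag = 24, type = "Ljung-Box", fitdf = 0)
lb_walk  <- Box.test(walk,  lag = 24, type = "Ljung-Box", fitdf = 0)

lb_noise$parameter                 # degrees of freedom = lag - fitdf
c(noise = lb_noise$p.value, walk = lb_walk$p.value)
```

A very small p-value, as for the random walk here and for the retail residuals above, indicates autocorrelation that the model has not captured.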

Now find the test set RMSE, while training the model to the end of 2010. Can you beat the seasonal naïve approach from Exercise 8 in Section 3.7?

The test set RMSE is 6.90, compared with 7.93 for the seasonal naive approach. The damped Holt-Winters method is therefore more accurate on the test set, though a bit more complicated.

train <- window(myts, end = c(2010,12))
test <- window(myts, start = 2011)
myts.hw.damped2 <- hw(train, seasonal = "multiplicative", damped = TRUE) 
autoplot(myts.hw.damped2, PI = FALSE)

accuracy(myts.hw.damped2, test)

##                      ME     RMSE      MAE         MPE     MAPE      MASE
## Training set  0.2114936 4.636978 3.248224 -0.09677749 5.971306 0.5891065
## Test set     -4.5627801 6.897783 5.586507 -5.70380935 6.806932 1.0131838
##                   ACF1 Theil's U
## Training set 0.1266117        NA
## Test set     0.1347234 0.3171307
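For context, the seasonal naive benchmark simply repeats the last observed year. A base-R sketch of that baseline and its test set RMSE is given below; it uses the built-in AirPassengers series as a stand-in, since the retail spreadsheet is not bundled with R, and the split dates are chosen only for illustration.

```r
# Seasonal naive baseline in base R (AirPassengers is a stand-in series).
y <- AirPassengers
train_ap <- window(y, end = c(1958, 12))
test_ap  <- window(y, start = c(1959, 1))

h <- length(test_ap)
last_year <- tail(as.numeric(train_ap), 12)     # final 12 training months
snaive_fc <- rep(last_year, length.out = h)     # repeat them across the horizon

rmse_test <- sqrt(mean((as.numeric(test_ap) - snaive_fc)^2))
rmse_test
```

Any candidate model should produce a lower test set RMSE than this benchmark to justify its extra complexity.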

7.9 For the same retail data, try an STL decomposition applied to the Box-Cox transformed series, followed by ETS on the seasonally adjusted data. How does that compare with your best previous forecasts on the test set?

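We can use stlf to decompose a series and forecast the seasonally adjusted series. This results in a test set RMSE of 11.76, which is higher than the RMSE from the damped Holt-Winters multiplicative method (6.90). Note that the call below leaves lambda at its default, so no Box-Cox transformation is applied; supplying lambda (for example, lambda = BoxCox.lambda(train)) would add one and would change these numbers.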

myts.stl <- stlf(train, method = 'ets')
autoplot(myts.stl, PI = FALSE)

accuracy(myts.stl, test)

##                       ME      RMSE       MAE         MPE      MAPE
## Training set  -0.1420295  4.032346  2.998777  -0.5162997  5.223302
## Test set     -10.0387852 11.758715 10.609265 -12.1097609 12.707122
##                   MASE      ACF1 Theil's U
## Training set 0.5438662 0.3095471        NA
## Test set     1.9241246 0.2663243 0.5555177
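The pipeline the exercise describes (transform, decompose with STL, forecast the seasonally adjusted series, then add the seasonal component back and undo the transformation) can be sketched with base R alone. This is only an approximation of what stlf does, under several substitutions: a log transform (Box-Cox with λ = 0) instead of an estimated λ, HoltWinters() with gamma = FALSE instead of ETS, and the built-in AirPassengers series instead of the retail data.

```r
# Rough base-R sketch of transform -> STL -> forecast adjusted series -> re-seasonalise.
y  <- AirPassengers                          # stand-in for the retail series
ly <- log(y)                                 # Box-Cox with lambda = 0

fit_stl  <- stl(ly, s.window = "periodic")   # STL decomposition on the transformed scale
seasonal <- fit_stl$time.series[, "seasonal"]
adj      <- ly - seasonal                    # seasonally adjusted series

hw_fit <- HoltWinters(adj, gamma = FALSE)    # Holt's linear method, standing in for ETS
h <- 12
adj_fc <- as.numeric(predict(hw_fit, n.ahead = h))

last_season <- as.numeric(tail(seasonal, 12))   # seasonal pattern of the final year
fc <- exp(adj_fc + last_season[seq_len(h)])     # re-seasonalise and back-transform
round(fc, 1)
```

Because the series ends in December, the final 12 seasonal values line up with the 12 forecast months; a series ending mid-year would need the seasonal values rotated accordingly.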

Omar Pineda

3/2/2020