Week 6 - Exponential Smoothing - Homework

C. Rosemond 10.04.20

library(fpp2)
library(readxl)

7.8.1

Consider the pigs series — the number of pigs slaughtered in Victoria each month.

#?pigs - monthly total number of pigs slaughtered in Victoria, AU (Jan 1980 - Aug 1995)
autoplot(pigs)

a. Use the ses() function in R to find the optimal values of α and ℓ0, and generate forecasts for the next four months.

pigs_ses <- ses(pigs, h = 4)
summary(pigs_ses)

## 
## Forecast method: Simple exponential smoothing
## 
## Model Information:
## Simple exponential smoothing 
## 
## Call:
##  ses(y = pigs, h = 4) 
## 
##   Smoothing parameters:
##     alpha = 0.2971 
## 
##   Initial states:
##     l = 77260.0561 
## 
##   sigma:  10308.58
## 
##      AIC     AICc      BIC 
## 4462.955 4463.086 4472.665 
## 
## Error measures:
##                    ME    RMSE      MAE       MPE     MAPE      MASE       ACF1
## Training set 385.8721 10253.6 7961.383 -0.922652 9.274016 0.7966249 0.01282239
## 
## Forecasts:
##          Point Forecast    Lo 80    Hi 80    Lo 95    Hi 95
## Sep 1995       98816.41 85605.43 112027.4 78611.97 119020.8
## Oct 1995       98816.41 85034.52 112598.3 77738.83 119894.0
## Nov 1995       98816.41 84486.34 113146.5 76900.46 120732.4
## Dec 1995       98816.41 83958.37 113674.4 76092.99 121539.8

autoplot(pigs_ses) +
  autolayer(fitted(pigs_ses), series="Fitted") +
  ylab("Number of pigs slaughtered") + xlab("Month")

The ses() function suggests an optimal \(\alpha\) of approximately 0.30 and an initial state l of approximately 77260.06. The point forecast for each of the four forecasted months (September 1995 - December 1995) is approximately 98816.41.

b. Compute a 95% prediction interval for the first forecast using ^y±1.96s where s is the standard deviation of the residuals. Compare your interval with the interval produced by R.

paste0("95% prediction interval: ", "[", round(pigs_ses$mean[1]-1.96*sd(pigs_ses$residuals),2), ", ", round(pigs_ses$mean[1]+1.96*sd(pigs_ses$residuals),2), "]")

## [1] "95% prediction interval: [78679.97, 118952.84]"

A 95% prediction interval for the first forecast is approximately 78679.97 to 118952.84. This computed interval is slightly narrower than the R-produced interval of approximately 78611.97 to 119020.80.

7.8.5

Data set books contains the daily sales of paperback and hardcover books at the same store. The task is to forecast the next four days’ sales for paperback and hardcover books.

#?books - daily sales of paperback and hardcover books at the same store

a. Plot the series and discuss the main features of the data.

autoplot(books, main="Sales of paperback and hardcover books") + xlab("Day") + ylab("Books sold")

Daily sales of paperback and hardcover books appear to trend upward over the course of the 30 days. Early in the two series, they tend to share an inverse relationship, i.e. paperback sales are up when hardcover sales are down, and vice versa. The relationship tends to flip every day through day 10, when hardcover sales are higher for roughly 6 days, and then reemerges for a few days. Following day 20, the two series fall into similar sales patterns, with hardcover sales exceeding paperback sales each day.

b. Use the ses() function to forecast each series, and plot the forecasts.

data(books)
paperback_ses <- ses(books[, 1], h=4)
autoplot(paperback_ses, main="Sales of paperback books - SES") +
  autolayer(paperback_ses$fitted, series="Fitted") +
  ylab("Books sold") + xlab("Day")

hardcover_ses <- ses(books[, 2], h=4)
autoplot(hardcover_ses, main="Sales of hardcover books - SES") +
  autolayer(hardcover_ses$fitted, series="Fitted") +
  ylab("Books sold") + xlab("Day")

c. Compute the RMSE values for the training data in each case.

paste0("Paperback SES RMSE: ", round(sqrt(mean(paperback_ses$residuals^2)), 2))

## [1] "Paperback SES RMSE: 33.64"

paste0("Hardcover SES RMSE: ", round(sqrt(mean(hardcover_ses$residuals^2)), 2))

## [1] "Hardcover SES RMSE: 31.93"

7.8.6

We will continue with the daily sales of paperback and hardcover books in data set books.

a. Apply Holt’s linear method to the paperback and hardback series and compute four-day forecasts in each case.

paperback_holt <- holt(books[, 1], h=4)
autoplot(books[, 1], main="Sales of paperback books - Holt's linear method") +
  autolayer(paperback_holt$fitted, series="Fitted") +
  autolayer(paperback_holt, series="Holt's Linear") +
  ylab("Books sold") + xlab("Day")

hardcover_holt <- holt(books[, 2], h = 4)
autoplot(books[, 2], main="Sales of hardcover books - Holt's linear method") +
  autolayer(hardcover_holt$fitted, series="Fitted") +
  autolayer(hardcover_holt, series="Holt's Linear") +
  ylab("Books sold") + xlab("Day")

b. Compare the RMSE measures of Holt’s method for the two series to those of simple exponential smoothing in the previous question. (Remember that Holt’s method is using one more parameter than SES.) Discuss the merits of the two forecasting methods for these data sets.

print("Root mean squared error (RMSE)")

## [1] "Root mean squared error (RMSE)"

paste0("Paperback SES: ", round(sqrt(mean(paperback_ses$residuals^2)), 2))

## [1] "Paperback SES: 33.64"

paste0("Paperback Holt's: ", round(sqrt(mean(paperback_holt$residuals^2)), 2))

## [1] "Paperback Holt's: 31.14"

paste0("Hardcover SES: ", round(sqrt(mean(hardcover_ses$residuals^2)), 2))

## [1] "Hardcover SES: 31.93"

paste0("Hardcover Holt's: ", round(sqrt(mean(hardcover_holt$residuals^2)), 2))

## [1] "Hardcover Holt's: 27.19"

Across paperbacks (approximately 31.14 versus 33.64) and hardcovers (approximately 27.19 versus 31.93), the Holt's linear method returns smaller RMSE values. Simple exponential smoothing (SES) is appropriate for series with no clear trend or seasonality, whereas Holt's method extends SES to account for trends. Holt's method seems more appropriate than SES for forecasting each books series given both series trend upward.

c. Compare the forecasts for the two series using both methods. Which do you think is best?

print("Point Forecasts (h = 1-4)")

## [1] "Point Forecasts (h = 1-4)"

paste0("Paperback SES: ", round(paperback_ses$mean, 2))

## [1] "Paperback SES: 207.11" "Paperback SES: 207.11" "Paperback SES: 207.11" "Paperback SES: 207.11"

paste0("Paperback Holt's: ", round(paperback_holt$mean, 2))

## [1] "Paperback Holt's: 209.47" "Paperback Holt's: 210.72" "Paperback Holt's: 211.97" "Paperback Holt's: 213.22"

paste0("Hardcover SES: ", round(hardcover_ses$mean, 2))

## [1] "Hardcover SES: 239.56" "Hardcover SES: 239.56" "Hardcover SES: 239.56" "Hardcover SES: 239.56"

paste0("Hardcover Holt's: ", round(hardcover_holt$mean, 2))

## [1] "Hardcover Holt's: 250.17" "Hardcover Holt's: 253.48" "Hardcover Holt's: 256.78" "Hardcover Holt's: 260.08"

As noted above, Holt's linear method appears better than SES for forecasting these series given the presence of upward trends in sales [and thus the resulting forecasts]. SES forecasts cannot account for trends. Further analysis is necessary regarding whether Holt's linear method is more appropriate than other non-SES methods.

d. Calculate a 95% prediction interval for the first forecast for each series, using the RMSE values and assuming normal errors. Compare your intervals with those produced using ses and holt.

paperback_ses

##    Point Forecast    Lo 80    Hi 80    Lo 95    Hi 95
## 31       207.1097 162.4882 251.7311 138.8670 275.3523
## 32       207.1097 161.8589 252.3604 137.9046 276.3147
## 33       207.1097 161.2382 252.9811 136.9554 277.2639
## 34       207.1097 160.6259 253.5935 136.0188 278.2005

paste0("Paperback SES: ", "[", round(paperback_ses$mean[1]-1.96*sqrt(mean(paperback_ses$residuals^2)),2), ", ", round(paperback_ses$mean[1]+1.96*sqrt(mean(paperback_ses$residuals^2)),2), "]")

## [1] "Paperback SES: [141.18, 273.04]"

The calculated 95% prediction interval for the first forecasted element of paperback series using SES is slightly narrower than its counterpart produced by R.

paperback_holt

##    Point Forecast    Lo 80    Hi 80    Lo 95    Hi 95
## 31       209.4668 166.6035 252.3301 143.9130 275.0205
## 32       210.7177 167.8544 253.5811 145.1640 276.2715
## 33       211.9687 169.1054 254.8320 146.4149 277.5225
## 34       213.2197 170.3564 256.0830 147.6659 278.7735

paste0("Paperback Holt's: ", "[", round(paperback_holt$mean[1]-1.96*sqrt(mean(paperback_holt$residuals^2)),2), ", ", round(paperback_holt$mean[1]+1.96*sqrt(mean(paperback_holt$residuals^2)),2), "]")

## [1] "Paperback Holt's: [148.44, 270.5]"

The calculated 95% prediction interval for the first forecasted element of the paperback series using Holt's linear method is narrower than its counterpart produced by R.

hardcover_ses

##    Point Forecast    Lo 80    Hi 80    Lo 95    Hi 95
## 31       239.5601 197.2026 281.9176 174.7799 304.3403
## 32       239.5601 194.9788 284.1414 171.3788 307.7414
## 33       239.5601 192.8607 286.2595 168.1396 310.9806
## 34       239.5601 190.8347 288.2855 165.0410 314.0792

paste0("Hardcover SES: ", "[", round(hardcover_ses$mean[1]-1.96*sqrt(mean(hardcover_ses$residuals^2)),2), ", ", round(hardcover_ses$mean[1]+1.96*sqrt(mean(hardcover_ses$residuals^2)),2), "]")

## [1] "Hardcover SES: [176.98, 302.14]"

The calculated 95% prediction interval for the first forecasted element of the hardcover series using SES is slightly narrower than its counterpart produced by R.

hardcover_holt

##    Point Forecast    Lo 80    Hi 80    Lo 95    Hi 95
## 31       250.1739 212.7390 287.6087 192.9222 307.4256
## 32       253.4765 216.0416 290.9113 196.2248 310.7282
## 33       256.7791 219.3442 294.2140 199.5274 314.0308
## 34       260.0817 222.6468 297.5166 202.8300 317.3334

paste0("Hardcover Holt's: ", "[", round(hardcover_holt$mean[1]-1.96*sqrt(mean(hardcover_holt$residuals^2)),2), ", ", round(hardcover_holt$mean[1]+1.96*sqrt(mean(hardcover_holt$residuals^2)),2), "]")

## [1] "Hardcover Holt's: [196.87, 303.47]"

The calculated 95% prediction interval for the first forecasted element of the hardcover series using Holt's linear method is narrower than its counterpart produced by R.

7.8.7

For this exercise use data set eggs, the price of a dozen eggs in the United States from 1900–1993. Experiment with the various options in the holt() function to see how much the forecasts change with damped trend, or with a Box-Cox transformation. Try to develop an intuition of what each argument is doing to the forecasts.

[Hint: use h=100 when calling holt() so you can clearly see the differences between the various options when plotting the forecasts.]

Which model gives the best RMSE?

autoplot(eggs)

#?holt()
eggs_holt <- holt(eggs, h=100)
autoplot(eggs, main="Egg prices (1 Dozen) - Holt's linear method") +
  autolayer(eggs_holt$fitted, series="Fitted") +
  autolayer(eggs_holt, series="Holt's Linear") +
  ylab("Price") + xlab("Year")

Holt's linear method, without transformation or dampening, results in forecasts that follow the series trend downward and linearly. They also become negative within roughly 20 years, which is obviously not a realistic price.

eggs_holt_boxcox <- holt(eggs, lambda = "auto", h=100)
autoplot(eggs, main="Egg prices (1 Dozen) - Box-Cox Transformation + Holt's linear method") +
  autolayer(eggs_holt_boxcox$fitted, series="Fitted") +
  autolayer(eggs_holt_boxcox, series="Box-Cox + Holt's Linear") +
  ylab("Price") + xlab("Year")

A Box-Cox transformation alongside Holt's linear method results in a set of forecasts that slopes downward approaching zero. Negative prices are not possible, though a price of approximately free is not particularly realistic either.

eggs_holt_damp <- holt(eggs, damped=TRUE, h=100)
autoplot(eggs, main="Egg prices (1 Dozen) - Damped Holt's linear method ") +
  autolayer(eggs_holt_damp$fitted, series="Fitted") +
  autolayer(eggs_holt_damp, series="Damped Holt's Linear") +
  ylab("Price") + xlab("Year")

Damped Holt's linear method results in an essentially flat forecast, which doesn't reflect the historical trend of the series.

print("Root mean squared error (RMSE)")

## [1] "Root mean squared error (RMSE)"

paste0("Holt's: ", round(sqrt(mean(eggs_holt$residuals^2)), 2))

## [1] "Holt's: 26.58"

paste0("Box-Cox + Holt's: ", round(accuracy(eggs_holt_boxcox)[2], 2))

## [1] "Box-Cox + Holt's: 26.39"

paste0("Damped Holt's: ", round(sqrt(mean(eggs_holt_damp$residuals^2)), 2))

## [1] "Damped Holt's: 26.54"

The combination of a Box-Cox transformation and Holt's linear method results in the lowest RMSE (approximately 26.39).

7.8.8

Recall your retail time series data (from Exercise 3 in Section 2.10).

a. Why is multiplicative seasonality necessary for this series?

retaildata <- read_excel("retail.xlsx", skip=1) # 381 x 19
myts <- ts(retaildata[,4], frequency=12, start=c(1982,4)) # 4th column ("A3349338X")
autoplot(myts, main="Retail - A3349338x")

Seasonality appears to shift multiple times over the course of the series, suggesting multiplicative seasonality is necessary for forecasting.

b. Apply Holt-Winters’ multiplicative method to the data. Experiment with making the trend damped.

retail_hw <- hw(myts, seasonal="multiplicative")
autoplot(myts, main="Retail - A3349338x") +
  autolayer(retail_hw$fitted, series="Fitted") +
  autolayer(retail_hw, series="Holt-Winters' Multiplicative") +
  ylab("Sales") + xlab("Year")

retail_hw_damp <- hw(myts, seasonal="multiplicative", damped=TRUE)
autoplot(myts, main="Retail - A3349338x") +
  autolayer(retail_hw_damp$fitted, series="Fitted") +
  autolayer(retail_hw_damp, series="Damped HW Multiplicative") +
  ylab("Sales") + xlab("Year")

Visually, dampening the trend does not appear to have a large effect on the forecast.

c. Compare the RMSE of the one-step forecasts from the two methods. Which do you prefer?

accuracy(retail_hw)[2]

## [1] 9.570098

accuracy(retail_hw_damp)[2]

## [1] 9.022512

Statistically, dampening the trend results in a slightly lower RMSE (approximately 9.02 versus 9.57), in which case, the dampened forecast is preferable.

d. Check that the residuals from the best method look like white noise.

checkresiduals(retail_hw_damp)

## 
##  Ljung-Box test
## 
## data:  Residuals from Damped Holt-Winters' multiplicative method
## Q* = 41.907, df = 7, p-value = 5.418e-07
## 
## Model df: 17.   Total lags used: 24

The residuals are not clearly white noise. A Ljung-Box test returns a p-value of approximately zero, meaning that there is sufficient evidence that the residuals are different from zero. The residual plots support the test results. Notably, there are relatively large residuals between around 1997 and 2002, and there is significant autocorrelation around lags of 12, 18, and 30 months.

e. Now find the test set RMSE, while training the model to the end of 2010. Can you beat the seasonal naïve approach from Exercise 8 in Section 3.7?

retail_train <- window(myts, end=c(2010,12))
retail_test <- window(myts, start=2011)

retail_train_hw_damp <- hw(retail_train, method="multiplicative", damped=TRUE)
retail_train_naive <- snaive(retail_train)
accuracy(retail_train_hw_damp, retail_test)

##                      ME     RMSE       MAE       MPE      MAPE      MASE       ACF1 Theil's U
## Training set  0.4837761 10.50095  6.863432 0.2838099  4.001396 0.3549224 0.08324464        NA
## Test set     20.1761150 24.71252 21.043296 9.6757767 10.124165 1.0881928 0.62805178  1.132464

accuracy(retail_train_naive, retail_test)

##                      ME     RMSE      MAE       MPE      MAPE     MASE      ACF1 Theil's U
## Training set   5.502402 27.74935 19.33784  3.369450 10.447161 1.000000 0.8703252        NA
## Test set     -10.845833 24.12202 18.52083 -5.910245  9.201624 0.957751 0.3564215 0.9855325

No, I cannot beat the seasonal naive approach. The test set RMSE for the Damped Holt Winters' Multiplicative model is approximately 24.71 compared with approximately 24.12 for seasonal naive.

7.8.9

For the same retail data, try an STL decomposition applied to the Box-Cox transformed series, followed by ETS on the seasonally adjusted data. How does that compare with your best previous forecasts on the test set?

retail_ets <- stlf(retail_train, method="ets", lambda="auto")
autoplot(retail_train, main="Retail - A3349338x") +
  autolayer(retail_ets$fitted, series="Fitted") +
  autolayer(retail_ets, series="STL + Box-Cox + ETS") +
  ylab("Sales") + xlab("Year")

accuracy(retail_ets, retail_test)

##                      ME      RMSE      MAE        MPE     MAPE      MASE       ACF1 Theil's U
## Training set -0.3219394  8.319367  5.18686 -0.1652031 2.896367 0.2682234 0.01537319        NA
## Test set     -8.5654693 12.459784 10.04513 -4.3478741 5.054943 0.5194549 0.25063867 0.5963045

The STL + Box-Cox + ETS test RMSE of approximately 12.46 is lower than its counterparts for Damped Holt Winters' Multiplicative (24.71) and seasonal naive (24.12).