7.1

Consider the pigs series — the number of pigs slaughtered in Victoria each month.

This time series does not seem to follow a trend or have a pronounced seasonality, so simple exponential smoothing is a reasonable model to run.

str(pigs)
##  Time-Series [1:188] from 1980 to 1996: 76378 71947 33873 96428 105084 ...
(pigs) %>% autoplot

Use the ses() function in R to find the optimal values of α and ℓ0, and generate forecasts for the next four months.

The ses() function will optimize values of α (smoothing parameter) and ℓ0 (initial component) by minimizing SSE over periods subject to the constraint that 0 ≤ α ≤ 1. Since there is assumed to be no seasonality or trend, the predictions are the same for each of the four months, ie a flat forecast. The function estimates an optimized value of alpha = 0.2971 and l = 77260. Parameter estimates are based on the minimization of the sum of squared errors of model residuals.

# apply the model 
ses1 <- ses(pigs, h=4)

# get the prections
ses1$mean
##           Sep      Oct      Nov      Dec
## 1995 98816.41 98816.41 98816.41 98816.41
# estimates of alpha and gamma
ses1$model
## Simple exponential smoothing 
## 
## Call:
##  ses(y = pigs, h = 4) 
## 
##   Smoothing parameters:
##     alpha = 0.2971 
## 
##   Initial states:
##     l = 77260.0561 
## 
##   sigma:  10308.58
## 
##      AIC     AICc      BIC 
## 4462.955 4463.086 4472.665
# visualize the model 
autoplot(pigs) + 
  autolayer(ses1)

**Compute a 95% prediction interval for the first forecast using yhat ± 1.96*s where s is the standard deviation of the residuals. Compare your interval with the interval produced by R.**

The intervals produced from fitted values, standard deviation of residuals, and assuming a normal distribution is similar but not identical to those produced by R.

# get the standard deviation
st_dev <- sd(ses1$residuals)

# compute CI 
up <- 98816.41 + 1.96 * st_dev 
low <- 98816.41 -1.96 * st_dev 


c(ses1$upper[1,], 'yhat + 1.96* s'=up)
##            80%            95% yhat + 1.96* s 
##       112027.4       119020.8       118952.8
c(ses1$lower[1,], 'yhat - 1.96* s'=low)
##            80%            95% yhat - 1.96* s 
##       85605.43       78611.97       78679.97

7.5

Data set books contains the daily sales of paperback and hardcover books at the same store. The task is to forecast the next four days’ sales for paperback and hardcover books.

str(books)
##  Time-Series [1:30, 1:2] from 1 to 30: 199 172 111 209 161 119 195 195 131 183 ...
##  - attr(*, "dimnames")=List of 2
##   ..$ : NULL
##   ..$ : chr [1:2] "Paperback" "Hardcover"

Plot the series and discuss the main features of the data.

Both types of books are increasing, but hardcover is increasing at a faster rate. It is a time series of daily values, with 30 days worth of data.

autoplot(books)

Use the ses() function to forecast each series, and plot the forecasts.

pb <- books[,"Paperback"] %>% ses(h=4)
hc <- books[,"Hardcover"] %>% ses(h=4)

autoplot(books[,"Paperback"]) + 
  autolayer(pb) 

autoplot(books[,"Hardcover"]) + 
  autolayer(hc)

Compute the RMSE values for the training data in each case.

RMSE and other measures of model accuracy can be calculated from the accuracy() function in forecast

round(accuracy(pb),2)[2]
## [1] 33.64
round(accuracy(hc),2)[2]
## [1] 31.93

7.6

Apply Holt’s linear method to the paperback and hardback series and compute four-day forecasts in each case.

Forecasts using Holt’s linear method are provided for paperback and hardcover time series are provided in tabular and graphical forms. Unlike simple exponential smoothing forecasts, Holt’s linear method forecasts are not flat, as trend is accounted for in the calcualtions.

pb_holt <- books[,"Paperback"] %>% holt(h=4)
hc_holt <- books[,"Hardcover"] %>% holt(h=4)

# paperback forecast with Holt
pb_holt
##    Point Forecast    Lo 80    Hi 80    Lo 95    Hi 95
## 31       209.4668 166.6035 252.3301 143.9130 275.0205
## 32       210.7177 167.8544 253.5811 145.1640 276.2715
## 33       211.9687 169.1054 254.8320 146.4149 277.5225
## 34       213.2197 170.3564 256.0830 147.6659 278.7735
# Hardcover forecast with Holt
hc_holt
##    Point Forecast    Lo 80    Hi 80    Lo 95    Hi 95
## 31       250.1739 212.7390 287.6087 192.9222 307.4256
## 32       253.4765 216.0416 290.9113 196.2248 310.7282
## 33       256.7791 219.3442 294.2140 199.5274 314.0308
## 34       260.0817 222.6468 297.5166 202.8300 317.3334
# plot Holts linear method predictions
autoplot(books[,"Paperback"]) + 
  autolayer(pb_holt) 

autoplot(books[,"Hardcover"]) + 
  autolayer(hc_holt) 

Compare the RMSE measures of Holt’s method for the two series to those of simple exponential smoothing in the previous question. (Remember that Holt’s method is using one more parameter than SES.) Discuss the merits of the two forecasting methods for these data sets.

Not surprisingly, RMSE shows improvement using the Holt trend method compared to simple exponential smoothing. The time series show a clear trend, and since Holt’s method has an additional parameter that accounts for a trend, better fitted values (and likely better predictions) were made.

# paperback RMSE of simple exponential smoothing
round(accuracy(pb),2)[2]
## [1] 33.64
# hardcover RMSE of simple exponential smoothing
round(accuracy(hc),2)[2]
## [1] 31.93
# paperback RMSE of Holt's linear method
round(accuracy(pb_holt),2)[2]
## [1] 31.14
# hardcover RMSE of Holt's linear method
round(accuracy(hc_holt),2)[2]
## [1] 27.19

Compare the forecasts for the two series using both methods. Which do you think is best?

Comparing the two forecasts we can see that Holt’s method gives predictions with higher values due to the introduction of a trend parameter. The question of which forecast is better should be determined by an evaluation of the trend itself. If the increasing trend is true, then Holt’s method will give better predictions. If the trend is not real, of based on expiring conditions, then Holt’s method may give overly optimistic predictions.

# paperback forecast: simple exponential smoothing
pb
##    Point Forecast    Lo 80    Hi 80    Lo 95    Hi 95
## 31       207.1097 162.4882 251.7311 138.8670 275.3523
## 32       207.1097 161.8589 252.3604 137.9046 276.3147
## 33       207.1097 161.2382 252.9811 136.9554 277.2639
## 34       207.1097 160.6259 253.5935 136.0188 278.2005
# paperback forecast: Holt's method
pb_holt
##    Point Forecast    Lo 80    Hi 80    Lo 95    Hi 95
## 31       209.4668 166.6035 252.3301 143.9130 275.0205
## 32       210.7177 167.8544 253.5811 145.1640 276.2715
## 33       211.9687 169.1054 254.8320 146.4149 277.5225
## 34       213.2197 170.3564 256.0830 147.6659 278.7735
# hardcover forecast: simple exponential smoothing
hc
##    Point Forecast    Lo 80    Hi 80    Lo 95    Hi 95
## 31       239.5601 197.2026 281.9176 174.7799 304.3403
## 32       239.5601 194.9788 284.1414 171.3788 307.7414
## 33       239.5601 192.8607 286.2595 168.1396 310.9806
## 34       239.5601 190.8347 288.2855 165.0410 314.0792
# hardcover forecast: Holt's method
hc_holt
##    Point Forecast    Lo 80    Hi 80    Lo 95    Hi 95
## 31       250.1739 212.7390 287.6087 192.9222 307.4256
## 32       253.4765 216.0416 290.9113 196.2248 310.7282
## 33       256.7791 219.3442 294.2140 199.5274 314.0308
## 34       260.0817 222.6468 297.5166 202.8300 317.3334

Calculate a 95% prediction interval for the first forecast for each series, using the RMSE values and assuming normal errors. Compare your intervals with those produced using ses and holt.

As previously discussed, calculating confidence intervals from error terms assuming a normal distribution will give similar, but not equivalent results to those produced in R.

### hardcover
# hardcover CI calculations - Holt
up <- round(accuracy(hc_holt),2)[2] *1.96 + 250.1739
low <- 250.1739 - round(accuracy(hc_holt),2)[2] *1.96

# upper bounds of Holt's method produced by R
hc_holt$upper
## Time Series:
## Start = 31 
## End = 34 
## Frequency = 1 
##         80%      95%
## 31 287.6087 307.4256
## 32 290.9113 310.7282
## 33 294.2140 314.0308
## 34 297.5166 317.3334
# lower bounds of Holt's method produced by R
hc_holt$lower
## Time Series:
## Start = 31 
## End = 34 
## Frequency = 1 
##         80%      95%
## 31 212.7390 192.9222
## 32 216.0416 196.2248
## 33 219.3442 199.5274
## 34 222.6468 202.8300
# hardcover CI calculations - ses
up <- round(accuracy(hc),2)[2] *1.96 + 250.1739
low <- 250.1739 - round(accuracy(hc),2)[2] *1.96

# upper bounds of ses method produced by R
hc$upper
## Time Series:
## Start = 31 
## End = 34 
## Frequency = 1 
##         80%      95%
## 31 281.9176 304.3403
## 32 284.1414 307.7414
## 33 286.2595 310.9806
## 34 288.2855 314.0792
# lower bounds of ses method produced by R
hc$lower
## Time Series:
## Start = 31 
## End = 34 
## Frequency = 1 
##         80%      95%
## 31 197.2026 174.7799
## 32 194.9788 171.3788
## 33 192.8607 168.1396
## 34 190.8347 165.0410
### paperback 
# paperback CI calculations - Holt
up <- round(accuracy(hc_holt),2)[2] *1.96 + 250.1739
low <- 250.1739 - round(accuracy(hc_holt),2)[2] *1.96

# upper bounds of Holt's method produced by R
hc_holt$upper
## Time Series:
## Start = 31 
## End = 34 
## Frequency = 1 
##         80%      95%
## 31 287.6087 307.4256
## 32 290.9113 310.7282
## 33 294.2140 314.0308
## 34 297.5166 317.3334
# lower bounds of Holt's method produced by R
hc_holt$lower
## Time Series:
## Start = 31 
## End = 34 
## Frequency = 1 
##         80%      95%
## 31 212.7390 192.9222
## 32 216.0416 196.2248
## 33 219.3442 199.5274
## 34 222.6468 202.8300
# paperback CI calculations - ses
up <- round(accuracy(hc),2)[2] *1.96 + 250.1739
low <- 250.1739 - round(accuracy(hc),2)[2] *1.96

# upper bounds of ses method produced by R
hc$upper
## Time Series:
## Start = 31 
## End = 34 
## Frequency = 1 
##         80%      95%
## 31 281.9176 304.3403
## 32 284.1414 307.7414
## 33 286.2595 310.9806
## 34 288.2855 314.0792
# lower bounds of ses method produced by R
hc$lower
## Time Series:
## Start = 31 
## End = 34 
## Frequency = 1 
##         80%      95%
## 31 197.2026 174.7799
## 32 194.9788 171.3788
## 33 192.8607 168.1396
## 34 190.8347 165.0410

7.7

For this exercise use data set eggs, the price of a dozen eggs in the United States from 1900–1993. Experiment with the various options in the holt() function to see how much the forecasts change with damped trend, or with a Box-Cox transformation. Try to develop an intuition of what each argument is doing to the forecasts. [Hint: use h=100 when calling holt() so you can clearly see the differences between the various options when plotting the forecasts.]

The tuning parameters available in the R exponential smoothing functions can alter the trajectory of models as seen below. Logical consistency needs to be considered - obviously the predicted price of eggs should not fall below 0.

# get structure of time series 
str(eggs)
##  Time-Series [1:94] from 1900 to 1993: 277 315 315 321 315 ...
eggs
## Time Series:
## Start = 1900 
## End = 1993 
## Frequency = 1 
##  [1] 276.79 315.42 314.87 321.25 314.54 317.92 303.39 288.62 292.44 320.92
## [11] 323.38 270.77 301.77 282.99 295.06 276.47 292.80 358.78 345.82 345.42
## [21] 314.10 228.74 215.76 224.67 225.93 250.87 236.24 209.12 237.31 251.67
## [31] 205.36 167.21 150.42 154.09 183.67 246.66 227.58 214.60 208.41 181.21
## [41] 185.67 230.86 266.34 310.29 267.18 303.03 278.46 293.36 283.62 274.26
## [51] 218.12 265.62 226.70 258.00 196.98 213.38 209.17 184.50 192.61 155.83
## [61] 176.32 172.13 161.63 163.00 157.63 154.50 174.28 135.17 141.36 157.83
## [71] 145.65 112.06 106.85 170.91 156.26 140.78 148.35 132.61 115.72 116.07
## [81]  98.76 100.33  89.12  88.67 100.58  76.84  81.10  69.60  64.55  80.36
## [91]  79.79  74.79  64.86  62.27
# probably not a great model, prediction of negative values
mod1 <- eggs %>% holt(h=100)
autoplot(eggs) + 
  autolayer(mod1)

# better, the trend has less influence throughout the duraction of the prediction
mod2 <- eggs %>% holt(h=100, damped = T)
autoplot(eggs) + 
  autolayer(mod2)

# An exponential mode forces the price to remain above 0
mod3 <- eggs %>% holt(h=100, exponential = T)
autoplot(eggs) + 
  autolayer(mod3)

# BoxCox
mod4 <- eggs %>% holt(h=100, lambda = "auto")
autoplot(eggs) + 
  autolayer(mod4)

Which model gives the best RMSE?

In terms of RMSE, the BoxCox transformed Holt model gives the best RMSE.

# holt, no additional parameters
round(accuracy(mod1),2)[2]
## [1] 26.58
# damped Holt
round(accuracy(mod2),2)[2]
## [1] 26.54
# Exponential Holt
round(accuracy(mod3),2)[2]
## [1] 26.5
# Box Cox Holt
round(accuracy(mod4),2)[2]
## [1] 26.39

7.8

Recall your retail time series data (from Exercise 3 in Section 2.10).

#You can read the data into R with the following script:
retaildata <- readxl::read_excel("retail.xlsx", skip=1)

#The second argument (skip=1) is required because the Excel sheet has two header rows.

#Select one of the time series as follows (but replace the column name with your own chosen column):
myts <- ts(retaildata[,"A3349903C"],
  frequency=12, start=c(1982,4))

str(myts)
##  Time-Series [1:381, 1] from 1982 to 2014: 6.6 7 6 6.4 6 6.4 7.4 8 11.7 6.4 ...
##  - attr(*, "dimnames")=List of 2
##   ..$ : NULL
##   ..$ : chr "A3349903C"

Why is multiplicative seasonality necessary for this series?

The seasonal component of the time series increases with the level.

# plot the time series
autoplot(myts)

Apply Holt-Winters’ multiplicative method to the data. Experiment with making the trend damped.

# Holt-Winters: multiplicative no damping
hw1 <- myts %>% holt(h=100, seasonal="multiplicative")
autoplot(myts) + 
  autolayer(hw1)

# Holt-Winters: multiplicative with  damping
hw2 <- myts %>% holt(h=100, damped=T, seasonal="multiplicative")
autoplot(myts) + 
  autolayer(hw2)

Compare the RMSE of the one-step forecasts from the two methods. Which do you prefer?

The RMSE for the two models are very similar. The damped model seems more appropriate because it is not assuming an increasing linear trend.

round(accuracy(hw1),2)[2]
## [1] 4.25
round(accuracy(hw2),2)[2]
## [1] 4.27

Check that the residuals from the best method look like white noise.

hw2$residuals %>% autoplot()

Now find the test set RMSE, while training the model to the end of 2010. Can you beat the seasonal naïve approach from Exercise 8 in Section 3.7?

Holt-Winters’ easily outperforms the seasonal naive approach.

# split into train and test sets 
myts.train <- window(myts, end=c(2010,12))
myts.test <- window(myts, start=2011)

# run and test seasonal naive model
fc_naive <- snaive(myts.train)
accuracy(fc_naive,myts.test)[2]
## [1] 8.1
# run and test Holt-Winters` model
fc_hw <- myts.train %>% holt(h=10, damped=T, seasonal="multiplicative")
accuracy(fc_hw,myts.test)[2]
## [1] 3.145523

7.9

For the same retail data, try an STL decomposition applied to the Box-Cox transformed series, followed by ETS on the seasonally adjusted data. How does that compare with your best previous forecasts on the test set?

The ETS method, in conjunction with a Box-Cox transformation and STL decomposition, produced the best results of all.

# Box Cox transform the data 
lambda <- BoxCox.lambda(myts)
ts <- BoxCox(myts, lambda)
ts %>% autoplot()

# Perform STL decomposition
stl <- ts[,1] %>%
  stl(t.window=13, s.window="periodic") 
stl %>% autoplot

# get seasonally adjusted component
seasadj <- stl %>% seasadj(stl)
seasadj %>% autoplot

# Make the same test/train split as before
myts.train <- window(seasadj, end=c(2010,12))
myts.test <- window(seasadj, start=2011)

# fit ETS on the seasonally adjusted component and get summary
fit_ets <- ets(myts.train)
summary(fit_ets)
## ETS(A,A,N) 
## 
## Call:
##  ets(y = myts.train) 
## 
##   Smoothing parameters:
##     alpha = 0.4127 
##     beta  = 1e-04 
## 
##   Initial states:
##     l = 1.7351 
##     b = 0.0036 
## 
##   sigma:  0.0409
## 
##       AIC      AICc       BIC 
## -182.9640 -182.7870 -163.7463 
## 
## Training set error measures:
##                        ME       RMSE        MAE          MPE     MAPE      MASE
## Training set 0.0002650577 0.04070395 0.03159579 -0.001351856 1.334075 0.4747122
##                    ACF1
## Training set 0.02768323
# make a forecast
fc_ets <- forecast(fit_ets, h=length(myts.test))

# get the RMSE
accuracy(fc_ets, myts.test)[2]
## [1] 0.07027256