Exercise 7.1

Consider the pigs series — the number of pigs slaughtered in Victoria each month.

a. Use the ses() function in R to find the optimal values of \(\alpha\) and \(\ell_0\), and generate forecasts for the next four months.

##         Mar    Apr    May    Jun    Jul    Aug
## 1995 106723  84307 114896 106749  87892 100506
## 
## Forecast method: Simple exponential smoothing
## 
## Model Information:
## Simple exponential smoothing 
## 
## Call:
##  ses(y = pigs, h = 4) 
## 
##   Smoothing parameters:
##     alpha = 0.2971 
## 
##   Initial states:
##     l = 77260.0561 
## 
##   sigma:  10308.58
## 
##      AIC     AICc      BIC 
## 4462.955 4463.086 4472.665 
## 
## Error measures:
##                    ME    RMSE      MAE       MPE     MAPE      MASE       ACF1
## Training set 385.8721 10253.6 7961.383 -0.922652 9.274016 0.7966249 0.01282239
## 
## Forecasts:
##          Point Forecast    Lo 80    Hi 80    Lo 95    Hi 95
## Sep 1995       98816.41 85605.43 112027.4 78611.97 119020.8
## Oct 1995       98816.41 85034.52 112598.3 77738.83 119894.0
## Nov 1995       98816.41 84486.34 113146.5 76900.46 120732.4
## Dec 1995       98816.41 83958.37 113674.4 76092.99 121539.8

b. Compute a 95% prediction interval for the first forecast using \(\hat{y} \pm 1.96s\) where \(s\) is the standard deviation of the residuals. Compare your interval with the interval produced by R.

## [1] 78679.97
## [1] 118952.8

The manually calculated interval (78679.97 to 118952.8) is a little narrower than the one produced by R (78611.97 to 119020.8). This is because R bases its interval on the model’s estimated \(\sigma\) of 10308.58, which is slightly larger than the standard deviation of the residuals.
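For reference, a minimal sketch of the manual calculation, assuming the fpp2 package is loaded so that the pigs series and ses() are available:

```r
library(fpp2)  # assumed: provides the pigs series and ses()

fc <- ses(pigs, h = 4)
s  <- sd(residuals(fc))  # standard deviation of the residuals

# Manual 95% interval for the first forecast: y-hat +/- 1.96s
fc$mean[1] + c(-1.96, 1.96) * s

# R's own 95% interval for comparison
c(fc$lower[1, "95%"], fc$upper[1, "95%"])
```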

Exercise 7.5

Data set books contains the daily sales of paperback and hardcover books at the same store. The task is to forecast the next four days’ sales for paperback and hardcover books.

a. Plot the series and discuss the main features of the data.

## Time Series:
## Start = 1 
## End = 6 
## Frequency = 1 
##   Paperback Hardcover
## 1       199       139
## 2       172       128
## 3       111       172
## 4       209       139
## 5       161       191
## 6       119       168
## Time Series:
## Start = 25 
## End = 30 
## Frequency = 1 
##    Paperback Hardcover
## 25       190       214
## 26       182       200
## 27       222       201
## 28       217       283
## 29       188       220
## 30       247       259

We only have 30 days’ worth of data, so it’s hard to know whether there is any annual or monthly seasonality, and any weekly seasonality is difficult to see over such a short span. There does seem to be a pattern in the ACF plots, though, with almost all of the lags being positive.

Let’s convert the series to a weekly frequency and try plotting and decomposing them again.
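A sketch of one way to do this, re-declaring each series with frequency 7 (the variable names are illustrative):

```r
# Re-declare each column of books as a weekly series so that seasonal
# plots and a classical decomposition become possible.
paperback_weekly <- ts(books[, "Paperback"], frequency = 7)
hardcover_weekly <- ts(books[, "Hardcover"], frequency = 7)

autoplot(decompose(paperback_weekly))
autoplot(decompose(hardcover_weekly))
ggseasonplot(paperback_weekly, polar = TRUE)
ggseasonplot(hardcover_weekly, polar = TRUE)
```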

The decompositions make it look like there is weekly seasonality in both paperback and hardcover book sales, but the polar seasonal plots make that hard to believe.

b. Use the ses() function to forecast each series, and plot the forecasts.
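The presumed calls behind the output below, splitting the two columns of books into separate series:

```r
paperback <- books[, "Paperback"]
hardcover <- books[, "Hardcover"]

fc_pb <- ses(paperback, h = 4)
fc_hc <- ses(hardcover, h = 4)

# Plot both series with their four-day forecasts overlaid
autoplot(books) +
  autolayer(fc_pb, series = "Paperback", PI = FALSE) +
  autolayer(fc_hc, series = "Hardcover", PI = FALSE)
```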

## 
## Forecast method: Simple exponential smoothing
## 
## Model Information:
## Simple exponential smoothing 
## 
## Call:
##  ses(y = paperback, h = 4) 
## 
##   Smoothing parameters:
##     alpha = 0.1685 
## 
##   Initial states:
##     l = 170.8271 
## 
##   sigma:  34.8183
## 
##      AIC     AICc      BIC 
## 318.9747 319.8978 323.1783 
## 
## Error measures:
##                    ME     RMSE     MAE       MPE     MAPE      MASE       ACF1
## Training set 7.175981 33.63769 27.8431 0.4736071 15.57784 0.7632792 -0.2117522
## 
## Forecasts:
##          Point Forecast    Lo 80    Hi 80    Lo 95    Hi 95
## 5.285714       207.1097 162.4882 251.7311 138.8670 275.3523
## 5.428571       207.1097 161.8589 252.3604 137.9046 276.3147
## 5.571429       207.1097 161.2382 252.9811 136.9554 277.2639
## 5.714286       207.1097 160.6259 253.5935 136.0188 278.2005

## 
## Forecast method: Simple exponential smoothing
## 
## Model Information:
## Simple exponential smoothing 
## 
## Call:
##  ses(y = hardcover, h = 4) 
## 
##   Smoothing parameters:
##     alpha = 0.3283 
## 
##   Initial states:
##     l = 149.2861 
## 
##   sigma:  33.0517
## 
##      AIC     AICc      BIC 
## 315.8506 316.7737 320.0542 
## 
## Error measures:
##                    ME     RMSE      MAE      MPE     MAPE      MASE       ACF1
## Training set 9.166735 31.93101 26.77319 2.636189 13.39487 0.6997539 -0.1417763
## 
## Forecasts:
##          Point Forecast    Lo 80    Hi 80    Lo 95    Hi 95
## 5.285714       239.5601 197.2026 281.9176 174.7799 304.3403
## 5.428571       239.5601 194.9788 284.1414 171.3788 307.7414
## 5.571429       239.5601 192.8607 286.2595 168.1396 310.9806
## 5.714286       239.5601 190.8347 288.2855 165.0410 314.0792

c. Compute the RMSE values for the training data in each case.

|  | ME | RMSE | MAE | MPE | MAPE | MASE | ACF1 |
|---|---|---|---|---|---|---|---|
| Paperback (training) | 7.18 | 33.64 | 27.84 | 0.47 | 15.58 | 0.76 | -0.21 |
| Hardcover (training) | 9.17 | 31.93 | 26.77 | 2.64 | 13.39 | 0.70 | -0.14 |
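The tables above come from the accuracy() function; a sketch, reusing the fc_pb and fc_hc objects from part b:

```r
accuracy(fc_pb)[, "RMSE"]  # paperback training RMSE (33.64)
accuracy(fc_hc)[, "RMSE"]  # hardcover training RMSE (31.93)
```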

Exercise 7.6

We will continue with the daily sales of paperback and hardcover books in data set books.

a. Apply Holt’s linear method to the paperback and hardback series and compute four-day forecasts in each case.
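The presumed calls, matching the Call: lines in the output below:

```r
fc_pb_holt <- holt(paperback, h = 4)
fc_hc_holt <- holt(hardcover, h = 4)
```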

## 
## Forecast method: Holt's method
## 
## Model Information:
## Holt's method 
## 
## Call:
##  holt(y = paperback, h = 4) 
## 
##   Smoothing parameters:
##     alpha = 1e-04 
##     beta  = 1e-04 
## 
##   Initial states:
##     l = 170.699 
##     b = 1.2621 
## 
##   sigma:  33.4464
## 
##      AIC     AICc      BIC 
## 318.3396 320.8396 325.3456 
## 
## Error measures:
##                     ME     RMSE      MAE       MPE     MAPE      MASE
## Training set -3.717178 31.13692 26.18083 -5.508526 15.58354 0.7177104
##                    ACF1
## Training set -0.1750792
## 
## Forecasts:
##          Point Forecast    Lo 80    Hi 80    Lo 95    Hi 95
## 5.285714       209.4668 166.6035 252.3301 143.9130 275.0205
## 5.428571       210.7177 167.8544 253.5811 145.1640 276.2715
## 5.571429       211.9687 169.1054 254.8320 146.4149 277.5225
## 5.714286       213.2197 170.3564 256.0830 147.6659 278.7735
## 
## Forecast method: Holt's method
## 
## Model Information:
## Holt's method 
## 
## Call:
##  holt(y = hardcover, h = 4) 
## 
##   Smoothing parameters:
##     alpha = 1e-04 
##     beta  = 1e-04 
## 
##   Initial states:
##     l = 147.7935 
##     b = 3.303 
## 
##   sigma:  29.2106
## 
##      AIC     AICc      BIC 
## 310.2148 312.7148 317.2208 
## 
## Error measures:
##                      ME     RMSE      MAE       MPE    MAPE      MASE
## Training set -0.1357882 27.19358 23.15557 -2.114792 12.1626 0.6052024
##                     ACF1
## Training set -0.03245186
## 
## Forecasts:
##          Point Forecast    Lo 80    Hi 80    Lo 95    Hi 95
## 5.285714       250.1739 212.7390 287.6087 192.9222 307.4256
## 5.428571       253.4765 216.0416 290.9113 196.2248 310.7282
## 5.571429       256.7791 219.3442 294.2140 199.5274 314.0308
## 5.714286       260.0817 222.6468 297.5166 202.8300 317.3334

b. Compare the RMSE measures of Holt’s method for the two series to those of simple exponential smoothing in the previous question. (Remember that Holt’s method is using one more parameter than SES.) Discuss the merits of the two forecasting methods for these data sets.

|  | ME | RMSE | MAE | MPE | MAPE | MASE | ACF1 |
|---|---|---|---|---|---|---|---|
| Paperback (training) | -3.72 | 31.14 | 26.18 | -5.51 | 15.58 | 0.72 | -0.18 |
| Hardcover (training) | -0.14 | 27.19 | 23.16 | -2.11 | 12.16 | 0.61 | -0.03 |

The RMSE for the paperback series is 33.64 with simple exponential smoothing versus 31.14 with Holt’s method, a reduction of 2.50. For the hardcover series it is 31.93 with SES versus 27.19 with Holt’s method, a reduction of 4.74. Holt’s method is the better predictor for the books series, since both paperback and hardcover sales show an upward trend that SES cannot capture.

d. Calculate a 95% prediction interval for the first forecast for each series, using the RMSE values and assuming normal errors. Compare your intervals with those produced using ses and holt.

Differences between R’s interval bounds and the manually calculated ones:

|  | ses_pb_conf_int_r | ses_hc_conf_int_r | holt_pb_conf_int_r | holt_hc_conf_int_r |
|---|---|---|---|---|
| lower | -2.312775 | -2.195432 | -4.525411 | -3.952289 |
| upper | 2.312775 | 2.195432 | 4.525411 | 3.952289 |

Once again, the manually calculated prediction intervals are slightly narrower than the ones produced by the ses() and holt() functions in R.
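A sketch of the manual calculation for one of the four cases (the other three are analogous), reusing fc_pb from Exercise 7.5:

```r
s_pb <- accuracy(fc_pb)[, "RMSE"]  # training RMSE as the error sd

# Manual 95% interval for the first paperback forecast
manual <- fc_pb$mean[1] + c(-1.96, 1.96) * s_pb

# Differences between R's bounds and the manual ones; a negative lower
# and a positive upper difference mean R's interval is wider.
c(fc_pb$lower[1, "95%"], fc_pb$upper[1, "95%"]) - manual
```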

Exercise 7.7

For this exercise use data set eggs, the price of a dozen eggs in the United States from 1900–1993. Experiment with the various options in the holt() function to see how much the forecasts change with damped trend, or with a Box-Cox transformation. Try to develop an intuition of what each argument is doing to the forecasts.

[Hint: use h=100 when calling holt() so you can clearly see the differences between the various options when plotting the forecasts.]

Which model gives the best RMSE?
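A plausible reconstruction of the four models compared below (the calls match the Call: lines in the output):

```r
fc1 <- holt(eggs, h = 100)                            # linear trend
fc2 <- holt(eggs, h = 100, damped = TRUE, phi = 0.9)  # damped trend
fc3 <- holt(eggs, h = 100, lambda = "auto")           # Box-Cox transform
fc4 <- holt(eggs, h = 100, damped = TRUE, phi = 0.9,
            lambda = "auto")                          # damped + Box-Cox

# Compare training RMSEs across the four variants
sapply(list(fc1, fc2, fc3, fc4), function(fc) accuracy(fc)[, "RMSE"])

# Overlay the four point forecasts on the series
autoplot(eggs) +
  autolayer(fc1, series = "Holt", PI = FALSE) +
  autolayer(fc2, series = "Damped", PI = FALSE) +
  autolayer(fc3, series = "Box-Cox", PI = FALSE) +
  autolayer(fc4, series = "Damped + Box-Cox", PI = FALSE)
```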

## Holt's method 
## 
## Call:
##  holt(y = eggs, h = 100) 
## 
##   Smoothing parameters:
##     alpha = 0.8124 
##     beta  = 1e-04 
## 
##   Initial states:
##     l = 314.7232 
##     b = -2.7222 
## 
##   sigma:  27.1665
## 
##      AIC     AICc      BIC 
## 1053.755 1054.437 1066.472

|  | ME | RMSE | MAE | MPE | MAPE | MASE | ACF1 |
|---|---|---|---|---|---|---|---|
| Training set | 0.0449909 | 26.58219 | 19.18491 | -1.142201 | 9.653791 | 0.9463626 | 0.013482 |

## Damped Holt's method 
## 
## Call:
##  holt(y = eggs, h = 100, damped = TRUE, phi = 0.9) 
## 
##   Smoothing parameters:
##     alpha = 0.8464 
##     beta  = 1e-04 
##     phi   = 0.9 
## 
##   Initial states:
##     l = 297.4547 
##     b = -3.1897 
## 
##   sigma:  27.3608
## 
##      AIC     AICc      BIC 
## 1054.045 1054.727 1066.761

|  | ME | RMSE | MAE | MPE | MAPE | MASE | ACF1 |
|---|---|---|---|---|---|---|---|
| Training set | -2.584906 | 26.62317 | 19.53231 | -2.832104 | 10.10674 | 0.9634993 | -0.0050198 |

## Holt's method 
## 
## Call:
##  holt(y = eggs, h = 100, lambda = "auto") 
## 
##   Box-Cox transformation: lambda= 0.3956 
## 
##   Smoothing parameters:
##     alpha = 0.809 
##     beta  = 1e-04 
## 
##   Initial states:
##     l = 21.0322 
##     b = -0.1144 
## 
##   sigma:  1.0549
## 
##      AIC     AICc      BIC 
## 443.0310 443.7128 455.7475

|  | ME | RMSE | MAE | MPE | MAPE | MASE | ACF1 |
|---|---|---|---|---|---|---|---|
| Training set | 0.7736844 | 26.39376 | 18.96387 | -1.072416 | 9.620095 | 0.9354593 | 0.0388715 |

## Damped Holt's method 
## 
## Call:
##  holt(y = eggs, h = 100, damped = TRUE, phi = 0.9, lambda = "auto") 
## 
##   Box-Cox transformation: lambda= 0.3956 
## 
##   Smoothing parameters:
##     alpha = 0.8468 
##     beta  = 1e-04 
##     phi   = 0.9 
## 
##   Initial states:
##     l = 20.873 
##     b = 0.125 
## 
##   sigma:  1.0694
## 
##      AIC     AICc      BIC 
## 444.5430 445.2248 457.2595

|  | ME | RMSE | MAE | MPE | MAPE | MASE | ACF1 |
|---|---|---|---|---|---|---|---|
| Training set | -2.979642 | 26.54716 | 19.2466 | -2.932978 | 10.00465 | 0.9494056 | -0.0019255 |

| fc1_RMSE | fc2_RMSE | fc3_RMSE | fc4_RMSE |
|---|---|---|---|
| 26.58219 | 26.62317 | 26.39376 | 26.54716 |

The Holt model with a Box-Cox transformation and no damping seems to produce the best forecasts, and it also has the best (lowest) RMSE. The first model, with neither damping nor a transformation, eventually forecasts negative prices, which is impossible. The damped but untransformed forecast levels out at the current price and stays there indefinitely. The Box-Cox-transformed forecast slopes downward at first but slowly flattens so that it levels out at about zero, and it also has the narrowest prediction intervals. The Box-Cox-transformed and damped forecast has prediction intervals that almost never go negative, so in that respect it may be the best model, but its point forecast levels out at the current price without change and its upper prediction limit is extremely wide.

Exercise 7.8

Recall your retail time series data (from Exercise 3 in Section 2.10).

a. Why is multiplicative seasonality necessary for this series?

Multiplicative seasonality is needed because, as the plot above shows, the seasonal variation increases with the level of the series: as the trend rises, so does the size of the seasonal swings.

b. Apply Holt-Winters’ multiplicative method to the data. Experiment with making the trend damped.
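The presumed calls (the object names fc1_retail and fc2_retail reappear in the RMSE comparison in part c); retail is assumed to be the ts object built from the retail data in the earlier exercise:

```r
fc1_retail <- hw(retail, h = 100, seasonal = "multiplicative")
fc2_retail <- hw(retail, h = 100, seasonal = "multiplicative",
                 damped = TRUE, phi = 0.98)
```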

## Holt-Winters' multiplicative method 
## 
## Call:
##  hw(y = retail, h = 100, seasonal = "multiplicative") 
## 
##   Smoothing parameters:
##     alpha = 0.3253 
##     beta  = 0.0129 
##     gamma = 0.0255 
## 
##   Initial states:
##     l = 304.256 
##     b = 2.1149 
##     s = 1.0213 0.9379 1.0298 1.1195 1.015 1.0226
##            0.9666 1.0035 0.9715 0.9468 0.9983 0.9673
## 
##   sigma:  0.0275
## 
##      AIC     AICc      BIC 
## 4769.995 4771.681 4837.022

|  | ME | RMSE | MAE | MPE | MAPE | MASE | ACF1 |
|---|---|---|---|---|---|---|---|
| Training set | 0.9212824 | 25.20381 | 18.77683 | 0.0685623 | 1.979316 | 0.3016982 | -0.1217931 |

## Damped Holt-Winters' multiplicative method 
## 
## Call:
##  hw(y = retail, h = 100, seasonal = "multiplicative", damped = TRUE, 
##      phi = 0.98) 
## 
##   Smoothing parameters:
##     alpha = 0.271 
##     beta  = 0.0452 
##     gamma = 0.0117 
##     phi   = 0.98 
## 
##   Initial states:
##     l = 304.0294 
##     b = 2.7058 
##     s = 1.0195 0.9319 1.0281 1.1258 1.0214 1.0186
##            0.9747 0.9976 0.9757 0.945 0.9897 0.9717
## 
##   sigma:  0.0276
## 
##      AIC     AICc      BIC 
## 4771.963 4773.649 4838.991

|  | ME | RMSE | MAE | MPE | MAPE | MASE | ACF1 |
|---|---|---|---|---|---|---|---|
| Training set | 2.351492 | 25.28197 | 18.70815 | 0.1821815 | 1.960331 | 0.3005947 | -0.0360754 |

The RMSE is slightly lower for the first, undamped Holt-Winters model. Its plot also looks like a more reasonable forecast: there is no reason to think that sales will level off the way they do in the damped model.

c. Compare the RMSE of the one-step forecasts from the two methods. Which do you prefer?

| fc1_retail_RMSE | fc2_retail_RMSE |
|---|---|
| 25.20381 | 25.28197 |

The RMSE is very slightly lower for the first model and the plot looks like a more reasonable forecast, so I prefer the first undamped model.

d. Check that the residuals from the best method look like white noise.
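The residual plots and the Ljung-Box test below come from checkresiduals() applied to the undamped model:

```r
checkresiduals(fc1_retail)
```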

## 
##  Ljung-Box test
## 
## data:  Residuals from Holt-Winters' multiplicative method
## Q* = 250.64, df = 8, p-value < 2.2e-16
## 
## Model df: 16.   Total lags used: 24

The residual plots for the best Holt-Winters model mostly look like white noise, although there does seem to be a reduction in variance over time, and the Ljung-Box test strongly rejects the white-noise hypothesis (p-value < 2.2e-16), so there is some remaining pattern in the residuals that might be a problem.

Just out of curiosity, let’s see what a Box-Cox transformation and additive seasonality do to our model.

## Holt-Winters' additive method 
## 
## Call:
##  hw(y = retail, h = 100, seasonal = "additive", lambda = "auto") 
## 
##   Box-Cox transformation: lambda= 0.1939 
## 
##   Smoothing parameters:
##     alpha = 0.2674 
##     beta  = 1e-04 
##     gamma = 0.1601 
## 
##   Initial states:
##     l = 10.5163 
##     b = 0.0189 
##     s = 0.1257 -0.1583 -0.0427 0.5178 0.0588 0.0416
##            -0.1294 -0.0101 -0.0075 -0.1289 -0.0926 -0.1744
## 
##   sigma:  0.0925
## 
##      AIC     AICc      BIC 
## 467.9192 469.6051 534.9467

|  | ME | RMSE | MAE | MPE | MAPE | MASE | ACF1 |
|---|---|---|---|---|---|---|---|
| Training set | -0.3742618 | 25.96042 | 19.22513 | 0.0274891 | 1.918142 | 0.3089013 | -0.0138325 |

## Damped Holt-Winters' additive method 
## 
## Call:
##  hw(y = retail, h = 100, seasonal = "additive", damped = TRUE, 
##      phi = 0.98, lambda = "auto") 
## 
##   Box-Cox transformation: lambda= 0.1939 
## 
##   Smoothing parameters:
##     alpha = 0.3182 
##     beta  = 0.0222 
##     gamma = 0.1757 
##     phi   = 0.98 
## 
##   Initial states:
##     l = 10.5024 
##     b = 0.0328 
##     s = 0.0096 -0.1515 -0.068 0.5575 0.0821 0.0351
##            -0.0987 -0.0165 -0.0248 -0.1217 -0.0275 -0.1756
## 
##   sigma:  0.095
## 
##      AIC     AICc      BIC 
## 487.1551 488.8410 554.1826

|  | ME | RMSE | MAE | MPE | MAPE | MASE | ACF1 |
|---|---|---|---|---|---|---|---|
| Training set | 3.493937 | 26.69833 | 19.76242 | 0.2767205 | 1.96255 | 0.3175342 | -0.0550526 |

| fc3_retail_RMSE | fc4_retail_RMSE |
|---|---|
| 25.96042 | 26.69833 |

## 
##  Ljung-Box test
## 
## data:  Residuals from Holt-Winters' additive method
## Q* = 270.76, df = 8, p-value < 2.2e-16
## 
## Model df: 16.   Total lags used: 24

The plots look better, but the RMSE is slightly higher for both of these models than for either of the first two. The residual plot also looks more like white noise, with no discernible pattern or decrease in variability over time, although the Ljung-Box test still rejects the white-noise hypothesis. So even though the RMSE is slightly higher, I would choose the undamped Holt-Winters model with additive seasonality and a Box-Cox transformation.

e. Now find the test set RMSE, while training the model to the end of 2010. Can you beat the seasonal naïve approach from Exercise 8 in Section 3.7?

|  | ME | RMSE | MAE | MPE | MAPE | MASE | ACF1 | Theil's U |
|---|---|---|---|---|---|---|---|---|
| Training set | 2.658058 | 25.00649 | 18.25149 | 0.2228273 | 1.965971 | 0.2958851 | -0.0067375 | NA |
| Test set | -63.670299 | 77.04807 | 65.91346 | -2.9161237 | 3.023249 | 1.0685599 | 0.4473749 | 0.5550438 |

The test-set RMSE of 77.04807 for the Holt-Winters multiplicative model is much better (lower) than the 109.62545 obtained with the seasonal naïve approach in Exercise 8 of Section 3.7.
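A sketch of the train/test comparison, assuming the split at the end of 2010 described above:

```r
train <- window(retail, end = c(2010, 12))
fc_hw <- hw(train, h = length(retail) - length(train),
            seasonal = "multiplicative")

# With the full series as the second argument, accuracy() reports both
# training and test set error measures.
accuracy(fc_hw, retail)
```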

Exercise 7.9

For the same retail data, try an STL decomposition applied to the Box-Cox transformed series, followed by ETS on the seasonally adjusted data. How does that compare with your best previous forecasts on the test set?

## ETS(A,A,N) 
## 
## Call:
##  ets(y = train) 
## 
##   Smoothing parameters:
##     alpha = 0.2394 
##     beta  = 1e-04 
## 
##   Initial states:
##     l = 10.444 
##     b = 0.0202 
## 
##   sigma:  0.0823
## 
##      AIC     AICc      BIC 
## 298.8909 299.0678 318.1086

|  | ME | RMSE | MAE | MPE | MAPE | MASE | ACF1 | Theil's U |
|---|---|---|---|---|---|---|---|---|
| Training set | 0.0010537 | 0.0818318 | 0.0646293 | 0.0105154 | 0.4715236 | 0.2635096 | -0.1163447 | NA |
| Test set | -0.2236219 | 0.2536921 | 0.2251834 | -1.2619503 | 1.2708319 | 0.9181288 | 0.7350465 | 3.30654 |

## 
##  Ljung-Box test
## 
## data:  Residuals from ETS(A,A,N)
## Q* = 370.11, df = 20, p-value < 2.2e-16
## 
## Model df: 4.   Total lags used: 24
## Training set     Test set 
##     1.084576     1.281013
## ETS(A,A,N) 
## 
## Call:
##  ets(y = boxcox_STL_adj_data) 
## 
##   Smoothing parameters:
##     alpha = 0.2571 
##     beta  = 0.0025 
## 
##   Initial states:
##     l = 10.4664 
##     b = 0.0213 
## 
##   sigma:  0.0809
## 
##      AIC     AICc      BIC 
## 354.0372 354.1972 373.7512

## 
##  Ljung-Box test
## 
## data:  Residuals from ETS(A,A,N)
## Q* = 389.04, df = 20, p-value < 2.2e-16
## 
## Model df: 4.   Total lags used: 24
## [1] 1.08312

Whether we use the entire data set for training or only the data up to the end of 2010, the ets function produces an overly optimistic forecast for the retail data: the series begins to level off at about 2010, but ets forecasts that retail sales will continue to rise in an exponential growth pattern. The RMSE is hard to compare with the earlier models because the errors above are measured on the Box-Cox-transformed scale; I’m not sure the back-transformation was done correctly, but if it was, the test RMSE of about 1.28 is dramatically smaller than the previous models’ RMSEs, which were all above 25.
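One way to get a test RMSE that is directly comparable with the earlier models is stlf(), which performs the STL decomposition, fits ETS to the seasonally adjusted series, re-seasonalises, and back-transforms in a single call; a sketch under the same train/test split as before:

```r
fc_stl <- stlf(train, h = length(retail) - length(train),
               lambda = BoxCox.lambda(train), method = "ets")

# Errors here are computed on the original scale, so this RMSE can be
# compared directly with the Holt-Winters results above.
accuracy(fc_stl, retail)
```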