HA 7.1, 7.5, 7.6, 7.7, 7.8, 7.9

library(GGally)
## Loading required package: ggplot2
library(fpp2)
## Loading required package: forecast
## Loading required package: fma
## 
## Attaching package: 'fma'
## The following object is masked from 'package:GGally':
## 
##     pigs
## Loading required package: expsmooth
library(seasonal)
library(readxl)
library(forecast)

Q 7.1

Consider the pigs series - the number of pigs slaughtered in Victoria each month. Use the ses() function in R to find the optimal values of α and ℓ0, and generate forecasts for the next four months.

fc <- ses(pigs, h=4)
fc[["model"]]
## Simple exponential smoothing 
## 
## Call:
##  ses(y = pigs, h = 4) 
## 
##   Smoothing parameters:
##     alpha = 0.2971 
## 
##   Initial states:
##     l = 77260.0561 
## 
##   sigma:  10308.58
## 
##      AIC     AICc      BIC 
## 4462.955 4463.086 4472.665
fc
##          Point Forecast    Lo 80    Hi 80    Lo 95    Hi 95
## Sep 1995       98816.41 85605.43 112027.4 78611.97 119020.8
## Oct 1995       98816.41 85034.52 112598.3 77738.83 119894.0
## Nov 1995       98816.41 84486.34 113146.5 76900.46 120732.4
## Dec 1995       98816.41 83958.37 113674.4 76092.99 121539.8
autoplot(pigs) +
  ylab("the number of pigs slaughtered in Victoria ") + xlab("Month")

autoplot(fc) +
  autolayer(fitted(fc), series="Fitted") +
  ylab("the number of pigs slaughtered in Victoria") + xlab("Month")

Compute a 95% prediction interval for the first forecast using ŷ ± 1.96s, where s is the standard deviation of the residuals. Compare your interval with the interval produced by R.

summary(fc)
## 
## Forecast method: Simple exponential smoothing
## 
## Model Information:
## Simple exponential smoothing 
## 
## Call:
##  ses(y = pigs, h = 4) 
## 
##   Smoothing parameters:
##     alpha = 0.2971 
## 
##   Initial states:
##     l = 77260.0561 
## 
##   sigma:  10308.58
## 
##      AIC     AICc      BIC 
## 4462.955 4463.086 4472.665 
## 
## Error measures:
##                    ME    RMSE      MAE       MPE     MAPE      MASE
## Training set 385.8721 10253.6 7961.383 -0.922652 9.274016 0.7966249
##                    ACF1
## Training set 0.01282239
## 
## Forecasts:
##          Point Forecast    Lo 80    Hi 80    Lo 95    Hi 95
## Sep 1995       98816.41 85605.43 112027.4 78611.97 119020.8
## Oct 1995       98816.41 85034.52 112598.3 77738.83 119894.0
## Nov 1995       98816.41 84486.34 113146.5 76900.46 120732.4
## Dec 1995       98816.41 83958.37 113674.4 76092.99 121539.8
checkresiduals(fc)

## 
##  Ljung-Box test
## 
## data:  Residuals from Simple exponential smoothing
## Q* = 55.356, df = 22, p-value = 0.0001057
## 
## Model df: 2.   Total lags used: 24

Manually calculated upper bound of the 95% interval:

98816.41 + 1.96*10253.6
## [1] 118913.5

For the upper 95% bound I got 118913.5, compared with 119020.8 from R.

Manually calculated lower bound of the 95% interval:

98816.41 - 1.96*10253.6
## [1] 78719.35

For the lower 95% bound I got 78719.35, compared with 78611.97 from R. The small differences arise because R bases its interval on the model's estimated sigma (10308.58) rather than the training RMSE used here.
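The exercise asks for s to be the standard deviation of the residuals, whereas I used the training RMSE above. A minimal sketch of the same interval computed from sd(residuals(fc)) directly (the result should be close to, but not identical to, the values above):

# Hedged sketch: 95% interval built from the residual standard deviation
s <- sd(residuals(fc))            # standard deviation of the training residuals
fc$mean[1] + c(-1, 1) * 1.96 * s  # lower and upper bound for the first forecast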

Q 7.5

Data set books contains the daily sales of paperback and hardcover books at the same store. The task is to forecast the next four days’ sales for paperback and hardcover books. Plot the series and discuss the main features of the data.

autoplot(books)+
  ylab("sales of paperback and hardcover books") + xlab("day")

We can clearly see an upward trend in both series, but no clear seasonality.

# ggseasonplot(books)  # not run: books is daily data with no seasonal period
gglagplot(books)

ggAcf(books)

Use the ses() function to forecast each series, and plot the forecasts.

Paperback <- books[, "Paperback"]
Hardcover <- books[, "Hardcover"]


fc <- ses(Paperback, h=4)
summary(fc)
## 
## Forecast method: Simple exponential smoothing
## 
## Model Information:
## Simple exponential smoothing 
## 
## Call:
##  ses(y = Paperback, h = 4) 
## 
##   Smoothing parameters:
##     alpha = 0.1685 
## 
##   Initial states:
##     l = 170.8271 
## 
##   sigma:  34.8183
## 
##      AIC     AICc      BIC 
## 318.9747 319.8978 323.1783 
## 
## Error measures:
##                    ME     RMSE     MAE       MPE     MAPE      MASE
## Training set 7.175981 33.63769 27.8431 0.4736071 15.57784 0.7021303
##                    ACF1
## Training set -0.2117522
## 
## Forecasts:
##    Point Forecast    Lo 80    Hi 80    Lo 95    Hi 95
## 31       207.1097 162.4882 251.7311 138.8670 275.3523
## 32       207.1097 161.8589 252.3604 137.9046 276.3147
## 33       207.1097 161.2382 252.9811 136.9554 277.2639
## 34       207.1097 160.6259 253.5935 136.0188 278.2005
autoplot(fc) +
  autolayer(fitted(fc), series="Fitted") +
  ylab("Paperback sales") + xlab("days")

fc1 <- ses(Hardcover, h=4)
summary(fc1)
## 
## Forecast method: Simple exponential smoothing
## 
## Model Information:
## Simple exponential smoothing 
## 
## Call:
##  ses(y = Hardcover, h = 4) 
## 
##   Smoothing parameters:
##     alpha = 0.3283 
## 
##   Initial states:
##     l = 149.2861 
## 
##   sigma:  33.0517
## 
##      AIC     AICc      BIC 
## 315.8506 316.7737 320.0542 
## 
## Error measures:
##                    ME     RMSE      MAE      MPE     MAPE      MASE
## Training set 9.166735 31.93101 26.77319 2.636189 13.39487 0.7987887
##                    ACF1
## Training set -0.1417763
## 
## Forecasts:
##    Point Forecast    Lo 80    Hi 80    Lo 95    Hi 95
## 31       239.5601 197.2026 281.9176 174.7799 304.3403
## 32       239.5601 194.9788 284.1414 171.3788 307.7414
## 33       239.5601 192.8607 286.2595 168.1396 310.9806
## 34       239.5601 190.8347 288.2855 165.0410 314.0792
autoplot(fc1) +
  autolayer(fitted(fc1), series="Fitted") +
  ylab("Hardcover sales") + xlab("days")

Compute the RMSE values for the training data in each case.

RMSE for Paperback

round(accuracy(fc),2)
##                ME  RMSE   MAE  MPE  MAPE MASE  ACF1
## Training set 7.18 33.64 27.84 0.47 15.58  0.7 -0.21

RMSE for Hardcover

round(accuracy(fc1),2)
##                ME  RMSE   MAE  MPE  MAPE MASE  ACF1
## Training set 9.17 31.93 26.77 2.64 13.39  0.8 -0.14

Q 7.6

Now apply Holt’s linear method to the paperback and hardback series and compute four-day forecasts in each case.

Paperback

fc_holt1 <- holt(Paperback, h=4)
summary(fc_holt1)
## 
## Forecast method: Holt's method
## 
## Model Information:
## Holt's method 
## 
## Call:
##  holt(y = Paperback, h = 4) 
## 
##   Smoothing parameters:
##     alpha = 1e-04 
##     beta  = 1e-04 
## 
##   Initial states:
##     l = 170.699 
##     b = 1.2621 
## 
##   sigma:  33.4464
## 
##      AIC     AICc      BIC 
## 318.3396 320.8396 325.3456 
## 
## Error measures:
##                     ME     RMSE      MAE       MPE     MAPE      MASE
## Training set -3.717178 31.13692 26.18083 -5.508526 15.58354 0.6602122
##                    ACF1
## Training set -0.1750792
## 
## Forecasts:
##    Point Forecast    Lo 80    Hi 80    Lo 95    Hi 95
## 31       209.4668 166.6035 252.3301 143.9130 275.0205
## 32       210.7177 167.8544 253.5811 145.1640 276.2715
## 33       211.9687 169.1054 254.8320 146.4149 277.5225
## 34       213.2197 170.3564 256.0830 147.6659 278.7735
autoplot(fc_holt1) +
  autolayer(fitted(fc_holt1), series="Fitted") +
  ylab("Paperback sales holt") + xlab("days")

The Holt training RMSE for Paperback is 31.14.

Hardcover

fc_holt2 <- holt(Hardcover, h=4)
summary(fc_holt2)
## 
## Forecast method: Holt's method
## 
## Model Information:
## Holt's method 
## 
## Call:
##  holt(y = Hardcover, h = 4) 
## 
##   Smoothing parameters:
##     alpha = 1e-04 
##     beta  = 1e-04 
## 
##   Initial states:
##     l = 147.7935 
##     b = 3.303 
## 
##   sigma:  29.2106
## 
##      AIC     AICc      BIC 
## 310.2148 312.7148 317.2208 
## 
## Error measures:
##                      ME     RMSE      MAE       MPE    MAPE      MASE
## Training set -0.1357882 27.19358 23.15557 -2.114792 12.1626 0.6908555
##                     ACF1
## Training set -0.03245186
## 
## Forecasts:
##    Point Forecast    Lo 80    Hi 80    Lo 95    Hi 95
## 31       250.1739 212.7390 287.6087 192.9222 307.4256
## 32       253.4765 216.0416 290.9113 196.2248 310.7282
## 33       256.7791 219.3442 294.2140 199.5274 314.0308
## 34       260.0817 222.6468 297.5166 202.8300 317.3334
autoplot(fc_holt2) +
  autolayer(fitted(fc_holt2), series="Fitted") +
  ylab("Hardcover sales holt") + xlab("days")

The Holt training RMSE for Hardcover is 27.19.

Compare the RMSE measures of Holt's method for the two series to those of simple exponential smoothing in the previous question. (Remember that Holt's method is using one more parameter than SES.) Discuss the merits of the two forecasting methods for these data sets.

For Paperback, the Holt RMSE is 31.14 versus 33.64 for SES; for Hardcover, the Holt RMSE is 27.19 versus 31.93 for SES. In both cases Holt's method gives a noticeably lower training RMSE.

Holt's method uses the additional beta parameter to model the trend component. Since both series show a clear upward trend, Holt's linear method is the more appropriate choice for these data sets.
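To put the four training RMSEs side by side, here is a minimal sketch that reuses the fitted objects fc, fc1, fc_holt1 and fc_holt2 from above:

# Hedged sketch: training RMSEs of SES and Holt's method for both series
rbind(
  Paperback = c(SES  = accuracy(fc)["Training set", "RMSE"],
                Holt = accuracy(fc_holt1)["Training set", "RMSE"]),
  Hardcover = c(SES  = accuracy(fc1)["Training set", "RMSE"],
                Holt = accuracy(fc_holt2)["Training set", "RMSE"]))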

Compare the forecasts for the two series using both methods. Which do you think is best?

Holt's linear trend method is the better choice because it has lower RMSE values.

Calculate a 95% prediction interval for the first forecast for each series, using the RMSE values and assuming normal errors. Compare your intervals with those produced using ses and holt.

95% prediction interval for the first forecast for Paperback

checkresiduals(fc_holt1)

## 
##  Ljung-Box test
## 
## data:  Residuals from Holt's method
## Q* = 15.081, df = 3, p-value = 0.001749
## 
## Model df: 4.   Total lags used: 7
209.46 + 1.96*31.13
## [1] 270.4748

The manually calculated upper bound is 270.47. The corresponding upper 95% bound is 275.35 from ses() and 275.02 from holt(). The manual interval is slightly narrower because R bases its interval on the model's estimated sigma (33.45 for the Holt fit) rather than the training RMSE.

95% prediction interval for the first forecast for Hardcover

checkresiduals(fc_holt2)

## 
##  Ljung-Box test
## 
## data:  Residuals from Holt's method
## Q* = 9.416, df = 3, p-value = 0.02424
## 
## Model df: 4.   Total lags used: 7
250.17 + 1.96*27.19
## [1] 303.4624

The manually calculated upper bound for Hardcover is 303.46, compared with 304.34 from ses() and 307.43 from holt().

These values are not far from the model-based intervals. The manual value happens to sit closer to the ses() bound; the gap with holt() is again because R uses the model's estimated sigma (29.21) rather than the training RMSE of 27.19.
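For completeness, a minimal sketch that computes both manual bounds for each series from the Holt point forecasts and training RMSEs:

# Hedged sketch: manual 95% intervals, point forecast +/- 1.96 * training RMSE
rmse_pb <- accuracy(fc_holt1)["Training set", "RMSE"]
rmse_hc <- accuracy(fc_holt2)["Training set", "RMSE"]
fc_holt1$mean[1] + c(-1, 1) * 1.96 * rmse_pb  # Paperback
fc_holt2$mean[1] + c(-1, 1) * 1.96 * rmse_hc  # Hardcover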

Q 7.7

For this exercise use data set eggs, the price of a dozen eggs in the United States from 1900-1993. Experiment with the various options in the holt() function to see how much the forecasts change with damped trend, or with a Box-Cox transformation. Try to develop an intuition of what each argument is doing to the forecasts.

[Hint: use h=100 when calling holt() so you can clearly see the differences between the various options when plotting the forecasts.]

Which model gives the best RMSE?

fc <- holt(eggs, h=100)
fc2 <- holt(eggs, damped=TRUE, phi = 0.9, h=100)
fc4 <- holt(eggs, damped=TRUE, phi = 0.9, lambda=0.5,  h=100)
fc5 <- holt(eggs, damped=TRUE, phi = 0.95, lambda=0,  h=100)
fc3 <- holt(eggs, damped=FALSE, phi = NULL, h=100)
autoplot(eggs) +
  autolayer(fc, series="Holt's method", PI=FALSE) +
  autolayer(fc2, series="Damped, phi = 0.9", PI=FALSE) +
  autolayer(fc3, series="Not damped", PI=FALSE) +
  autolayer(fc4, series="Damped, phi = 0.9, lambda = 0.5", PI=FALSE) +
  autolayer(fc5, series="Damped, phi = 0.95, lambda = 0", PI=FALSE) +
  ggtitle("Forecasts from Holt's method") + xlab("Year") +
  ylab("price of dozen eggs in US") +
  guides(colour=guide_legend(title="Forecast"))

round(accuracy(fc),2)
##                ME  RMSE   MAE   MPE MAPE MASE ACF1
## Training set 0.04 26.58 19.18 -1.14 9.65 0.95 0.01
round(accuracy(fc2),2)
##                 ME  RMSE   MAE   MPE  MAPE MASE  ACF1
## Training set -2.58 26.62 19.53 -2.83 10.11 0.96 -0.01
round(accuracy(fc3),2)
##                ME  RMSE   MAE   MPE MAPE MASE ACF1
## Training set 0.04 26.58 19.18 -1.14 9.65 0.95 0.01
round(accuracy(fc4),2)
##                ME RMSE   MAE   MPE  MAPE MASE  ACF1
## Training set -2.7 26.6 19.56 -2.82 10.11 0.96 -0.03
round(accuracy(fc5),2)
##                ME  RMSE  MAE  MPE MAPE MASE ACF1
## Training set -2.1 26.55 19.3 -2.6 9.99 0.95 0.01

Because too many lines overlap in the graph above, I will plot them in pairs below.

fc <- holt(eggs, h=100)
fc2 <- holt(eggs, damped=TRUE, phi = 0.9, h=100)

autoplot(eggs) +
    autolayer(fc2, series="Damped phi 0.9", PI=FALSE) +
  autolayer(fc, series="Holt's method", PI=FALSE) +
  
  ylab("price of dozen eggs in US") +
  guides(colour=guide_legend(title="Forecast"))

round(accuracy(fc),2)
##                ME  RMSE   MAE   MPE MAPE MASE ACF1
## Training set 0.04 26.58 19.18 -1.14 9.65 0.95 0.01
round(accuracy(fc2),2)
##                 ME  RMSE   MAE   MPE  MAPE MASE  ACF1
## Training set -2.58 26.62 19.53 -2.83 10.11 0.96 -0.01
fc <- holt(eggs, h=100)
fc2 <- holt(eggs, damped=TRUE, phi = 0.8, h=100)

autoplot(eggs) +
    autolayer(fc2, series="Damped phi 0.8", PI=FALSE) +
  autolayer(fc, series="Holt's method", PI=FALSE) +
  
  ylab("price of dozen eggs in US") +
  guides(colour=guide_legend(title="Forecast"))

round(accuracy(fc),2)
##                ME  RMSE   MAE   MPE MAPE MASE ACF1
## Training set 0.04 26.58 19.18 -1.14 9.65 0.95 0.01
round(accuracy(fc2),2)
##                 ME  RMSE   MAE   MPE MAPE MASE  ACF1
## Training set -3.31 26.66 19.46 -3.09 10.1 0.96 -0.01
fc <- holt(eggs, h=100)
fc2 <- holt(eggs, damped=TRUE, phi = 0.8, lambda=1,  h=100)

autoplot(eggs) +
    autolayer(fc2, series="Damped phi 0.8 lambda =1", PI=FALSE) +
  autolayer(fc, series="Holt's method", PI=FALSE) +
  
  ylab("price of dozen eggs in US") +
  guides(colour=guide_legend(title="Forecast"))

round(accuracy(fc),2)
##                ME  RMSE   MAE   MPE MAPE MASE ACF1
## Training set 0.04 26.58 19.18 -1.14 9.65 0.95 0.01
round(accuracy(fc2),2)
##                 ME  RMSE   MAE   MPE MAPE MASE  ACF1
## Training set -3.31 26.66 19.46 -3.09 10.1 0.96 -0.01
fc <- holt(eggs, h=100)
fc2 <- holt(eggs, damped=FALSE, lambda=0,  h=100)

autoplot(eggs) +
    autolayer(fc2, series="NO Damped lambda =0", PI=FALSE) +
  autolayer(fc, series="Holt's method", PI=FALSE) +
  
  ylab("price of dozen eggs in US") +
  guides(colour=guide_legend(title="Forecast"))

round(accuracy(fc),2)
##                ME  RMSE   MAE   MPE MAPE MASE ACF1
## Training set 0.04 26.58 19.18 -1.14 9.65 0.95 0.01
round(accuracy(fc2),2)
##                ME RMSE   MAE   MPE MAPE MASE ACF1
## Training set 1.46 26.4 19.09 -0.85 9.66 0.94 0.05

From the above, it looks like no damping combined with a Box-Cox transformation with lambda = 0 (i.e., a log transformation of the series) yields the best fit, with the lowest RMSE of 26.4.
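To make the comparison more systematic, here is a minimal sketch that refits a few holt() variants and tabulates their training RMSEs in one step (the list names are just my own labels):

# Hedged sketch: training RMSE for several holt() variants on the eggs series
variants <- list(
  holt       = holt(eggs, h = 100),
  damped     = holt(eggs, damped = TRUE, h = 100),
  log        = holt(eggs, lambda = 0, h = 100),
  damped_log = holt(eggs, damped = TRUE, lambda = 0, h = 100))
sapply(variants, function(m) accuracy(m)["Training set", "RMSE"])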

Q 7.8

Recall your retail time series data (from Exercise 3 in Section 2.10). Why is multiplicative seasonality necessary for this series?

retaildata <- readxl::read_excel("C:/Users/Mezu/Documents/retail.xlsx", skip=1)
myts <- ts(retaildata[,"A3349873A"],
  frequency=12, start=c(1982,4))
autoplot(myts)

myts %>% decompose(type="multiplicative") %>%
  autoplot() + xlab("Year") +
  ggtitle("Classical multiplicative decomposition
    of retail sales index")

The multiplicative method is preferred for this series because the seasonal variations change in proportion to the level of the series.
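As a quick cross-check of that intuition, a minimal sketch that fits both seasonal variants of Holt-Winters and compares their training RMSE on the original scale (fit_add and fit_mult are names I introduce here):

# Hedged sketch: additive vs multiplicative Holt-Winters on the retail series
fit_add  <- hw(myts, seasonal = "additive")
fit_mult <- hw(myts, seasonal = "multiplicative")
c(additive       = accuracy(fit_add)["Training set", "RMSE"],
  multiplicative = accuracy(fit_mult)["Training set", "RMSE"])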

Apply Holt-Winters’ multiplicative method to the data. Experiment with making the trend damped.

fit2 <- hw(myts,seasonal="multiplicative")
autoplot(myts) +
  autolayer(fit2, series="HW multiplicative forecasts",
    PI=FALSE) +
  xlab("Year") +
  ylab("Retail sales volume") +
  ggtitle("Retails ") +
  guides(colour=guide_legend(title="Forecast"))

Now let's make the trend damped:

fit3 <- hw(myts,damped=TRUE,seasonal="multiplicative")
autoplot(myts) +
  autolayer(fit3, series="HW multiplicative forecasts",
    PI=FALSE) +
  xlab("Year") +
  ylab("Retail sales volume") +
  ggtitle("Retails ") +
  guides(colour=guide_legend(title="Forecast"))

Compare the RMSE of the one-step forecasts from the two methods. Which do you prefer?

summary(fit2)
## 
## Forecast method: Holt-Winters' multiplicative method
## 
## Model Information:
## Holt-Winters' multiplicative method 
## 
## Call:
##  hw(y = myts, seasonal = "multiplicative") 
## 
##   Smoothing parameters:
##     alpha = 0.504 
##     beta  = 1e-04 
##     gamma = 0.4578 
## 
##   Initial states:
##     l = 62.8715 
##     b = 0.8152 
##     s = 0.9514 0.886 0.9114 1.5529 1.0184 0.9813
##            0.9589 0.9898 0.9593 0.8883 0.9094 0.9929
## 
##   sigma:  0.0513
## 
##      AIC     AICc      BIC 
## 4040.084 4041.770 4107.112 
## 
## Error measures:
##                     ME     RMSE      MAE        MPE     MAPE      MASE
## Training set 0.1170648 13.29378 8.991856 -0.1217735 3.918351 0.4748948
##                    ACF1
## Training set 0.08635577
## 
## Forecasts:
##          Point Forecast    Lo 80    Hi 80    Lo 95    Hi 95
## Jan 2014       390.3784 364.7154 416.0413 351.1303 429.6264
## Feb 2014       391.1995 362.4039 419.9951 347.1605 435.2386
## Mar 2014       427.9732 393.4376 462.5088 375.1555 480.7909
## Apr 2014       394.1500 359.7834 428.5167 341.5908 446.7093
## May 2014       403.4598 365.8492 441.0704 345.9394 460.9802
## Jun 2014       392.3988 353.6036 431.1940 333.0667 451.7309
## Jul 2014       410.9940 368.1710 453.8169 345.5019 476.4860
## Aug 2014       405.6186 361.3056 449.9315 337.8478 473.3893
## Sep 2014       416.5669 369.0509 464.0828 343.8975 489.2362
## Oct 2014       437.9753 385.9982 489.9524 358.4832 517.4674
## Nov 2014       585.8096 513.6953 657.9240 475.5203 696.0990
## Dec 2014       577.7851 504.1964 651.3737 465.2409 690.3292
## Jan 2015       399.6599 342.8992 456.4206 312.8519 486.4679
## Feb 2015       400.4831 342.1250 458.8412 311.2321 489.7341
## Mar 2015       438.1104 372.6939 503.5270 338.0644 538.1564
## Apr 2015       403.4687 341.8115 465.1258 309.1722 497.7652
## May 2015       412.9807 348.4595 477.5019 314.3041 511.6574
## Jun 2015       401.6414 337.5529 465.7300 303.6264 499.6565
## Jul 2015       420.6566 352.1637 489.1496 315.9057 525.4076
## Aug 2015       415.1371 346.2205 484.0538 309.7383 520.5360
## Sep 2015       426.3243 354.2214 498.4272 316.0524 536.5961
## Oct 2015       448.2152 371.0413 525.3891 330.1879 566.2425
## Nov 2015       599.4807 494.4676 704.4937 438.8771 760.0842
## Dec 2015       591.2440 485.9383 696.5497 430.1928 752.2952
summary(fit3)
## 
## Forecast method: Damped Holt-Winters' multiplicative method
## 
## Model Information:
## Damped Holt-Winters' multiplicative method 
## 
## Call:
##  hw(y = myts, seasonal = "multiplicative", damped = TRUE) 
## 
##   Smoothing parameters:
##     alpha = 0.5524 
##     beta  = 2e-04 
##     gamma = 0.4476 
##     phi   = 0.9328 
## 
##   Initial states:
##     l = 62.9106 
##     b = 0.6659 
##     s = 0.8986 0.8635 0.8733 1.5546 1.1214 1.0392
##            1.0033 0.9655 0.9238 0.8886 0.9303 0.9378
## 
##   sigma:  0.0527
## 
##      AIC     AICc      BIC 
## 4055.981 4057.871 4126.952 
## 
## Error measures:
##                    ME     RMSE      MAE       MPE     MAPE      MASE
## Training set 1.414869 13.30494 9.042151 0.6105987 3.959617 0.4775511
##                    ACF1
## Training set 0.04077895
## 
## Forecasts:
##          Point Forecast    Lo 80    Hi 80    Lo 95    Hi 95
## Jan 2014       391.3161 364.8883 417.7439 350.8983 431.7339
## Feb 2014       392.2638 361.9876 422.5401 345.9603 438.5673
## Mar 2014       427.6983 391.0174 464.3792 371.5997 483.7969
## Apr 2014       391.6405 354.9948 428.2863 335.5957 447.6853
## May 2014       399.2265 358.9916 439.4614 337.6925 460.7605
## Jun 2014       387.6109 345.9350 429.2868 323.8731 451.3487
## Jul 2014       405.5421 359.3641 451.7201 334.9189 476.1653
## Aug 2014       399.6910 351.7735 447.6085 326.4076 472.9745
## Sep 2014       410.5242 358.9526 462.0958 331.6522 489.3962
## Oct 2014       430.9373 374.4342 487.4405 344.5233 517.3514
## Nov 2014       574.0409 495.7448 652.3370 454.2974 693.7845
## Dec 2014       564.4915 484.6264 644.3565 442.3484 686.6345
## Jan 2015       391.6898 330.2053 453.1743 297.6574 485.7222
## Feb 2015       392.6313 329.2488 456.0139 295.6961 489.5666
## Mar 2015       428.0919 357.1258 499.0579 319.5587 536.6251
## Apr 2015       391.9948 325.3521 458.6376 290.0736 493.9161
## May 2015       399.5818 329.9960 469.1677 293.1595 506.0042
## Jun 2015       387.9507 318.8208 457.0805 282.2257 493.6756
## Jul 2015       405.8924 331.9582 479.8266 292.8198 518.9650
## Aug 2015       400.0316 325.6130 474.4502 286.2182 513.8450
## Sep 2015       410.8695 332.8718 488.8672 291.5822 530.1567
## Oct 2015       431.2954 347.8100 514.7807 303.6156 558.9752
## Nov 2015       574.5123 461.1985 687.8262 401.2138 747.8109
## Dec 2015       564.9500 451.4869 678.4131 391.4232 738.4768

The RMSEs are almost identical (about 13.3), but the non-damped version is marginally better (13.29 vs 13.30). I prefer the non-damped version because the series appears to be trending steadily upwards.
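The same comparison can be pulled out programmatically; a minimal sketch reusing fit2 and fit3:

# Hedged sketch: one-step (training) RMSE of the two Holt-Winters fits
c(HW_multiplicative = accuracy(fit2)["Training set", "RMSE"],
  HW_damped         = accuracy(fit3)["Training set", "RMSE"])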

Check that the residuals from the best method look like white noise.

checkresiduals(fit2)

## 
##  Ljung-Box test
## 
## data:  Residuals from Holt-Winters' multiplicative method
## Q* = 40.405, df = 8, p-value = 2.692e-06
## 
## Model df: 16.   Total lags used: 24

No. The Ljung-Box test gives a very small p-value (2.7e-06), so the residuals still show significant autocorrelation and do not look like white noise.

Now find the test set RMSE, while training the model to the end of 2010. Can you beat the seasonal naïve approach from Exercise 8 in Section 3.7?

Split the data into two parts using

myts.train <- window(myts, end=c(2010,12))
myts.test <- window(myts, start=2011)

Check that your data have been split appropriately by producing the following plot.

autoplot(myts) +
  autolayer(myts.train, series="Training") +
  autolayer(myts.test, series="Test")

Calculate forecasts using snaive applied to myts.train.

fc <- snaive(myts.train)

Compare the accuracy of your forecasts against the actual values stored in myts.test

accuracy(fc,myts.test)
##                     ME     RMSE      MAE       MPE      MAPE     MASE
## Training set  7.772973 20.24576 15.95676  4.702754  8.109777 1.000000
## Test set     55.300000 71.44309 55.78333 14.900996 15.082019 3.495907
##                   ACF1 Theil's U
## Training set 0.7385090        NA
## Test set     0.5315239  1.297866

Now let's compare with the Holt-Winters method. Calculate forecasts using Holt-Winters' multiplicative method applied to myts.train.

fit_train <-  hw(myts.train,seasonal="multiplicative")

Compare the accuracy of the forecasts against the actual values stored in myts.test.

accuracy(fit_train,myts.test)
##                       ME      RMSE       MAE          MPE      MAPE
## Training set  0.03021223  9.107356  6.553533  0.001995484  3.293399
## Test set     55.33771444 70.116586 55.337714 15.173207794 15.173208
##                   MASE       ACF1 Theil's U
## Training set 0.4107058 0.02752875        NA
## Test set     3.4679801 0.35287516   1.30502

We can see that the Holt-Winters test-set RMSE of 70.1 beats the 71.4 of the seasonal naive method from Exercise 8 in Section 3.7.
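Side by side, a minimal sketch of the two test-set RMSEs (fc is the snaive forecast fitted above):

# Hedged sketch: test-set RMSE of seasonal naive vs Holt-Winters multiplicative
c(snaive  = accuracy(fc, myts.test)["Test set", "RMSE"],
  hw_mult = accuracy(fit_train, myts.test)["Test set", "RMSE"])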

Q 7.9

For the same retail data, try an STL decomposition applied to the Box-Cox transformed series, followed by ETS on the seasonally adjusted data. How does that compare with your best previous forecasts on the test set?

(lambda <- BoxCox.lambda(myts.train))
## [1] 0.1979682
autoplot(BoxCox(myts.train,lambda))

BoxCox(myts.train,lambda) %>%
  mstl() %>%
  autoplot()

BoxCox(myts.train,lambda) %>%
  mstl() -> fit_stl

autoplot(BoxCox(myts.train,lambda), series="Data") +
  autolayer(trendcycle(fit_stl), series="Trend") +
  autolayer(seasadj(fit_stl), series="Seasonally Adjusted") +
  xlab("Year") + ylab("New orders index") +
  ggtitle("Retail sales using STL decomposition") +
  scale_colour_manual(values=c("gray","blue","red"),
             breaks=c("Data","Seasonally Adjusted","Trend"))

Let's apply ETS to the seasonally adjusted data.

fit_ets <- ets(seasadj(fit_stl))

summary(fit_ets)
## ETS(M,A,N) 
## 
## Call:
##  ets(y = seasadj(fit_stl)) 
## 
##   Smoothing parameters:
##     alpha = 0.6228 
##     beta  = 0.0048 
## 
##   Initial states:
##     l = 6.5203 
##     b = 0.0173 
## 
##   sigma:  0.0115
## 
##      AIC     AICc      BIC 
## 464.4450 464.6219 483.6627 
## 
## Training set error measures:
##                        ME      RMSE        MAE         MPE      MAPE
## Training set -0.007101408 0.1042718 0.08102137 -0.07590578 0.8926668
##                   MASE       ACF1
## Training set 0.3452987 0.01960116
BoxCox(myts.test,lambda) %>%
  mstl() -> fit_stl_test


accuracy(forecast(fit_ets),seasadj(fit_stl_test))
##                        ME      RMSE        MAE         MPE      MAPE
## Training set -0.007101408 0.1042718 0.08102137 -0.07590578 0.8926668
## Test set      0.543429877 0.5812967 0.54342988  4.95118513 4.9511851
##                   MASE       ACF1 Theil's U
## Training set 0.3452987 0.01960116        NA
## Test set     2.3160016 0.72904955  4.387736

The test-set RMSE of 0.581 looks far smaller than the earlier numbers, but it is computed on the Box-Cox transformed, seasonally adjusted scale, so it cannot be compared directly with the Holt-Winters test RMSE of 70.1. A fairer comparison is the ratio of test-set RMSE to training-set RMSE: roughly 7.7 (70.1 / 9.1) for Holt-Winters versus roughly 5.6 (0.581 / 0.104) for the STL plus ETS approach, so the STL plus ETS approach degrades less from training to test in relative terms.
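The ratio comparison in the last paragraph can be computed directly; a minimal sketch reusing the fitted objects above:

# Hedged sketch: ratio of test-set RMSE to training-set RMSE for both approaches
acc_hw  <- accuracy(fit_train, myts.test)
acc_stl <- accuracy(forecast(fit_ets), seasadj(fit_stl_test))
c(hw_multiplicative = acc_hw["Test set", "RMSE"] / acc_hw["Training set", "RMSE"],
  stl_ets = acc_stl["Test set", "RMSE"] / acc_stl["Training set", "RMSE"])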