DATA 624 Homework 5

Question 7.1
Question 7.5
Question 7.6
Question 7.7
Question 7.8
Question 7.9

Question 7.1

Consider the pigs series — the number of pigs slaughtered in Victoria each month.

Use the ses() function in R to find the optimal values of \(\alpha\) and \(\iota_0\), and generate forecasts for the next four months.

ses_pigs <- ses(pigs, h = 4)
summary(ses_pigs)


Forecast method: Simple exponential smoothing

Model Information:
Simple exponential smoothing 

Call:
 ses(y = pigs, h = 4) 

  Smoothing parameters:
    alpha = 0.2971 

  Initial states:
    l = 77260.0561 

  sigma:  10308.58

     AIC     AICc      BIC 
4462.955 4463.086 4472.665 

Error measures:
                   ME    RMSE      MAE       MPE     MAPE      MASE       ACF1
Training set 385.8721 10253.6 7961.383 -0.922652 9.274016 0.7966249 0.01282239

Forecasts:
         Point Forecast    Lo 80    Hi 80    Lo 95    Hi 95
Sep 1995       98816.41 85605.43 112027.4 78611.97 119020.8
Oct 1995       98816.41 85034.52 112598.3 77738.83 119894.0
Nov 1995       98816.41 84486.34 113146.5 76900.46 120732.4
Dec 1995       98816.41 83958.37 113674.4 76092.99 121539.8

The ses function generates and \(\alpha\) of 0.2971488 and a \(\iota_0\) of 77260.0561459.

Compute a 95% prediction interval for the first forecast using \(\hat{y} \pm\) 1.96 \(s\) where \(s\) is the standard deviation of the residuals.

# Get the first forecast
y_hat <- ses_pigs$mean[1]
# Get the standard deviation of the residuals
s <- sd(ses_pigs$residuals)
lower_ci <- y_hat - 1.96 * s
upper_ci <- y_hat + 1.96 * s

The 95% CI is from 78680 to 118953,

Compare your interval with the interval produced by R.

The 95% CI produced by R is 78612 to 119021.R’s CI is slightly wider than the CI we computed.

Question 7.5

Data set books contains the daily sales of paperback and hardcover books at the same store. The task is to forecast the next four days’ sales for paperback and hardcover books.

Plot the series and discuss the main features of the data.

autoplot(books) + 
  ggtitle("Daily Sale of Books")

The series has an upwards trend. There is only 30 days of data so I can’t really speak to an seasonality or weekly effects with much confidence. The paperback sales lagging hardback, or hardback and paperback being counter cyclical, but I would stress it’s really too soon to make these claims.

Use the ses() function to forecast each series, and plot the forecasts.

ses_paperback <- ses(books[, 1], h = 4)
autoplot(ses_paperback) +
  ylab("Paperback Book Sales")

ses_hardback <- ses(books[, 2], h = 4)
autoplot(ses_hardback) +
  ylab("Hardback Book Sales")

Compute the RMSE values for the training data in each case.

accuracy(ses_paperback)

                   ME     RMSE     MAE       MPE     MAPE      MASE       ACF1
Training set 7.175981 33.63769 27.8431 0.4736071 15.57784 0.7021303 -0.2117522

accuracy(ses_hardback)

                   ME     RMSE      MAE      MPE     MAPE      MASE       ACF1
Training set 9.166735 31.93101 26.77319 2.636189 13.39487 0.7987887 -0.1417763

The RMSE for the paperback model is 33.64 and 31.93 for the hardback model.

Question 7.6

We will continue with the daily sales of paperback and hardcover books in data set books.

Apply Holt’s linear method to the paperback and hardback series and compute four-day forecasts in each case.

holt_paperback <- holt(books[, 1], h = 4)
summary(holt_paperback)


Forecast method: Holt's method

Model Information:
Holt's method 

Call:
 holt(y = books[, 1], h = 4) 

  Smoothing parameters:
    alpha = 0.0001 
    beta  = 0.0001 

  Initial states:
    l = 170.699 
    b = 1.2621 

  sigma:  33.4464

     AIC     AICc      BIC 
318.3396 320.8396 325.3456 

Error measures:
                    ME     RMSE      MAE       MPE     MAPE      MASE
Training set -3.717178 31.13692 26.18083 -5.508526 15.58354 0.6602122
                   ACF1
Training set -0.1750792

Forecasts:
   Point Forecast    Lo 80    Hi 80    Lo 95    Hi 95
31       209.4668 166.6035 252.3301 143.9130 275.0205
32       210.7177 167.8544 253.5811 145.1640 276.2715
33       211.9687 169.1054 254.8320 146.4149 277.5225
34       213.2197 170.3564 256.0830 147.6659 278.7735

holt_hardback <- holt(books[, 2], h = 4)
autoplot(holt_hardback) +
  ylab("Hardback Book Sales")

summary(holt_hardback)


Forecast method: Holt's method

Model Information:
Holt's method 

Call:
 holt(y = books[, 2], h = 4) 

  Smoothing parameters:
    alpha = 0.0001 
    beta  = 0.0001 

  Initial states:
    l = 147.7935 
    b = 3.303 

  sigma:  29.2106

     AIC     AICc      BIC 
310.2148 312.7148 317.2208 

Error measures:
                     ME     RMSE      MAE       MPE    MAPE      MASE
Training set -0.1357882 27.19358 23.15557 -2.114792 12.1626 0.6908555
                    ACF1
Training set -0.03245186

Forecasts:
   Point Forecast    Lo 80    Hi 80    Lo 95    Hi 95
31       250.1739 212.7390 287.6087 192.9222 307.4256
32       253.4765 216.0416 290.9113 196.2248 310.7282
33       256.7791 219.3442 294.2140 199.5274 314.0308
34       260.0817 222.6468 297.5166 202.8300 317.3334

Compare the RMSE measures of Holt’s method for the two series to those of simple exponential smoothing in the previous question. (Remember that Holt’s method is using one more parameter than SES.) Discuss the merits of the two forecasting methods for these data sets.

Holt’s method does a better job than the SES models, because the RMSE is smaller for both the paperback and hardback series. This is understandable because Holt’s includes a trend component. SES assumes there is no trend. This does not really fit these timeseries.

Type	SES	Holt’s Method
Paperback	33.63769	31.13692
Hardback	31.93101	27.19358

Compare the forecasts for the two series using both methods. Which do you think is best?

p1 <- autoplot(ses_paperback) + ylab("Paperback Book Sales") + ggtitle("SES Forecast")
p2 <- autoplot(holt_paperback) + ylab("Paperback Book Sales") + ggtitle("Holt's Forecast")
grid.arrange(p1, p2, ncol = 2)

p1 <- autoplot(ses_hardback) + ylab("Hardback Book Sales") + ggtitle("SES Forecast")
p2 <- autoplot(holt_hardback) + ylab("Hardback Book Sales") + ggtitle("Holt's Forecast")
grid.arrange(p1, p2, ncol = 2)

I think Holt’s method is a better forecast. I like that factors in the trend.

Calculate a 95% prediction interval for the first forecast for each series, using the RMSE values and assuming normal errors. Compare your intervals with those produced using ses and holt.

cat("Paperback:", holt_paperback$mean[1] - 1.96 * accuracy(holt_paperback)[2], "to", holt_paperback$mean[1] + 1.96 * accuracy(holt_paperback)[2])

Paperback: 148.4384 to 270.4951

cat("Hardback:", holt_hardback$mean[1] - 1.96 * accuracy(holt_hardback)[2], "to", holt_hardback$mean[1] + 1.96 * accuracy(holt_hardback)[2])

Hardback: 196.8745 to 303.4733

The 95% CI for the paperback series generated by Holt’s method is 143.9130 to 275.0205. This is wider than the CI computed above. The CI for hardbacks from Holt’s method is 192.9222 to 307.4256, which again is wider than the CI above.

Question 7.7

For this exercise use data set eggs, the price of a dozen eggs in the United States from 1900–1993. Experiment with the various options in the holt() function to see how much the forecasts change with damped trend, or with a Box-Cox transformation. Try to develop an intuition of what each argument is doing to the forecasts.

[Hint: use h=100 when calling holt() so you can clearly see the differences between the various options when plotting the forecasts.]

Which model gives the best RMSE?

h <- 100

holt_eggs <- holt(eggs, h=h)
box_cox_eggs <- holt(eggs, h=h, lambda=TRUE)
damped_eggs <- holt(eggs, h=h, damped=TRUE)
damped_box_cox_eggs <- holt(eggs, h=h, damped=TRUE, lambda=TRUE)
exponential_eggs <- holt(eggs, h=h, exponential=TRUE)

autoplot(holt_eggs) + ggtitle("Holt's Method")

autoplot(damped_eggs) + ggtitle("Damped")

autoplot(box_cox_eggs) + ggtitle("Box-Cox")

autoplot(damped_box_cox_eggs) + ggtitle("Damped & Box-Cox")

autoplot(exponential_eggs) + ggtitle("Exponential")

get_rmse <- function(model){
  accuracy(model)[2]
}

Model <- c("Holt's Linear", 
           "Box-Cox Transformed",
           "Damped",
           "Damped and Box-Cox",
           "Exponential")

RMSE <- c(get_rmse(holt_eggs), 
          get_rmse(box_cox_eggs),
          get_rmse(damped_eggs),
          get_rmse(damped_box_cox_eggs),
          get_rmse(exponential_eggs))

eggs_rmse_df <- data.frame(Model, RMSE) 

eggs_rmse_df %>%
  kable() %>%
  kable_styling()

Model	RMSE
Holt’s Linear	26.58219
Box-Cox Transformed	26.55504
Damped	26.54019
Damped and Box-Cox	26.73445
Exponential	26.49795

The model with the lowest (best) RMSE is the Exponential model.

Question 7.8

Recall your retail time series data (from Exercise 3 in Section 2.10).

Why is multiplicative seasonality necessary for this series?

As we have seen in the previous sections, the variability in the series is increasing over time. This calls for a method that accounts for multiplicative seasonality. Here’s the plot in case you have forgotten what the time series looks like:

retaildata <- read_excel("retail.xlsx", skip = 1)
retail_ts <- ts(retaildata[, "A3349873A"], frequency = 12, start = c(1982, 4))
autoplot(retail_ts) + ylab("Retail Sales")

Apply Holt-Winters’ multiplicative method to the data. Experiment with making the trend damped.

hw_multiplicative <- hw(retail_ts, seasonal = "multiplicative")
autoplot(hw_multiplicative) + 
  ggtitle("Multiplicative") + 
  ylab("Retail Sales")

summary(hw_multiplicative)


Forecast method: Holt-Winters' multiplicative method

Model Information:
Holt-Winters' multiplicative method 

Call:
 hw(y = retail_ts, seasonal = "multiplicative") 

  Smoothing parameters:
    alpha = 0.504 
    beta  = 0.0001 
    gamma = 0.4578 

  Initial states:
    l = 62.8715 
    b = 0.8152 
    s = 0.9514 0.886 0.9114 1.5529 1.0184 0.9813
           0.9589 0.9898 0.9593 0.8883 0.9094 0.9929

  sigma:  0.0513

     AIC     AICc      BIC 
4040.084 4041.770 4107.112 

Error measures:
                    ME     RMSE      MAE        MPE     MAPE      MASE
Training set 0.1170648 13.29378 8.991856 -0.1217735 3.918351 0.4748948
                   ACF1
Training set 0.08635577

Forecasts:
         Point Forecast    Lo 80    Hi 80    Lo 95    Hi 95
Jan 2014       390.3784 364.7154 416.0413 351.1303 429.6264
Feb 2014       391.1995 362.4039 419.9951 347.1605 435.2386
Mar 2014       427.9732 393.4376 462.5088 375.1555 480.7909
Apr 2014       394.1500 359.7834 428.5167 341.5908 446.7093
May 2014       403.4598 365.8492 441.0704 345.9394 460.9802
Jun 2014       392.3988 353.6036 431.1940 333.0667 451.7309
Jul 2014       410.9940 368.1710 453.8169 345.5019 476.4860
Aug 2014       405.6186 361.3056 449.9315 337.8478 473.3893
Sep 2014       416.5669 369.0509 464.0828 343.8975 489.2362
Oct 2014       437.9753 385.9982 489.9524 358.4832 517.4674
Nov 2014       585.8096 513.6953 657.9240 475.5203 696.0990
Dec 2014       577.7851 504.1964 651.3737 465.2409 690.3292
Jan 2015       399.6599 342.8992 456.4206 312.8519 486.4679
Feb 2015       400.4831 342.1250 458.8412 311.2321 489.7341
Mar 2015       438.1104 372.6939 503.5270 338.0644 538.1564
Apr 2015       403.4687 341.8115 465.1258 309.1722 497.7652
May 2015       412.9807 348.4595 477.5019 314.3041 511.6574
Jun 2015       401.6414 337.5529 465.7300 303.6264 499.6565
Jul 2015       420.6566 352.1637 489.1496 315.9057 525.4076
Aug 2015       415.1371 346.2205 484.0538 309.7383 520.5360
Sep 2015       426.3243 354.2214 498.4272 316.0524 536.5961
Oct 2015       448.2152 371.0413 525.3891 330.1879 566.2425
Nov 2015       599.4807 494.4676 704.4937 438.8771 760.0842
Dec 2015       591.2440 485.9383 696.5497 430.1928 752.2952

hw_multiplicative_damped <- hw(retail_ts, seasonal = "multiplicative", damped = TRUE)
autoplot(hw_multiplicative_damped) + 
  ggtitle("Multiplicative & Damped") + 
  ylab("Retail Sales")

summary(hw_multiplicative_damped)


Forecast method: Damped Holt-Winters' multiplicative method

Model Information:
Damped Holt-Winters' multiplicative method 

Call:
 hw(y = retail_ts, seasonal = "multiplicative", damped = TRUE) 

  Smoothing parameters:
    alpha = 0.5524 
    beta  = 0.0002 
    gamma = 0.4476 
    phi   = 0.9328 

  Initial states:
    l = 62.9106 
    b = 0.6659 
    s = 0.8986 0.8635 0.8733 1.5546 1.1214 1.0392
           1.0033 0.9655 0.9238 0.8886 0.9303 0.9378

  sigma:  0.0527

     AIC     AICc      BIC 
4055.981 4057.871 4126.952 

Error measures:
                   ME     RMSE      MAE       MPE     MAPE      MASE       ACF1
Training set 1.414869 13.30494 9.042151 0.6105987 3.959617 0.4775511 0.04077895

Forecasts:
         Point Forecast    Lo 80    Hi 80    Lo 95    Hi 95
Jan 2014       391.3161 364.8883 417.7439 350.8983 431.7339
Feb 2014       392.2638 361.9876 422.5401 345.9603 438.5673
Mar 2014       427.6983 391.0174 464.3792 371.5997 483.7969
Apr 2014       391.6405 354.9948 428.2863 335.5957 447.6853
May 2014       399.2265 358.9916 439.4614 337.6925 460.7605
Jun 2014       387.6109 345.9350 429.2868 323.8731 451.3487
Jul 2014       405.5421 359.3641 451.7201 334.9189 476.1653
Aug 2014       399.6910 351.7735 447.6085 326.4076 472.9745
Sep 2014       410.5242 358.9526 462.0958 331.6522 489.3962
Oct 2014       430.9373 374.4342 487.4405 344.5233 517.3514
Nov 2014       574.0409 495.7448 652.3370 454.2974 693.7845
Dec 2014       564.4915 484.6264 644.3565 442.3484 686.6345
Jan 2015       391.6898 330.2053 453.1743 297.6574 485.7222
Feb 2015       392.6313 329.2488 456.0139 295.6961 489.5666
Mar 2015       428.0919 357.1258 499.0579 319.5587 536.6251
Apr 2015       391.9948 325.3521 458.6376 290.0736 493.9161
May 2015       399.5818 329.9960 469.1677 293.1595 506.0042
Jun 2015       387.9507 318.8208 457.0805 282.2257 493.6756
Jul 2015       405.8924 331.9582 479.8266 292.8198 518.9650
Aug 2015       400.0316 325.6130 474.4502 286.2182 513.8450
Sep 2015       410.8695 332.8718 488.8672 291.5822 530.1567
Oct 2015       431.2954 347.8100 514.7807 303.6156 558.9752
Nov 2015       574.5123 461.1985 687.8262 401.2138 747.8109
Dec 2015       564.9500 451.4869 678.4131 391.4232 738.4768

Compare the RMSE of the one-step forecasts from the two methods. Which do you prefer?

cat("RMSE of Multiplicative = ", accuracy(hw_multiplicative)[2])

RMSE of Multiplicative =  13.29378

cat("RMSE of Multiplicative & Damped = ", accuracy(hw_multiplicative_damped)[2])

RMSE of Multiplicative & Damped =  13.30494

The non-damped model is preforming better.

Check that the residuals from the best method look like white noise.

checkresiduals(hw_multiplicative)


    Ljung-Box test

data:  Residuals from Holt-Winters' multiplicative method
Q* = 40.405, df = 8, p-value = 0.000002692

Model df: 16.   Total lags used: 24

The residuals looks like white noise to me. They are normally distributed with a mean of zero, with no pattern over time.

Now find the test set RMSE, while training the model to the end of 2010. Can you beat the seasonal naïve approach from Exercise 8 in Section 3.7?

retail_train <- window(retail_ts, end = c(2010, 12))
retail_test <- window(retail_ts, start = 2011)

get_retail_test_rmse <- function(model){
  accuracy(model, retail_test)[4]
}

Model <- c("Seasonal Naïve (Baseline)", 
           "SES",
           "Holt's Method",
           "Damped Holt's Method",
           "Holt-Winters Additive",
           "Holt-Winters Multiplicative",
           "Damped Holt-Winters Additive",
           "Damped Holt-Winters Multiplicative")

RMSE <- c(get_retail_test_rmse(snaive(retail_train)), 
          get_retail_test_rmse(ses(retail_train)),
          get_retail_test_rmse(holt(retail_train)),
          get_retail_test_rmse(holt(retail_train, damped = TRUE)),
          get_retail_test_rmse(hw(retail_train, seasonal = "additive")),
          get_retail_test_rmse(hw(retail_train, seasonal = "multiplicative")),
          get_retail_test_rmse(hw(retail_train, seasonal = "additive", damped = TRUE)),
          get_retail_test_rmse(hw(retail_train, seasonal = "multiplicative", damped = TRUE)))

rmse_df <- data.frame(Model, RMSE) 

rmse_df %>%
  kable() %>%
  kable_styling()

Model	RMSE
Seasonal Naïve (Baseline)	71.44309
SES	20.35353
Holt’s Method	24.18842
Damped Holt’s Method	19.86123
Holt-Winters Additive	77.41974
Holt-Winters Multiplicative	70.11659
Damped Holt-Winters Additive	78.15090
Damped Holt-Winters Multiplicative	81.94650

This was and interesting exercise. I decided to run it through a buch of models, even some that didn’t make sense given this data set. Some of the more sophisticated approaches preformed less well on the test set than the seasonal naïve model. In the end the Damped Holt’s Method preformed the best on the test set.

Question 7.9

For the same retail data, try an STL decomposition applied to the Box-Cox transformed series, followed by ETS on the seasonally adjusted data. How does that compare with your best previous forecasts on the test set?

# Window doesn't return a ts object so we need to redefine
retail_train <- ts(as.vector(retail_ts), start=c(1982,4), end=c(2010,12), frequency = 12)
# Get the optimal lambda
lambda <- BoxCox.lambda(retail_train)
# Preform a Box-Cox transformation
bc_retail_train <- BoxCox(retail_train, lambda = lambda)
# Preform the STL decomposition
stl_retail_train <- mstl(bc_retail_train)
## Create a seasonally adjusted series
sa_stl_retail_train <- stl_retail_train[,1] - stl_retail_train[,3]
# Reverse the Box-Cox transformation
sa_retail_train <- InvBoxCox(sa_stl_retail_train, lambda = lambda)
# Check the output
autoplot(retail_train) + autolayer(sa_retail_train)

ets_retail_train <- forecast(sa_retail_train)

rmse_df %>%
  mutate(Model = as.character(Model)) %>%
  rbind(c("ETS on STL Seasonally-Adjusted data", get_retail_test_rmse(ets_retail_train))) %>%
  kable() %>%
  kable_styling()

Model	RMSE
Seasonal Naïve (Baseline)	71.443089238918
SES	20.3535314695857
Holt’s Method	24.1884222733178
Damped Holt’s Method	19.8612256038912
Holt-Winters Additive	77.4197424359606
Holt-Winters Multiplicative	70.1165860648465
Damped Holt-Winters Additive	78.150902954432
Damped Holt-Winters Multiplicative	81.946499394325
ETS on STL Seasonally-Adjusted data	90.7167828424442

The ETS model on the transformed data preformed worse than the Seasonal Naïve model. Again a simpler approach can be a better one.