Question 7.1

Consider the pigs series — the number of pigs slaughtered in Victoria each month.

a.

Use the ses() function in R to find the optimal values of \(\alpha\) and \(\ell_0\), and generate forecasts for the next four months.

library(fpp2)  # forecast functions and exercise data, if not already loaded

pigs_ses <- ses(pigs, h = 4)
summary(pigs_ses)

Forecast method: Simple exponential smoothing

Model Information:
Simple exponential smoothing 

Call:
 ses(y = pigs, h = 4) 

  Smoothing parameters:
    alpha = 0.2971 

  Initial states:
    l = 77260.0561 

  sigma:  10308.58

     AIC     AICc      BIC 
4462.955 4463.086 4472.665 

Error measures:
                   ME    RMSE      MAE       MPE     MAPE      MASE       ACF1
Training set 385.8721 10253.6 7961.383 -0.922652 9.274016 0.7966249 0.01282239

Forecasts:
         Point Forecast    Lo 80    Hi 80    Lo 95    Hi 95
Sep 1995       98816.41 85605.43 112027.4 78611.97 119020.8
Oct 1995       98816.41 85034.52 112598.3 77738.83 119894.0
Nov 1995       98816.41 84486.34 113146.5 76900.46 120732.4
Dec 1995       98816.41 83958.37 113674.4 76092.99 121539.8

The ses() function estimates \(\alpha = 0.2971488\) and \(\ell_0 = 77260.0561459\). The book states that when \(\alpha\) is small (close to 0), more weight is given to observations from the distant past; when \(\alpha\) is large, more weight is given to recent observations. Here the relatively small \(\alpha\) means the forecasts still give substantial weight to more distant observations.
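To make this concrete, the weight SES places on the observation \(j\) periods in the past is \(\alpha(1-\alpha)^j\). A quick sketch of the decay for the fitted \(\alpha\) (output values rounded, shown as a comment):

alpha <- 0.2971
# weight on the observation j periods back: alpha * (1 - alpha)^j
round(alpha * (1 - alpha)^(0:5), 3)
# approximately: 0.297 0.209 0.147 0.103 0.073 0.051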

b.

Compute a 95% prediction interval for the first forecast using \(\hat{y} \pm 1.96s\), where \(s\) is the standard deviation of the residuals.

# Get the first forecast
y_hat <- pigs_ses$mean[1]
# Get the standard deviation of the residuals
s <- sd(pigs_ses$residuals)
lower_ci <- y_hat - 1.96 * s
upper_ci <- y_hat + 1.96 * s
c(lower_ci, upper_ci)

This gives a 95% prediction interval of roughly 78680 to 118953.

  1. Compare your interval with the interval produced by R.

The interval produced by R (78612 to 119021) is slightly wider, since R uses its estimated model standard deviation (sigma = 10308.58), which is a little larger than the sample standard deviation of the residuals.
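For a direct comparison, R's limits can be read straight off the forecast object; the lower and upper components are matrices with one column per interval level:

# R's own 95% limits for the first forecast
pigs_ses$lower[1, "95%"]
pigs_ses$upper[1, "95%"]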

Question 7.5

Data set books contains the daily sales of paperback and hardcover books at the same store. The task is to forecast the next four days’ sales for paperback and hardcover books.

a.

Plot the series and discuss the main features of the data.

# defaulttheme is a custom ggplot2 theme defined earlier in the document
autoplot(books) + 
  ggtitle("Daily Sale of Books") +
  defaulttheme

The plot shows an upward trend in book sales over time for both paperback and hardcover. The two series also appear to fluctuate together at times, hinting at a common pattern, but with only 30 days of data it is hard to separate genuine seasonality from noise.
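One quick way to probe whether the two series really move together is their sample correlation (a sketch; the column names are those of the books data set):

# Correlation between daily paperback and hardcover sales
cor(books[, "Paperback"], books[, "Hardcover"])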

b.

Use the ses() function to forecast each series, and plot the forecasts.

ses_paper <- ses(books[, 1], h = 4)
autoplot(ses_paper) +
  ylab("Paperback Sales")+
  defaulttheme

ses_hard <- ses(books[, 2], h = 4)
autoplot(ses_hard) +
  ylab("Hardback Sales")+
  defaulttheme

c. 

Compute the RMSE values for the training data in each case.

accuracy(ses_paper)
                   ME     RMSE     MAE       MPE     MAPE      MASE       ACF1
Training set 7.175981 33.63769 27.8431 0.4736071 15.57784 0.7021303 -0.2117522
accuracy(ses_hard)
                   ME     RMSE      MAE      MPE     MAPE      MASE       ACF1
Training set 9.166735 31.93101 26.77319 2.636189 13.39487 0.7987887 -0.1417763

The RMSE for the paperback model is 33.64 and 31.93 for the hardback model, so the SES fit is slightly better for hardbacks than for paperbacks. In both cases, however, the RMSE is quite large relative to the level of daily sales.
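As a sanity check, the training RMSE is just the root mean square of the model residuals:

# RMSE computed directly from the residuals
sqrt(mean(residuals(ses_paper)^2))
sqrt(mean(residuals(ses_hard)^2))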

Question 7.6

We will continue with the daily sales of paperback and hardcover books in data set books.

a.

Apply Holt’s linear method to the paperback and hardback series and compute four-day forecasts in each case.

holt_paper <- holt(books[, 1], h = 4)
summary(holt_paper)

Forecast method: Holt's method

Model Information:
Holt's method 

Call:
 holt(y = books[, 1], h = 4) 

  Smoothing parameters:
    alpha = 0.0001 
    beta  = 0.0001 

  Initial states:
    l = 170.699 
    b = 1.2621 

  sigma:  33.4464

     AIC     AICc      BIC 
318.3396 320.8396 325.3456 

Error measures:
                    ME     RMSE      MAE       MPE     MAPE      MASE
Training set -3.717178 31.13692 26.18083 -5.508526 15.58354 0.6602122
                   ACF1
Training set -0.1750792

Forecasts:
   Point Forecast    Lo 80    Hi 80    Lo 95    Hi 95
31       209.4668 166.6035 252.3301 143.9130 275.0205
32       210.7177 167.8544 253.5811 145.1640 276.2715
33       211.9687 169.1054 254.8320 146.4149 277.5225
34       213.2197 170.3564 256.0830 147.6659 278.7735
holt_hard <- holt(books[, 2], h = 4)
autoplot(holt_hard) +
  ylab("Hardback Sales")+
  defaulttheme

summary(holt_hard)

Forecast method: Holt's method

Model Information:
Holt's method 

Call:
 holt(y = books[, 2], h = 4) 

  Smoothing parameters:
    alpha = 0.0001 
    beta  = 0.0001 

  Initial states:
    l = 147.7935 
    b = 3.303 

  sigma:  29.2106

     AIC     AICc      BIC 
310.2148 312.7148 317.2208 

Error measures:
                     ME     RMSE      MAE       MPE    MAPE      MASE
Training set -0.1357882 27.19358 23.15557 -2.114792 12.1626 0.6908555
                    ACF1
Training set -0.03245186

Forecasts:
   Point Forecast    Lo 80    Hi 80    Lo 95    Hi 95
31       250.1739 212.7390 287.6087 192.9222 307.4256
32       253.4765 216.0416 290.9113 196.2248 310.7282
33       256.7791 219.3442 294.2140 199.5274 314.0308
34       260.0817 222.6468 297.5166 202.8300 317.3334

With Holt's method we can see the trend reflected in the forecasts, unlike the flat forecasts from SES. It still has no seasonal component, but it does a much better job of capturing the direction of the series.

b.

Compare the RMSE measures of Holt’s method for the two series to those of simple exponential smoothing in the previous question. (Remember that Holt’s method is using one more parameter than SES.) Discuss the merits of the two forecasting methods for these data sets.

Holt's method fits these series better than the SES models, likely because it includes a trend component. As the table below shows, the training RMSE is smaller for both the paperback and hardback series, which is consistent with the comparison of the forecast plots.

Type        SES        Holt's Method
Paperback   33.63769   31.13692
Hardback    31.93101   27.19358
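That table can be assembled programmatically from the fitted models (a sketch; rmse() is a small helper defined here):

# Training-set RMSE for each model
rmse <- function(model) accuracy(model)[, "RMSE"]
data.frame(Type = c("Paperback", "Hardback"),
           SES = c(rmse(ses_paper), rmse(ses_hard)),
           Holt = c(rmse(holt_paper), rmse(holt_hard)))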

c. 

Compare the forecasts for the two series using both methods. Which do you think is best?

library(gridExtra)  # for grid.arrange()

p1 <- autoplot(ses_paper) + ylab("Paperback Sales") + ggtitle("SES Forecast") + defaulttheme
p2 <- autoplot(holt_paper) + ylab("Paperback Sales") + ggtitle("Holt's Forecast") + defaulttheme
grid.arrange(p1, p2, ncol = 1)

p1 <- autoplot(ses_hard) + ylab("Hardback Sales") + ggtitle("SES Forecast")+defaulttheme
p2 <- autoplot(holt_hard) + ylab("Hardback Sales") + ggtitle("Holt's Forecast")+defaulttheme
grid.arrange(p1, p2, ncol = 1)

As stated previously, Holt's method does a better job of tracking both series because it includes the trend component.

d. 

Calculate a 95% prediction interval for the first forecast for each series, using the RMSE values and assuming normal errors. Compare your intervals with those produced using ses and holt.

cat("Paperback:", holt_paper$mean[1] - 1.96 * accuracy(holt_paper)[2], "to", holt_paper$mean[1] + 1.96 * accuracy(holt_paper)[2])
Paperback: 148.4384 to 270.4951
cat("Hardback:", holt_hard$mean[1] - 1.96 * accuracy(holt_hard)[2], "to", holt_hard$mean[1] + 1.96 * accuracy(holt_hard)[2])
Hardback: 196.8745 to 303.4733

Using the RMSE, the 95% prediction interval for the first paperback forecast is about 148 to 270, slightly narrower than the interval produced by holt() (143.9 to 275.0). Likewise, the RMSE-based interval for hardbacks, about 197 to 303, is narrower than holt()'s 192.9 to 307.4.
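The holt() intervals quoted in that comparison can be pulled off the forecast objects directly, using the same pattern as in Question 7.1:

# 95% limits for the first forecast of each series
c(holt_paper$lower[1, "95%"], holt_paper$upper[1, "95%"])
c(holt_hard$lower[1, "95%"], holt_hard$upper[1, "95%"])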

Question 7.7

For this exercise use data set eggs, the price of a dozen eggs in the United States from 1900–1993. Experiment with the various options in the holt() function to see how much the forecasts change with damped trend, or with a Box-Cox transformation. Try to develop an intuition of what each argument is doing to the forecasts.

[Hint: use h=100 when calling holt() so you can clearly see the differences between the various options when plotting the forecasts.]

Which model gives the best RMSE?

For this exercise, we experiment with several arguments of the holt() function, keeping h = 100 throughout. The options varied are damped, exponential, and lambda (a Box-Cox transformation), in various combinations, to see which gives the best prediction.

h <- 100

holt_eggs <- holt(eggs, h = h)
# lambda must be a Box-Cox parameter (TRUE would be coerced to 1, i.e. no transformation)
lambda_eggs <- holt(eggs, h = h, lambda = BoxCox.lambda(eggs))
damped_eggs <- holt(eggs, h = h, damped = TRUE)
damped_lambda_eggs <- holt(eggs, h = h, damped = TRUE, lambda = BoxCox.lambda(eggs))
exponential_eggs <- holt(eggs, h = h, exponential = TRUE)
exponential_damped_eggs <- holt(eggs, h = h, exponential = TRUE, damped = TRUE)


p1<-autoplot(holt_eggs) + ggtitle("Holt's Method")+defaulttheme
p2<-autoplot(damped_eggs) + ggtitle("Damped")+defaulttheme
p3<-autoplot(lambda_eggs) + ggtitle("Box-Cox")+defaulttheme
p4<-autoplot(damped_lambda_eggs) + ggtitle("Damped & Box-Cox")+defaulttheme
p5<-autoplot(exponential_eggs) + ggtitle("Exponential")+defaulttheme
p6<-autoplot(exponential_damped_eggs) + ggtitle("Exponential & Damped")+defaulttheme

gridExtra::grid.arrange(p1,p2,p3,p4,p5, p6, ncol = 2)

# Helper: extract the training-set RMSE (second column of the accuracy() output)
rmse_extract <- function(model){
  accuracy(model)[2]
}

data.frame(model = c("Holt's Linear", 
                     "Box-Cox Transformed",
                     "Damped",
                     "Damped and Box-Cox",
                     "Exponential",
                     "Exponential and Damped"),
           RMSE = c(rmse_extract(holt_eggs), 
                    rmse_extract(lambda_eggs),
                    rmse_extract(damped_eggs),
                    rmse_extract(damped_lambda_eggs),
                    rmse_extract(exponential_eggs),
                    rmse_extract(exponential_damped_eggs))) %>% 
  kable() %>%
  kable_styling()
model                    RMSE
Holt's Linear            26.58219
Box-Cox Transformed      26.55504
Damped                   26.54019
Damped and Box-Cox       26.73445
Exponential              26.49795
Exponential and Damped   26.59113

The lowest training RMSE (26.498) comes from Holt's method with an exponential trend. It should be noted, however, that the RMSE values are extremely close together, only fractions of a unit apart, so no option is clearly superior on this criterion.
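Because the training RMSEs are so close, time-series cross-validation would be a fairer tiebreaker. A minimal sketch for the plain and damped variants, using one-step-ahead errors:

# Cross-validated one-step-ahead RMSE
e_holt   <- tsCV(eggs, holt, h = 1)
e_damped <- tsCV(eggs, holt, damped = TRUE, h = 1)
sqrt(mean(e_holt^2, na.rm = TRUE))
sqrt(mean(e_damped^2, na.rm = TRUE))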

Question 7.8

Recall your retail time series data (from Exercise 3 in Section 2.10).

a.

Why is multiplicative seasonality necessary for this series?

The seasonal fluctuations grow with the level of the series over time, so multiplicative seasonality is appropriate: it lets the seasonal component scale with the level instead of staying constant. The plots below illustrate this.
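For reference, in Holt-Winters' multiplicative form the seasonal index multiplies the level-plus-trend term rather than being added to it (notation as in the textbook, with \(m\) the seasonal period and \(k\) the integer part of \((h-1)/m\)):

\[
\hat{y}_{t+h|t} = (\ell_t + hb_t)\,s_{t+h-m(k+1)}
\]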

library(httr)
url1<-"https://otexts.com/fpp2/extrafiles/retail.xlsx"
GET(url1, write_disk(tf <- tempfile(fileext = ".xlsx")))
Response [https://otexts.com/fpp2/extrafiles/retail.xlsx]
  Date: 2021-03-15 04:23
  Status: 200
  Content-Type: application/vnd.openxmlformats-officedocument.spreadsheetml.sheet
  Size: 639 kB
<ON DISK>  C:\Users\REGIST~1\AppData\Local\Temp\RtmpKCEC6P\filef944387f3246.xlsx
retaildata  <- readxl::read_excel(tf, skip = 1)


myts <- ts(retaildata[,"A3349873A"],
  frequency=12, start=c(1982,4))

autoplot(myts)+defaulttheme

ggseasonplot(myts, year.labels = T, year.labels.left = T)+defaulttheme

b.

Apply Holt-Winters’ multiplicative method to the data. Experiment with making the trend damped.

hw_multiplicative <- hw(myts, seasonal = "multiplicative")
p1<-autoplot(hw_multiplicative) + 
  ggtitle("Multiplicative") + 
  ylab("Sales")+defaulttheme


hw_multiplicative_damped <- hw(myts, seasonal = "multiplicative", damped = TRUE)
p2<-autoplot(hw_multiplicative_damped) + 
  ggtitle("Multiplicative & Damped") + 
  ylab("Sales")+defaulttheme

grid.arrange(p1,p2, ncol = 1)

c. 

Compare the RMSE of the one-step forecasts from the two methods. Which do you prefer?

cat("RMSE of Multiplicative = ", accuracy(hw_multiplicative)[2])
RMSE of Multiplicative =  13.29378
cat("RMSE of Multiplicative & Damped = ", accuracy(hw_multiplicative_damped)[2])
RMSE of Multiplicative & Damped =  13.30494

For this retail series, the non-damped version of the method performs slightly better, although the difference (13.29 vs 13.30) is arguably negligible.

d. 

Check that the residuals from the best method look like white noise.

checkresiduals(hw_multiplicative)


    Ljung-Box test

data:  Residuals from Holt-Winters' multiplicative method
Q* = 40.405, df = 8, p-value = 0.000002692

Model df: 16.   Total lags used: 24

The residuals are scattered around zero, as we would expect, but there is noticeable autocorrelation around lags 12-13, and the Ljung-Box test (p-value ≈ 2.7e-06) rejects the white-noise hypothesis. The residuals therefore do not look like white noise.
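The test that checkresiduals() reports can be reproduced manually; fitdf is the number of fitted parameters (16 here), which is subtracted from the lag count to give the test's 8 degrees of freedom:

# Ljung-Box test on the residuals (24 lags, 16 model parameters)
Box.test(residuals(hw_multiplicative), lag = 24, fitdf = 16, type = "Ljung-Box")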

  1. Now find the test set RMSE, while training the model to the end of 2010. Can you beat the seasonal naïve approach from Exercise 8 in Section 3.7?

ts_train <- window(myts, end = c(2010, 12))
ts_test <- window(myts, start = 2011)

train_hw <- hw(ts_train, h=36, seasonal = "multiplicative")
train_hw_damped <- hw(ts_train, h=36, seasonal = "multiplicative", damped = T)
train_snaive <- snaive(ts_train, h=36)

p1 <- autoplot(ts_train) + 
  autolayer(train_hw, series='Multiplicative') +
  autolayer(train_hw_damped, series='Multiplicative with damped trend') + 
  autolayer(train_snaive, series='Seasonal Naive') + 
  autolayer(ts_test,  series='Test set') + 
  theme(legend.position = "top") + 
  ylab("Retail Sales")+defaulttheme

p1

accuracy(train_hw, ts_test)[,2]
Training set     Test set 
    9.107356    94.806617 
accuracy(train_hw_damped, ts_test)[,2]
Training set     Test set 
    8.681456   111.911266 
accuracy(train_snaive, ts_test)[,2]
Training set     Test set 
    20.24576    100.00869 

This is an interesting exercise: for every method, the test-set RMSE far exceeds the training-set RMSE, which points to overfitting on the training period. The Holt-Winters multiplicative method performed best on the test set (RMSE 94.8), narrowly beating the seasonal naïve benchmark (100.0).
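Those test-set figures can be collected into one small table (a sketch; accuracy() returns one row each for the training and test sets):

# Test-set RMSE for each method
data.frame(
  Method = c("HW multiplicative", "HW damped", "Seasonal naive"),
  TestRMSE = c(accuracy(train_hw, ts_test)["Test set", "RMSE"],
               accuracy(train_hw_damped, ts_test)["Test set", "RMSE"],
               accuracy(train_snaive, ts_test)["Test set", "RMSE"]))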

Question 7.9

For the same retail data, try an STL decomposition applied to the Box-Cox transformed series, followed by ETS on the seasonally adjusted data. How does that compare with your best previous forecasts on the test set?

# STL decomposition of the Box-Cox transformed series, forecast with stlf()
stlf_boxcox_train <- stlf(ts_train, lambda = BoxCox.lambda(ts_train))

# ETS fitted to the seasonally adjusted series
ets_train <- ets(seasadj(decompose(ts_train, "multiplicative")))
accuracy(stlf_boxcox_train)
                    ME     RMSE      MAE        MPE     MAPE      MASE
Training set -0.646515 8.147517 5.855529 -0.3001347 2.885877 0.3669624
                   ACF1
Training set 0.02389584
accuracy(ets_train)
                     ME     RMSE      MAE        MPE     MAPE      MASE
Training set -0.2829213 8.551268 6.201346 -0.2055279 3.091856 0.3886063
                   ACF1
Training set 0.02876641

The STL + Box-Cox model performs slightly better than the ETS model on the seasonally adjusted data, with a lower training RMSE (8.15 vs 8.55). Both are well below the training RMSEs of the Holt-Winters models above, though a test-set comparison is needed before drawing conclusions about forecast accuracy.
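The exercise asks for a comparison on the test set; a minimal sketch reusing ts_test from Question 7.8 (only the STL model is shown, since the ETS forecasts would need to be re-seasonalized first):

# Forecast over the test horizon and compute the test-set RMSE
stlf_fc <- stlf(ts_train, h = length(ts_test), lambda = BoxCox.lambda(ts_train))
accuracy(stlf_fc, ts_test)["Test set", "RMSE"]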