Assignment 5

library(fpp2)
library(mlbench) 
library(corrplot)
library(ggplot2)
require(gridExtra)
library(car)
library(caret)
library(tidyverse)
library(DT)
library(plotly)

Question 7.1

Consider the pigs series — the number of pigs slaughtered in Victoria each month.

Use the ses() function in R to find the optimal values of α and ℓ0, and generate forecasts for the next four months.

Optimal values:

alpha = 0.2971
l = 77260.0561

summary(ses(pigs, 4))

## 
## Forecast method: Simple exponential smoothing
## 
## Model Information:
## Simple exponential smoothing 
## 
## Call:
##  ses(y = pigs, h = 4) 
## 
##   Smoothing parameters:
##     alpha = 0.2971 
## 
##   Initial states:
##     l = 77260.0561 
## 
##   sigma:  10308.58
## 
##      AIC     AICc      BIC 
## 4462.955 4463.086 4472.665 
## 
## Error measures:
##                    ME    RMSE      MAE       MPE     MAPE      MASE       ACF1
## Training set 385.8721 10253.6 7961.383 -0.922652 9.274016 0.7966249 0.01282239
## 
## Forecasts:
##          Point Forecast    Lo 80    Hi 80    Lo 95    Hi 95
## Sep 1995       98816.41 85605.43 112027.4 78611.97 119020.8
## Oct 1995       98816.41 85034.52 112598.3 77738.83 119894.0
## Nov 1995       98816.41 84486.34 113146.5 76900.46 120732.4
## Dec 1995       98816.41 83958.37 113674.4 76092.99 121539.8

autoplot(ses(pigs)) + autolayer(fitted(ses(pigs)), series = "Fitted")

Compute a 95% prediction interval for the first forecast using
\(\hat{y} +- 1.96s\) where s is the standard deviation of the residuals. Compare your interval with the interval produced by R.

print(paste("R Lower bound:", ses(pigs, 4)$lower[1, "95%"]))

## [1] "R Lower bound: 78611.9684577467"

print(paste("R Upper bound:", ses(pigs, 4)$upper[1, "95%"]))

## [1] "R Upper bound: 119020.843765435"

print(paste("Calculated Lower bound:", 98816.41  - (sd(ses(pigs, 4)$residuals) * 1.96)))

## [1] "Calculated Lower bound: 78679.9711418255"

print(paste("Calculated Upper bound:", 98816.41  + (sd(ses(pigs, 4)$residuals) * 1.96)))

## [1] "Calculated Upper bound: 118952.848858174"

This interval is slightly more narrow than the interval produced in R.

Question 7.5

Data set books contains the daily sales of paperback and hardcover books at the same store. The task is to forecast the next four days’ sales for paperback and hardcover books.

Plot the series and discuss the main features of the data.

Both paperback and hardcover book sales have an upwards trend
Paperback sales appear to cycle every 3-4 days
Hardcover sales appear to cycle every 10 days

autoplot(books, facets=TRUE)

Use the ses() function to forecast each series, and plot the forecasts.

books[,1]

## Time Series:
## Start = 1 
## End = 30 
## Frequency = 1 
##  [1] 199 172 111 209 161 119 195 195 131 183 143 141 168 201 155 243 225 167 237
## [20] 202 186 176 232 195 190 182 222 217 188 247

paperback <- books[,1]
hardcover <- books[,2]

grid.arrange(
autoplot(ses(paperback)) + autolayer(fitted(ses(paperback)), series = "Fitted Paperback"),
autoplot(ses(hardcover)) + autolayer(fitted(ses(hardcover)), series = "Fitted Hardcover")
)

Compute the RMSE values for the training data in each case.

print(paste("Paperback RMSE:",
    RMSE((ses(paperback))$x, (ses(paperback))$fitted)))

## [1] "Paperback RMSE: 33.637686782912"

print(paste("Hardcover RMSE:",
    RMSE((ses(hardcover))$x, (ses(hardcover))$fitted)))

## [1] "Hardcover RMSE: 31.9310149844547"

Question 7.6

We will continue with the daily sales of paperback and hardcover books in data set books.

Apply Holt’s linear method to the paperback and hardback series and compute four-day forecasts in each case.

holt(paperback, 4)

##    Point Forecast    Lo 80    Hi 80    Lo 95    Hi 95
## 31       209.4668 166.6035 252.3301 143.9130 275.0205
## 32       210.7177 167.8544 253.5811 145.1640 276.2715
## 33       211.9687 169.1054 254.8320 146.4149 277.5225
## 34       213.2197 170.3564 256.0830 147.6659 278.7735

holt(hardcover, 4)

##    Point Forecast    Lo 80    Hi 80    Lo 95    Hi 95
## 31       250.1739 212.7390 287.6087 192.9222 307.4256
## 32       253.4765 216.0416 290.9113 196.2248 310.7282
## 33       256.7791 219.3442 294.2140 199.5274 314.0308
## 34       260.0817 222.6468 297.5166 202.8300 317.3334

Compare the RMSE measures of Holt’s method for the two series to those of simple exponential smoothing in the previous question. (Remember that Holt’s method is using one more parameter than SES.) Discuss the merits of the two forecasting methods for these data sets.

Holt’s method has a lower RMSE for both paperback and hardcover book sales. This makes sense, since Holt’s method is better for data with a trend (and both paperback and hardcover sales had an upwards trend).

print(paste("Paperback RMSE:",
    RMSE((ses(paperback))$x, holt(paperback, 4)$fitted)))

## [1] "Paperback RMSE: 31.1369230162347"

print(paste("Hardcover RMSE:",
    RMSE((ses(hardcover))$x, holt(hardcover, 4)$fitted)))

## [1] "Hardcover RMSE: 27.1935779818511"

Compare the forecasts for the two series using both methods. Which do you think is best?

Holt’s method better captures the trend in the sales time series.

grid.arrange(
autoplot(ses(paperback)) + autolayer(fitted(ses(paperback)), series = "SES") + autolayer(fitted(holt(paperback)), series = "HOLT") + labs(title = "Paperback"),
autoplot(ses(hardcover)) + autolayer(fitted(ses(hardcover)), series = "SES") + autolayer(fitted(holt(hardcover)), series = "HOLT") + labs(title = "Hardcover")
)

Calculate a 95% prediction interval for the first forecast for each series, using the RMSE values and assuming normal errors. Compare your intervals with those produced using ses and holt.

The calculated intervals tend to be narrower than the confidence intervals produced using ses. Confidence intervals produced by holt are similar or wider than the calculated intervals.

ses_pb <- ses(paperback, 4)
holt_pb <- holt(paperback, 4)
ses_hc <- ses(hardcover, 4)
holt_hc <- holt(hardcover, 4)


dataset <- c("paperback", "paperback", "paperback", "paperback", "hardcover", "hardcover", "hardcover", "hardcover")
model <- c("ses", "ses", "holt", "holt", "ses", "ses", "holt", "holt")
calculation <- c("by model", "calculated", "by model", "calculated", "by model", "calculated", "by model", "calculated")
lower <- c(ses_pb$lower[1, "95%"], 
                     ses_pb$mean[1]  - (sd(ses_pb$residuals) * 1.96),
                     holt_pb$lower[1, "95%"], 
                     holt_pb$mean[1]  - (sd(ses_pb$residuals) * 1.96),
                     ses_hc$lower[1, "95%"], 
                     ses_hc$mean[1]  - (sd(ses_hc$residuals) * 1.96),
                     holt_hc$lower[1, "95%"] ,
                     holt_hc$mean[1]  - (sd(ses_hc$residuals) * 1.96)
                     )
upper <-  c(ses_pb$upper[1, "95%"], 
                     ses_pb$mean[1]  + (sd(ses_pb$residuals) * 1.96),
                     holt_pb$upper[1, "95%"], 
                     holt_pb$mean[1]  + (sd(ses_pb$residuals) * 1.96),
                     ses_hc$upper[1, "95%"], 
                     ses_hc$mean[1]  + (sd(ses_hc$residuals) * 1.96),
                     holt_hc$upper[1, "95%"], 
                     holt_hc$mean[1]  + (sd(ses_hc$residuals) * 1.96)
                     )


intervals <- data.frame(cbind(dataset, model, calculation, lower, upper))
datatable(intervals)

Question 7.7

For this exercise use data set eggs, the price of a dozen eggs in the United States from 1900–1993. Experiment with the various options in the holt() function to see how much the forecasts change with damped trend, or with a Box-Cox transformation. Try to develop an intuition of what each argument is doing to the forecasts.

[Hint: use h=100 when calling holt() so you can clearly see the differences between the various options when plotting the forecasts.]

Which model gives the best RMSE?

The Box-Cox model gives the best RMSE, and it accounts for the downward trend in the time series.

holt_orig <- holt(eggs, h = 100)
holt_bc <- holt(eggs, lambda = BoxCox.lambda(eggs), h = 100)
holt_damp <- holt(eggs, damped = TRUE, h = 100)
holt_damp_bc <- holt(eggs, damped = TRUE, lambda = BoxCox.lambda(eggs), h = 100)


grid.arrange(
  autoplot(holt_orig) + labs(title = paste("Original", ", RMSE =", round(RMSE(holt_orig$x, holt_orig$fitted),3))),
  autoplot(holt_bc) + labs(title = paste("Box-Cox", ", RMSE =", round(RMSE(holt_bc$x, holt_bc$fitted),3))),
  autoplot(holt_damp) + labs(title = paste("Dampened", ", RMSE =", round(RMSE(holt_damp$x, holt_damp$fitted),3))),
  autoplot(holt_damp_bc) + labs(title = paste("Dampened, Box-Cox", ", RMSE =", round(RMSE(holt_damp_bc$x, holt_damp_bc$fitted),3)))
)

Question 7.8

Recall your retail time series data (from Exercise 3 in Section 2.10).

temp = tempfile(fileext = ".xlsx")
dataURL <- "https://otexts.com/fpp2/extrafiles/retail.xlsx"
download.file(dataURL, destfile=temp, mode='wb')
retaildata <- readxl::read_excel(temp, skip=1)
myts <- ts(retaildata[,"A3349396W"], frequency=12, start=c(1982,4))

Why is multiplicative seasonality necessary for this series?

Multiplicative seasonality is necessary for this series because the seasonality variability increases as time passes.

Apply Holt-Winters’ multiplicative method to the data. Experiment with making the trend damped.

fit1 <- hw(myts, seasonal = "multiplicative")
fit2 <- hw(myts, seasonal = "multiplicative", damped = TRUE)

grid.arrange(
  autoplot(myts) + autolayer(fit1, series="HW multiplicative forecasts", PI=FALSE) + labs(title = paste("multiplicative", ", RMSE =", round(RMSE(fit1$x, fit1$fitted),3))),
  autoplot(myts) + autolayer(fit2, series="HW multiplicative damped forecasts", PI=FALSE) + labs(title = paste("multiplicative damped", ", RMSE =", round(RMSE(fit2$x, fit2$fitted),3)))
)

Compare the RMSE of the one-step forecasts from the two methods. Which do you prefer?

The multiplicative method has only a slightly lower RMSE than the damped multiplicative method. However, the damped method will ensure that the upwards trend does not result in over-forecasting.

Check that the residuals from the best method look like white noise.

There is autocorrelation in the residuals - they do not look like white noise. Both the models have autocorrelation, so neither are ideal.

checkresiduals(fit2)

## 
##  Ljung-Box test
## 
## data:  Residuals from Damped Holt-Winters' multiplicative method
## Q* = 580.81, df = 7, p-value < 2.2e-16
## 
## Model df: 17.   Total lags used: 24

Now find the test set RMSE, while training the model to the end of 2010. Can you beat the seasonal naïve approach from Exercise 8 in Section 3.7?

The test set RMSE with the multiplicative damped model RMSE of 206.193 is much less than the seasonal naïve approach RMSE of 982.853.

myts.train <- window(myts, end = c(2010, 12))
myts.test <- window(myts, start = 2011)

fit <- hw(myts.train, h = 12, seasonal = "multiplicative", damped = TRUE)
autoplot(fit) + labs(title = paste("multiplicative damped", ", RMSE =", round(RMSE(myts.test, fit$mean),3)))

accuracy(fit,myts.test)

##                    ME     RMSE      MAE       MPE      MAPE      MASE
## Training set 25.79479 218.5672 170.8683 0.2537970 1.8342407 0.2840676
## Test set     73.62231 206.1931 175.5720 0.3251777 0.8376698 0.2918874
##                    ACF1 Theil's U
## Training set -0.1987616        NA
## Test set      0.2175472 0.1122748

fc <- snaive(myts.train)
accuracy(fc,myts.test)

##                    ME     RMSE      MAE      MPE     MAPE     MASE      ACF1
## Training set 598.4838 699.2360 601.5060 5.961516 5.997664 1.000000 0.5972618
## Test set     882.2833 982.8526 882.2833 4.172002 4.172002 1.466791 0.5627543
##              Theil's U
## Training set        NA
## Test set     0.4906103

Question 7.9

For the same retail data, try an STL decomposition applied to the Box-Cox transformed series, followed by ETS on the seasonally adjusted data. How does that compare with your best previous forecasts on the test set?

This model likely resulted in overtraining, since the new RMSE is lower for the training set, but much higher for the test set.

fit3 <- forecast(stlf(myts.train, lambda = BoxCox.lambda(myts.train), h = 12), PI=TRUE)
autoplot(fit3)

accuracy(fit3, myts.test)

##                       ME     RMSE      MAE         MPE     MAPE      MASE
## Training set   -4.553829 189.7539 148.8406 -0.01430469 1.604131 0.2474466
## Test set     -465.233916 488.8266 465.2339 -2.23714672 2.237147 0.7734485
##                     ACF1 Theil's U
## Training set -0.11506738        NA
## Test set      0.03267501 0.2737434

Data 624 Homework 5: Exponential Smoothing

Mary Anna Kivenson

March 7, 2020