CUNY DATA624 Homework 5

7.1 Consider the pigs series — the number of pigs slaughtered in Victoria each month.
- a) Use the ses() function in R to find the optimal values of \(\alpha\) and \(\ell_0\), and generate forecasts for the next four months.
- b) Compute a 95% prediction interval for the first forecast using \(\hat{y}\) \(\pm\) \(1.96s\) where \(s\) is the standard deviation of the residuals. Compare your interval with the interval produced by R.
7.5 Data set books contains the daily sales of paperback and hardcover books at the same store. The task is to forecast the next four days’ sales for paperback and hardcover books.
7.6 We will continue with the daily sales of paperback and hardcover books in data set books
7.7 For this exercise use data set eggs, the price of a dozen eggs in the United States from 1900–1993. Experiment with the various options in the holt() function to see how much the forecasts change with damped trend, or with a Box-Cox transformation. Try to develop an intuition of what each argument is doing to the forecasts. Which model gives the best RMSE?
7.8 Recall your retail time series data (from Exercise 3 in Section 2.10).
7.9 For the same retail data, try an STL decomposition applied to the Box-Cox transformed series, followed by ETS on the seasonally adjusted data. How does that compare with your best previous forecasts on the test set?

library(fpp2)

7.1 Consider the `pigs` series — the number of pigs slaughtered in Victoria each month.

a) Use the `ses()` function in R to find the optimal values of \(\alpha\) and \(\ell_0\), and generate forecasts for the next four months.

From the output below, the optimal values are \(\alpha = 0.2971\) and \(\ell_0 = 77260.0561\).

fc_pigs <- ses(pigs, h=4) # for the next 4 months
summary(fc_pigs)

## 
## Forecast method: Simple exponential smoothing
## 
## Model Information:
## Simple exponential smoothing 
## 
## Call:
##  ses(y = pigs, h = 4) 
## 
##   Smoothing parameters:
##     alpha = 0.2971 
## 
##   Initial states:
##     l = 77260.0561 
## 
##   sigma:  10308.58
## 
##      AIC     AICc      BIC 
## 4462.955 4463.086 4472.665 
## 
## Error measures:
##                    ME    RMSE      MAE       MPE     MAPE      MASE       ACF1
## Training set 385.8721 10253.6 7961.383 -0.922652 9.274016 0.7966249 0.01282239
## 
## Forecasts:
##          Point Forecast    Lo 80    Hi 80    Lo 95    Hi 95
## Sep 1995       98816.41 85605.43 112027.4 78611.97 119020.8
## Oct 1995       98816.41 85034.52 112598.3 77738.83 119894.0
## Nov 1995       98816.41 84486.34 113146.5 76900.46 120732.4
## Dec 1995       98816.41 83958.37 113674.4 76092.99 121539.8
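For intuition about what the two fitted quantities control, the SES recursion \(\ell_t = \alpha y_t + (1-\alpha)\ell_{t-1}\) can be written out in a few lines of base R. This is a minimal sketch, not the ses() implementation: `ses_manual` is a hypothetical helper, the toy series is made up, and \(\alpha\) and \(\ell_0\) are supplied by hand rather than optimised.

```r
# Hypothetical helper: run the SES level recursion for a given alpha and l0.
ses_manual <- function(y, alpha, l0) {
  level <- numeric(length(y))
  prev <- l0
  for (t in seq_along(y)) {
    # New level: weighted average of the new observation and the old level
    level[t] <- alpha * y[t] + (1 - alpha) * prev
    prev <- level[t]
  }
  # One-step forecasts are simply the previous level
  fitted <- c(l0, level[-length(y)])
  list(level = level, fitted = fitted, forecast = level[length(y)])
}

y_toy <- c(10, 12, 11, 13)  # made-up data
fit <- ses_manual(y_toy, alpha = 0.5, l0 = 10)
fit$level     # 10 11 11 12
fit$forecast  # 12, the same flat forecast for every horizon
```

A larger \(\alpha\) makes the level track recent observations more closely, while \(\ell_0\) only matters for the first few fitted values.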

b) Compute a 95% prediction interval for the first forecast using \(\hat{y}\) \(\pm\) \(1.96s\) where \(s\) is the standard deviation of the residuals. Compare your interval with the interval produced by R.

# "Manual" calculation
sdres_fc_pigs <- sqrt(var(residuals(fc_pigs)))
fc1 <- fc_pigs$mean[1]
m_low <- fc1 - (1.96*sdres_fc_pigs)
m_hi <- fc1 + (1.96*sdres_fc_pigs)

# From the ses() function
r_low <- fc_pigs$lower[1,2]
r_hi <- fc_pigs$upper[1,2]
pigses_lo <- c(m_low, r_low)
pigses_hi <- c(m_hi, r_hi)

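The table below compares the manually calculated prediction interval (labelled “Base R”) with the interval from the ses() function. They are slightly different, which Hyndman attributes to how the variance is calculated. The manual version uses sd(), which centres the residuals and divides by n − 1, whereas ses() uses the model sigma (10308.58 in the output above), which appears to divide the sum of squared residuals by n minus the number of estimated parameters (two here, \(\alpha\) and \(\ell_0\)). R also uses the exact normal quantile, about 1.959964, rather than 1.96. The larger sigma gives the slightly wider interval.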

	Lower	Upper
Base R	78679.97	118952.8
SES	78611.97	119020.8
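The two intervals in the table can be reproduced from the printed summary alone. The sketch below is a check under stated assumptions, not a description of the package internals: it assumes the `pigs` series has 188 monthly observations (January 1980 to August 1995) and that the model sigma uses n − 2 degrees of freedom. All inputs are copied from the output above, so small rounding differences are expected.

```r
# Values copied from the summary(fc_pigs) output above
n     <- 188        # assumed number of observations in pigs
rmse  <- 10253.6    # training-set RMSE
me    <- 385.8721   # training-set mean error
point <- 98816.41   # first point forecast

sse <- n * rmse^2                              # sum of squared residuals

# sd(residuals): centre on the mean error, divide by n - 1
s_manual <- sqrt((sse - n * me^2) / (n - 1))

# Model sigma, assuming 2 estimated parameters (alpha and l0)
s_model <- sqrt(sse / (n - 2))

manual_pi <- point + c(-1, 1) * 1.96 * s_manual
model_pi  <- point + c(-1, 1) * qnorm(0.975) * s_model

round(s_model, 2)
round(rbind(manual = manual_pi, model = model_pi), 1)
```

If the assumptions hold, `s_model` should land close to the reported sigma of 10308.58, and the two rows should match the table to within rounding.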

7.5 Data set `books` contains the daily sales of paperback and hardcover books at the same store. The task is to forecast the next four days’ sales for paperback and hardcover books.

a) Plot the series and discuss the main features of the data.

h_cov <- autoplot(books[,1], series = "Paperback") +
  autolayer(books[,2], series = "Hardcover") +
  xlab("Day") + ylab("Sales") +
  ggtitle("Hardcover Sales") +
  scale_colour_manual(values=c("blue","gray"),
                      breaks=c("Hardcover",
                               "Paperback"))

p_cov <- autoplot(books[,2], series = "Hardcover")  +
  autolayer(books[,1], series = "Paperback") +
  xlab("Day") + ylab("Sales") +
  ggtitle("Paperback Sales") +
  scale_colour_manual(values=c("gray","red"),
                      breaks=c("Hardcover",
                               "Paperback"))

h_cov

p_cov

Plotted together with the default autoplot() function, the graph looks a bit busy, so I highlighted each series in its own plot. Both hardcover and paperback sales have a clear upward trend. There doesn’t appear to be any seasonality in either series. It is tempting to say that paperback sales show a seasonal pattern, with a marked rise or fall every 2-3 days, but with only 30 days of data it is difficult to say so definitively.

b) Use the `ses()` function to forecast each series, and plot the forecasts.

p_fc <- ses(books[,1], h=4) # paperback, next four days
h_fc <- ses(books[,2], h=4) # hardcover, next four days

autoplot(books) +
  autolayer(p_fc, series="Paperback", PI=FALSE) +
  autolayer(h_fc, series="Hardcover", PI=FALSE)

c) Compute the RMSE values for the training data in each case.

The training-set RMSE of each ses() fit from the previous part can be extracted with accuracy().

paste("Paperback training -- RMSE:", accuracy(p_fc)[2])

## [1] "Paperback training -- RMSE: 33.637686782912"

paste("Hardcover training -- RMSE:", accuracy(h_fc)[2])

## [1] "Hardcover training -- RMSE: 31.9310149844547"

7.6 We will continue with the daily sales of paperback and hardcover books in data set `books`

a) Apply Holt’s linear method to the `paperback` and `hardback` series and compute four-day forecasts in each case.

ph_fc <- holt(books[,1], h=4)
hh_fc <- holt(books[,2], h=4)

autoplot(books) +
  autolayer(ph_fc, series="Paperback", PI=FALSE) +
  autolayer(hh_fc, series="Hardcover", PI=FALSE)

b) Compare the RMSE measures of Holt’s method for the two series to those of simple exponential smoothing in the previous question. (Remember that Holt’s method is using one more parameter than SES.) Discuss the merits of the two forecasting methods for these data sets.

ses_RMSE <- c(accuracy(p_fc)[2], accuracy(h_fc)[2])
holt_RMSE <- c(accuracy(ph_fc)[2], accuracy(hh_fc)[2])
compare_rmse <- data.frame(ses_RMSE, holt_RMSE)
row.names(compare_rmse) <- c("Paperback", "Hardcover")
knitr::kable(t(compare_rmse))

	Paperback	Hardcover
ses_RMSE	33.63769	31.93101
holt_RMSE	31.13692	27.19358

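Holt’s method gives a lower training RMSE than SES for both series (31.14 versus 33.64 for paperback, 27.19 versus 31.93 for hardcover). Some reduction is expected simply because Holt’s method has an extra parameter, but both series are trended, which SES cannot capture, so Holt’s method is the more suitable choice here.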

c) Compare the forecasts for the two series using both methods. Which do you think is best?

SES produces flat forecasts at the last estimated level, whereas Holt’s forecasts continue the upward trend seen in both series. For that reason, and given the lower RMSE in the previous part, I think Holt’s method is the better choice.
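To see why a trend term changes the forecasts, Holt’s recursion can be sketched in base R. This is a minimal illustration rather than the holt() implementation: `holt_manual` is a hypothetical helper, the parameters are fixed by hand instead of being optimised, and the series is made up. Setting `phi` below 1 gives the damped variant.

```r
# Hypothetical helper: Holt's linear trend recursion with optional damping.
# phi = 1 means no damping.
holt_manual <- function(y, alpha, beta, l0, b0, h = 4, phi = 1) {
  level <- l0
  trend <- b0
  for (t in seq_along(y)) {
    prev_level <- level
    # Level: blend the new observation with the previous one-step forecast
    level <- alpha * y[t] + (1 - alpha) * (prev_level + phi * trend)
    # Trend: blend the latest change in level with the previous (damped) trend
    trend <- beta * (level - prev_level) + (1 - beta) * phi * trend
  }
  # h-step forecasts: last level plus the (damped) sum of trend increments
  level + cumsum(phi^(1:h)) * trend
}

y_toy <- 2 * (1:10)  # made-up, perfectly linear series
fc_plain  <- holt_manual(y_toy, alpha = 0.5, beta = 0.3, l0 = 0, b0 = 2)
fc_damped <- holt_manual(y_toy, alpha = 0.5, beta = 0.3, l0 = 0, b0 = 2, phi = 0.9)

fc_plain         # 22 24 26 28: the trend is extrapolated unchanged
diff(fc_damped)  # increments shrink as the horizon grows
```

With `phi = 1` the forecasts rise by the same amount each step, while the damped forecasts flatten out, which is the behaviour the `damped` argument of holt() controls.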

d) Calculate a 95% prediction interval for the first forecast for each series, using the RMSE values and assuming normal errors. Compare your intervals with those produced using `ses` and `holt`

# paperback ses
s <- sqrt(p_fc$model$mse)
Lower <- c(p_fc$mean[1] - 1.96*s, p_fc$lower[1,2])
Upper <- c(p_fc$mean[1] + 1.96*s, p_fc$upper[1,2])
p_ses_pi <- data.frame(Lower, Upper)
row.names(p_ses_pi) <- c("Base R", "SES")
knitr::kable(p_ses_pi)

	Lower	Upper
Base R	141.1798	273.0395
SES	138.8670	275.3523

# hardcover ses
s <- sqrt(h_fc$model$mse)
Lower <- c(h_fc$mean[1] - 1.96*s, h_fc$lower[1,2])
Upper <- c(h_fc$mean[1] + 1.96*s, h_fc$upper[1,2])
h_ses_pi <- data.frame(Lower, Upper)
row.names(h_ses_pi) <- c("Base R", "SES")
knitr::kable(h_ses_pi)

	Lower	Upper
Base R	176.9753	302.1449
SES	174.7799	304.3403

# paperback holt
s <- sqrt(ph_fc$model$mse)
Lower <- c(ph_fc$mean[1] - 1.96*s, ph_fc$lower[1,2])
Upper <- c(ph_fc$mean[1] + 1.96*s, ph_fc$upper[1,2])
ph_ses_pi <- data.frame(Lower, Upper)
row.names(ph_ses_pi) <- c("Base R", "Holt")
knitr::kable(ph_ses_pi)

	Lower	Upper
Base R	148.4384	270.4951
Holt	143.9130	275.0205

# hardcover holt
s <- sqrt(hh_fc$model$mse)
Lower <- c(hh_fc$mean[1] - 1.96*s, hh_fc$lower[1,2])
Upper <- c(hh_fc$mean[1] + 1.96*s, hh_fc$upper[1,2])
hh_ses_pi <- data.frame(Lower, Upper)
row.names(hh_ses_pi) <- c("Base R", "Holt")
knitr::kable(hh_ses_pi)

	Lower	Upper
Base R	196.8745	303.4733
Holt	192.9222	307.4256

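As in 7.1, the intervals computed from the RMSE are slightly narrower than those from ses() and holt(). The RMSE divides the sum of squared residuals by n, while the model sigma appears to divide by n minus the number of estimated parameters (two for SES, four for Holt), so with only 30 observations the gap is more noticeable, particularly for Holt.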
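The size of the gap can be checked against the paperback tables above. The working below is a sketch under assumptions inferred from the output rather than from the package source: \(\hat{\sigma} = \sqrt{SSE/(n-k)}\) with \(n = 30\), \(k = 2\) for SES and \(k = 4\) for Holt. The point forecasts (207.110 and 209.467) are taken as the midpoints of the tabulated intervals.

```latex
\[
\hat{\sigma} = \text{RMSE} \times \sqrt{\frac{n}{n-k}}
\]
\[
\text{SES: } \hat{\sigma} \approx 33.638 \sqrt{30/28} \approx 34.818,
\qquad 207.110 \pm 1.96 \times 34.818 \approx (138.87,\ 275.35)
\]
\[
\text{Holt: } \hat{\sigma} \approx 31.137 \sqrt{30/26} \approx 33.446,
\qquad 209.467 \pm 1.96 \times 33.446 \approx (143.91,\ 275.02)
\]
```

Both results agree with the SES and Holt rows of the paperback tables, which supports the degrees-of-freedom explanation.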

7.7 For this exercise use data set `eggs`, the price of a dozen eggs in the United States from 1900–1993. Experiment with the various options in the `holt()` function to see how much the forecasts change with damped trend, or with a Box-Cox transformation. Try to develop an intuition of what each argument is doing to the forecasts. Which model gives the best RMSE?

autoplot(eggs)

From a quick look at the data, there’s a clear downward trend in egg prices, with a significant dip around the Great Depression. There is no obvious seasonality or cyclic behavior.

Experimenting with the holt() arguments by trial and error (not a recommended approach), the lowest RMSE I found came from setting lambda to 0.2:

paste("RMSE value:",accuracy(holt(eggs, h=100, lambda=0.2))[2])

## [1] "RMSE value: 26.3634693983866"

However, as suggested, we can also apply a Box-Cox transformation to the time series, with lambda chosen by BoxCox.lambda(), which gives the plot below.

lambda <- BoxCox.lambda(eggs)
autoplot(BoxCox(eggs,lambda))

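When we calculate the RMSE for this, the reported value is far lower, but it is not comparable with the value above. The series was transformed before being passed to holt(), so the errors are measured on the Box-Cox scale rather than in the original units, and lambda="auto" then applies a second transformation to the already-transformed series. A like-for-like comparison would pass the original eggs series to holt() with the lambda argument, so that the fitted values are back-transformed before the RMSE is computed.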

paste("RMSE value:",accuracy(holt(BoxCox(eggs,lambda), 
                                  h=100, lambda="auto")
                             )[2])

## [1] "RMSE value: 1.0320000444973"

eggs_hfc <- holt(BoxCox(eggs,lambda), h=100, lambda="auto")
autoplot(BoxCox(eggs,lambda)) +
  autolayer(eggs_hfc, series="Holt Forecast")
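The effect of the measurement scale on RMSE can be shown without any forecasting package. The sketch below uses a made-up declining series and naive one-step forecasts; `box_cox`, `inv_box_cox` and `rmse` are hypothetical helpers written for this illustration, and the lambda of 0.2 is chosen arbitrarily.

```r
# Box-Cox transform and its inverse for lambda != 0 (hypothetical helper names)
box_cox     <- function(y, lambda) (y^lambda - 1) / lambda
inv_box_cox <- function(w, lambda) (lambda * w + 1)^(1 / lambda)
rmse        <- function(e) sqrt(mean(e^2))

set.seed(1)
y <- 300 * exp(-0.01 * (1:94)) + rnorm(94, sd = 10)  # made-up positive series
w <- box_cox(y, lambda = 0.2)

# Naive one-step forecasts (previous value), scored on each scale
rmse_original    <- rmse(y[-1] - y[-length(y)])
rmse_transformed <- rmse(w[-1] - w[-length(w)])

# Back-transforming the transformed-scale forecasts before scoring
rmse_back <- rmse(y[-1] - inv_box_cox(w[-length(w)], lambda = 0.2))

c(original = rmse_original,
  transformed = rmse_transformed,
  back_transformed = rmse_back)
```

The forecasts are identical in all three cases, yet the transformed-scale RMSE is much smaller simply because the units are smaller. Only the original and back-transformed values are comparable with each other.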

7.8 Recall your retail time series data (from Exercise 3 in Section 2.10).

retaildata <- readxl::read_excel("retail.xlsx", skip=1)
myts <- ts(retaildata[,"A3349398A"],
  frequency=12, start=c(1982,4))
autoplot(myts)

a) Why is multiplicative seasonality necessary for this series?

Because the size of the seasonal swings grows with the level of the series over time, a multiplicative approach is the better fit.

b) Apply Holt-Winters’ multiplicative method to the data. Experiment with making the trend damped.

Comparing the damped and non-damped fits below, the non-damped trend fits slightly better (training RMSE 29.43 versus 29.63), though the difference is small.

autoplot(hw(myts, seasonal='multiplicative', damped = TRUE))

accuracy(hw(myts, seasonal='multiplicative', damped = TRUE))

##                    ME     RMSE      MAE       MPE     MAPE     MASE       ACF1
## Training set 4.244765 29.63087 22.26018 0.2959731 1.785469 0.292681 -0.2184672

autoplot(hw(myts, seasonal='multiplicative', damped = FALSE))

accuracy(hw(myts, seasonal='multiplicative', damped = FALSE))

##                    ME     RMSE      MAE       MPE     MAPE      MASE       ACF1
## Training set 1.496135 29.43051 22.25676 0.1603693 1.799731 0.2926361 -0.0300701

c) Compare the RMSE of the one-step forecasts from the two methods. Which do you prefer?

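As shown in the previous part, the training-set RMSE, which is the RMSE of the one-step forecasts, is 29.43 for the non-damped method and 29.63 for the damped method. The difference is marginal, but I prefer the non-damped method on this basis.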

d) Check that the residuals from the best method look like white noise.

hw_myts <- hw(myts, seasonal='multiplicative', damped = FALSE)
ggAcf(residuals(hw_myts))

According to the ACF plot above, the residuals do not look like white noise.
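A visual reading of the ACF can be backed up with a Ljung-Box test, which is available in base R as Box.test(). The sketch below uses two made-up series rather than the retail residuals, and the lag of 24 (two seasonal periods for monthly data) is an assumption; the forecast package’s checkresiduals() function offers a similar check for fitted models.

```r
set.seed(42)
white_noise <- rnorm(300)                               # made-up uncorrelated series
ar1 <- as.numeric(arima.sim(list(ar = 0.7), n = 300))   # made-up autocorrelated series

# Ljung-Box test: a small p-value suggests the series is not white noise
p_wn  <- Box.test(white_noise, lag = 24, type = "Ljung-Box")$p.value
p_ar1 <- Box.test(ar1, lag = 24, type = "Ljung-Box")$p.value

c(white_noise = p_wn, ar1 = p_ar1)
```

Applied to residuals(hw_myts), a small p-value would agree with the conclusion drawn from the ACF plot.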

e) Now find the test set RMSE, while training the model to the end of 2010. Can you beat the seasonal naïve approach from Exercise 8 in Section 3.7?

As shown below, the Holt-Winters’ multiplicative method comfortably beats the seasonal naïve approach on the test set, with an RMSE of 49.10 against 127.93.

myts.train <- window(myts, end=c(2010,12))
myts.test <- window(myts, start=2011)
fc <- snaive(myts.train)
accuracy(fc,myts.test)

##                     ME      RMSE       MAE      MPE     MAPE     MASE      ACF1
## Training set  73.94114  88.31208  75.13514 6.068915 6.134838 1.000000 0.6312891
## Test set     115.00000 127.92727 115.00000 4.459712 4.459712 1.530576 0.2653013
##              Theil's U
## Training set        NA
## Test set     0.7267171

myts %>% window(end=c(2010,12)) %>%
  hw(seasonal='multiplicative', damped=FALSE) %>%
  accuracy(x=myts)

##                      ME     RMSE      MAE        MPE     MAPE      MASE
## Training set   2.594132 29.87520 22.43516  0.2383033 1.864671 0.2985975
## Test set     -29.588194 49.09806 38.39804 -1.1062616 1.458304 0.5110531
##                     ACF1 Theil's U
## Training set -0.03687893        NA
## Test set      0.09931874  0.267754

7.9 For the same retail data, try an STL decomposition applied to the Box-Cox transformed series, followed by ETS on the seasonally adjusted data. How does that compare with your best previous forecasts on the test set?

First, let’s apply the Box-Cox transformation to our retail data.

lambda <- BoxCox.lambda(myts)
autoplot(BoxCox(myts,lambda))

Now we’ll apply an STL decomposition to the transformed data.

myts <- ts(
  data=retaildata$A3349398A,
  frequency=12, 
  start=c(1982,4)
)

BC_myts <- BoxCox(myts,lambda)
stl(BC_myts, t.window=13, s.window="periodic", robust=TRUE) %>%
  autoplot()

From here, we now seasonally adjust the data and apply the ETS method.

# The piped series is already stl()'s first argument, so it is not passed again
ets_mytso <- BC_myts %>% window(end=c(2010,12)) %>%
  stl(t.window=13, s.window="periodic", robust=TRUE) %>%
  seasadj() 

forecast(ets(ets_mytso, model='ZZZ'), h=12) %>%
  accuracy(x=BC_myts)

##                         ME       RMSE        MAE          MPE      MAPE
## Training set -0.0000370996 0.05420367 0.04224946  0.002387178 0.3840713
## Test set     -0.0790652742 0.16610888 0.14222642 -0.611307316 1.0740427
##                   MASE        ACF1 Theil's U
## Training set 0.2807380 -0.14673102        NA
## Test set     0.9450621  0.01879428  1.005492

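The test-set RMSE here is 0.16610888, against 49.09806 for the Holt-Winters’ multiplicative method, but the two are not directly comparable. This RMSE is measured on the Box-Cox scale, and the seasonally adjusted forecasts are scored against the transformed series with its seasonality still present. The relative measures do not favour this approach either (test-set MASE 0.945 versus 0.511, Theil’s U 1.005 versus 0.268), although they too are computed on different scales. A like-for-like comparison would need the forecasts re-seasonalised and back-transformed to the original units before computing the RMSE, so on this evidence the Holt-Winters’ forecasts remain the best so far.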
