Consider the pigs series — the number of pigs slaughtered in Victoria each month.
Use the ses() function in R to find the optimal values of \(\alpha\) and \(\ell_0\), and generate forecasts for the next four months.
pigs_ses <- ses(pigs, h = 4)
summary(pigs_ses)
Forecast method: Simple exponential smoothing
Model Information:
Simple exponential smoothing
Call:
ses(y = pigs, h = 4)
Smoothing parameters:
alpha = 0.2971
Initial states:
l = 77260.0561
sigma: 10308.58
AIC AICc BIC
4462.955 4463.086 4472.665
Error measures:
ME RMSE MAE MPE MAPE MASE ACF1
Training set 385.8721 10253.6 7961.383 -0.922652 9.274016 0.7966249 0.01282239
Forecasts:
Point Forecast Lo 80 Hi 80 Lo 95 Hi 95
Sep 1995 98816.41 85605.43 112027.4 78611.97 119020.8
Oct 1995 98816.41 85034.52 112598.3 77738.83 119894.0
Nov 1995 98816.41 84486.34 113146.5 76900.46 120732.4
Dec 1995 98816.41 83958.37 113674.4 76092.99 121539.8
The ses() function estimates \(\alpha = 0.2971\) and \(\ell_0 = 77260.06\). The book states that when \(\alpha\) is small (close to 0), more weight is given to observations from the distant past; when \(\alpha\) is large, more weight is given to more recent observations. In this case the relatively small \(\alpha\) means the forecasts lean more heavily on distant observations.
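To make the weighting concrete, here is a quick sketch of the geometrically decaying weights \(\alpha(1-\alpha)^j\) that SES attaches to past observations, using the fitted \(\alpha\):
# Weights on the j most recent observations under SES: alpha * (1 - alpha)^j
alpha <- 0.2971
round(alpha * (1 - alpha)^(0:5), 4)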
Compute a 95% prediction interval for the first forecast using \(\hat{y} \pm\) 1.96 \(s\) where \(s\) is the standard deviation of the residuals.
# Get the first forecast
y_hat <- pigs_ses$mean[1]
# Get the standard deviation of the residuals
s <- sd(pigs_ses$residuals)
# 95% prediction interval assuming normally distributed residuals
lower_ci <- y_hat - 1.96 * s
upper_ci <- y_hat + 1.96 * s
The 95% prediction interval computed by hand runs from 78680 to 118953. The interval produced by R (78612 to 119021) is slightly wider, because ses() uses a slightly larger estimate of the residual standard deviation (sigma = 10308.58 in the summary above).
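A quick check against the interval reported by ses(), pulled from the lower and upper components of the forecast object:
cat("Manual interval:", lower_ci, "to", upper_ci, "\n")
cat("ses() interval: ", pigs_ses$lower[1, "95%"], "to", pigs_ses$upper[1, "95%"], "\n")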
Data set books contains the daily sales of paperback and hardcover books at the same store. The task is to forecast the next four days’ sales for paperback and hardcover books.
Plot the series and discuss the main features of the data.
autoplot(books) +
ggtitle("Daily Sale of Books")+
defaulttheme
Both the paperback and hardcover series show an upward trend over time, and they appear to fluctuate up and down at roughly the same times. With only 30 days of data, however, it is hard to tell whether those fluctuations reflect genuine seasonality or just noise.
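One quick (and rough) way to check for a regular pattern in such a short series is the sample ACF; a sketch for the paperback series, reusing the document's defaulttheme object:
# Look for any regular (e.g., weekly) pattern in the paperback series
ggAcf(books[, 1]) + defaulttheme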
Use the ses() function to forecast each series, and plot the forecasts.
ses_paper <- ses(books[, 1], h = 4)
autoplot(ses_paper) +
ylab("Paperback Sales")+
defaulttheme
ses_hard <- ses(books[, 2], h = 4)
autoplot(ses_hard) +
ylab("Hardback Sales")+
defaulttheme
Compute the RMSE values for the training data in each case.
accuracy(ses_paper)
ME RMSE MAE MPE MAPE MASE ACF1
Training set 7.175981 33.63769 27.8431 0.4736071 15.57784 0.7021303 -0.2117522
accuracy(ses_hard)
ME RMSE MAE MPE MAPE MASE ACF1
Training set 9.166735 31.93101 26.77319 2.636189 13.39487 0.7987887 -0.1417763
The RMSE is 33.64 for the paperback model and 31.93 for the hardcover model, so the SES predictions are slightly better for hardcovers than for paperbacks. Both RMSEs are still quite large relative to the number of books sold each day.
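To put that in context, one rough yardstick is the RMSE as a fraction of average daily sales (a sketch, not a formal measure):
# RMSE relative to mean daily sales for each series
accuracy(ses_paper)[2] / mean(books[, 1])
accuracy(ses_hard)[2] / mean(books[, 2])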
We will continue with the daily sales of paperback and hardcover books in data set books.
Apply Holt’s linear method to the paperback and hardback series and compute four-day forecasts in each case.
holt_paper <- holt(books[, 1], h = 4)
summary(holt_paper)
Forecast method: Holt's method
Model Information:
Holt's method
Call:
holt(y = books[, 1], h = 4)
Smoothing parameters:
alpha = 0.0001
beta = 0.0001
Initial states:
l = 170.699
b = 1.2621
sigma: 33.4464
AIC AICc BIC
318.3396 320.8396 325.3456
Error measures:
ME RMSE MAE MPE MAPE MASE
Training set -3.717178 31.13692 26.18083 -5.508526 15.58354 0.6602122
ACF1
Training set -0.1750792
Forecasts:
Point Forecast Lo 80 Hi 80 Lo 95 Hi 95
31 209.4668 166.6035 252.3301 143.9130 275.0205
32 210.7177 167.8544 253.5811 145.1640 276.2715
33 211.9687 169.1054 254.8320 146.4149 277.5225
34 213.2197 170.3564 256.0830 147.6659 278.7735
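For completeness, the paperback forecasts can be plotted in the same way as the hardback forecasts below:
autoplot(holt_paper) +
ylab("Paperback Sales")+
defaulttheme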
holt_hard <- holt(books[, 2], h = 4)
autoplot(holt_hard) +
ylab("Hardback Sales")+
defaulttheme
summary(holt_hard)
Forecast method: Holt's method
Model Information:
Holt's method
Call:
holt(y = books[, 2], h = 4)
Smoothing parameters:
alpha = 0.0001
beta = 0.0001
Initial states:
l = 147.7935
b = 3.303
sigma: 29.2106
AIC AICc BIC
310.2148 312.7148 317.2208
Error measures:
ME RMSE MAE MPE MAPE MASE
Training set -0.1357882 27.19358 23.15557 -2.114792 12.1626 0.6908555
ACF1
Training set -0.03245186
Forecasts:
Point Forecast Lo 80 Hi 80 Lo 95 Hi 95
31 250.1739 212.7390 287.6087 192.9222 307.4256
32 253.4765 216.0416 290.9113 196.2248 310.7282
33 256.7791 219.3442 294.2140 199.5274 314.0308
34 260.0817 222.6468 297.5166 202.8300 317.3334
With Holt’s method we can see the trend in the data, unlike with SES. The model still has no seasonal component, but it does a better job of following the direction of the series.
Compare the RMSE measures of Holt’s method for the two series to those of simple exponential smoothing in the previous question. (Remember that Holt’s method is using one more parameter than SES.) Discuss the merits of the two forecasting methods for these data sets.
Holt’s method produces better predictions than the SES models, most likely because it includes a trend component. As the table below shows, the RMSE is smaller for both the paperback and hardcover series, which matches the comparison of the two forecast plots. Since Holt’s method uses one more parameter than SES, the AICc check after the table gives a fairer comparison.
| Type | SES | Holt’s Method |
|---|---|---|
| Paperback | 33.63769 | 31.13692 |
| Hardback | 31.93101 | 27.19358 |
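Because AICc penalizes the extra trend parameter, it offers a fairer comparison than training RMSE alone; a quick sketch pulling AICc from the fitted model objects above:
# AICc for each fitted model (lower is better); the extra Holt parameter is penalized
c(SES = ses_paper$model$aicc, Holt = holt_paper$model$aicc)
c(SES = ses_hard$model$aicc, Holt = holt_hard$model$aicc)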
Compare the forecasts for the two series using both methods. Which do you think is best?
p1 <- autoplot(ses_paper) + ylab("Paperback Sales") + ggtitle("SES Forecast")+defaulttheme
p2 <- autoplot(holt_paper) + ylab("Paperback Sales") + ggtitle("Holt's Forecast")+defaulttheme
grid.arrange(p1, p2, ncol = 1)
p1 <- autoplot(ses_hard) + ylab("Hardback Sales") + ggtitle("SES Forecast")+defaulttheme
p2 <- autoplot(holt_hard) + ylab("Hardback Sales") + ggtitle("Holt's Forecast")+defaulttheme
grid.arrange(p1, p2, ncol = 1)
As stated previously, Holt’s method does a better job of predicting both series because it includes the trend component.
Calculate a 95% prediction interval for the first forecast for each series, using the RMSE values and assuming normal errors. Compare your intervals with those produced using ses and holt.
cat("Paperback:", holt_paper$mean[1] - 1.96 * accuracy(holt_paper)[2], "to", holt_paper$mean[1] + 1.96 * accuracy(holt_paper)[2])
Paperback: 148.4384 to 270.4951
cat("Hardback:", holt_hard$mean[1] - 1.96 * accuracy(holt_hard)[2], "to", holt_hard$mean[1] + 1.96 * accuracy(holt_hard)[2])
Hardback: 196.8745 to 303.4733
The 95% prediction interval for the paperback series computed from the RMSE is 148 to 270, slightly narrower than the 143.91 to 275.02 interval produced by holt(). Likewise, the RMSE-based interval for hardbacks is 197 to 303, narrower than holt()’s 192.92 to 307.43.
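For reference, the intervals reported by the forecast objects themselves can be printed directly:
cat("Paperback (holt):", holt_paper$lower[1, "95%"], "to", holt_paper$upper[1, "95%"], "\n")
cat("Hardback (holt): ", holt_hard$lower[1, "95%"], "to", holt_hard$upper[1, "95%"], "\n")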
For this exercise use data set eggs, the price of a dozen eggs in the United States from 1900–1993. Experiment with the various options in the holt() function to see how much the forecasts change with damped trend, or with a Box-Cox transformation. Try to develop an intuition of what each argument is doing to the forecasts.
[Hint: use h=100 when calling holt() so you can clearly see the differences between the various options when plotting the forecasts.]
Which model gives the best RMSE?
For this exercise we experiment with several options of the holt() function. A few arguments let us change the model; we keep h at 100 throughout and vary damped, exponential, and lambda (and various combinations of them) to see which gives the best prediction.
h <- 100
holt_eggs <- holt(eggs, h=h)
lambda_eggs <- holt(eggs, h=h, lambda=BoxCox.lambda(eggs))
damped_eggs <- holt(eggs, h=h, damped=TRUE)
damped_lambda_eggs <- holt(eggs, h=h, damped=TRUE, lambda=BoxCox.lambda(eggs))
exponential_eggs <- holt(eggs, h=h, exponential=TRUE)
exponential_damped_eggs<-holt(eggs, h=h, exponential=TRUE, damped = TRUE)
p1<-autoplot(holt_eggs) + ggtitle("Holt's Method")+defaulttheme
p2<-autoplot(damped_eggs) + ggtitle("Damped")+defaulttheme
p3<-autoplot(lambda_eggs) + ggtitle("Box-Cox")+defaulttheme
p4<-autoplot(damped_lambda_eggs) + ggtitle("Damped & Box-Cox")+defaulttheme
p5<-autoplot(exponential_eggs) + ggtitle("Exponential")+defaulttheme
p6<-autoplot(exponential_damped_eggs) + ggtitle("Exponential & Damped")+defaulttheme
gridExtra::grid.arrange(p1,p2,p3,p4,p5, p6, ncol = 2)
# Helper: extract the training-set RMSE (second element of accuracy())
rmse_extract <- function(model){
accuracy(model)[2]
}
data.frame(model = c("Holt's Linear",
                     "Box-Cox Transformed",
                     "Damped",
                     "Damped and Box-Cox",
                     "Exponential",
                     "Exponential and Damped"),
RMSE= c(rmse_extract(holt_eggs),
rmse_extract(lambda_eggs),
rmse_extract(damped_eggs),
rmse_extract(damped_lambda_eggs),
rmse_extract(exponential_eggs),
rmse_extract(exponential_damped_eggs))
) %>% kable() %>%
kable_styling()
| model | RMSE |
|---|---|
| Holt’s Linear | 26.58219 |
| Box-Cox Transformed | 26.55504 |
| Damped | 26.54019 |
| Damped and Box-Cox | 26.73445 |
| Exponential | 26.49795 |
| Exponential and Damped | 26.59113 |
The most accurate model by RMSE is Holt’s method with an exponential trend. It should be noted, however, that the methods are extremely close in accuracy, separated by only fractions of a unit of error.
Recall your retail time series data (from Exercise 3 in Section 2.10).
Why is multiplicative seasonality necessary for this series?
Because the magnitude of the seasonal fluctuations grows with the level of the series over time, multiplicative seasonality is necessary: it scales the seasonal component with the level rather than assuming a constant seasonal magnitude. The plots below illustrate this.
library(httr)
url1<-"https://otexts.com/fpp2/extrafiles/retail.xlsx"
GET(url1, write_disk(tf <- tempfile(fileext = ".xlsx")))
Response [https://otexts.com/fpp2/extrafiles/retail.xlsx]
Date: 2021-03-15 04:23
Status: 200
Content-Type: application/vnd.openxmlformats-officedocument.spreadsheetml.sheet
Size: 639 kB
<ON DISK> C:\Users\REGIST~1\AppData\Local\Temp\RtmpKCEC6P\filef944387f3246.xlsx
retaildata <- readxl::read_excel(tf, skip = 1)
myts <- ts(retaildata[,"A3349873A"],
frequency=12, start=c(1982,4))
autoplot(myts)+defaulttheme
ggseasonplot(myts, year.labels = T, year.labels.left = T)+defaulttheme
Apply Holt-Winters’ multiplicative method to the data. Experiment with making the trend damped.
hw_multiplicative <- hw(myts, seasonal = "multiplicative")
p1<-autoplot(hw_multiplicative) +
ggtitle("Multiplicative") +
ylab("Sales")+defaulttheme
hw_multiplicative_damped <- hw(myts, seasonal = "multiplicative", damped = TRUE)
p2<-autoplot(hw_multiplicative_damped) +
ggtitle("Multiplicative & Damped") +
ylab("Sales")+defaulttheme
grid.arrange(p1,p2, ncol = 1)
Compare the RMSE of the one-step forecasts from the two methods. Which do you prefer?
cat("RMSE of Multiplicative = ", accuracy(hw_multiplicative)[2])
RMSE of Multiplicative = 13.29378
cat("RMSE of Multiplicative & Damped = ", accuracy(hw_multiplicative_damped)[2])
RMSE of Multiplicative & Damped = 13.30494
For this particular retail series, the non-damped version of the forecast method performs slightly better, although the difference is arguably negligible.
Check that the residuals from the best method look like white noise.
checkresiduals(hw_multiplicative)
Ljung-Box test
data: Residuals from Holt-Winters' multiplicative method
Q* = 40.405, df = 8, p-value = 0.000002692
Model df: 16. Total lags used: 24
The residuals show decent scatter around 0, as we would hope, but there is some autocorrelation around lags 12 and 13, and the Ljung-Box test (p ≈ 2.7e-06) rejects the null hypothesis that the residuals are white noise.
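Now find the test set RMSE, while training the model to the end of 2010. Can you beat the seasonal naïve approach from Exercise 8 in Section 3.7?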
# Train up to the end of 2010; hold out the remainder as a test set
ts_train <- window(myts, end = c(2010, 12))
ts_test <- window(myts, start = 2011)
train_hw <- hw(ts_train, h=36, seasonal = "multiplicative")
train_hw_damped <- hw(ts_train, h=36, seasonal = "multiplicative", damped = T)
train_snaive <- snaive(ts_train, h=36)
p1 <- autoplot(ts_train) +
autolayer(train_hw, series='Multiplicative') +
autolayer(train_hw_damped, series='Multiplicative with damped trend') +
autolayer(train_snaive, series='Seasonal Naive') +
autolayer(ts_test, series='Test set') +
theme(legend.position = "top") +
ylab("Retail Sales")+defaulttheme
p1
accuracy(train_hw, ts_test)[,2]
Training set Test set
9.107356 94.806617
accuracy(train_hw_damped, ts_test)[,2]
Training set Test set
8.681456 111.911266
accuracy(train_snaive, ts_test)[,2]
Training set Test set
20.24576 100.00869
Interestingly, the test-set RMSE far exceeds the training-set RMSE for every method, indicating some overfitting to the training data. On the test set, the plain Holt-Winters multiplicative method performed best, beating both the damped version and the seasonal naïve benchmark.
For the same retail data, try an STL decomposition applied to the Box-Cox transformed series, followed by ETS on the seasonally adjusted data. How does that compare with your best previous forecasts on the test set?
# STL decomposition of the Box-Cox transformed series, forecast with stlf()
stlf_boxcox_train <- stlf(ts_train, lambda = BoxCox.lambda(ts_train))
# ETS fitted to the seasonally adjusted data
ets_train <- ets(seasadj(decompose(ts_train, "multiplicative")))
accuracy(stlf_boxcox_train)
ME RMSE MAE MPE MAPE MASE
Training set -0.646515 8.147517 5.855529 -0.3001347 2.885877 0.3669624
ACF1
Training set 0.02389584
accuracy(ets_train)
ME RMSE MAE MPE MAPE MASE
Training set -0.2829213 8.551268 6.201346 -0.2055279 3.091856 0.3886063
ACF1
Training set 0.02876641
Generally, the stlf_boxcox model performs better than the ETS model, as shown by its lower RMSE on the training data (8.15 versus 8.55). Note that both values are training-set errors; a test-set comparison follows below.
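To compare against the earlier test-set results, the STL + Box-Cox forecasts can be scored on the test set as well. A sketch: h = 36 is set explicitly to cover the test period, since stlf() otherwise defaults to a shorter horizon.
# Forecast 36 months ahead and compute training- and test-set RMSE
stlf_boxcox_fc <- stlf(ts_train, h = 36, lambda = BoxCox.lambda(ts_train))
accuracy(stlf_boxcox_fc, ts_test)[, 2]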