Consider the pigs series — the number of pigs slaughtered in Victoria each month.
Use the ses() function in R to find the optimal values of \(\alpha\) and \(\ell_0\), and generate forecasts for the next four months.

## Mar Apr May Jun Jul Aug
## 1995 106723 84307 114896 106749 87892 100506
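A minimal sketch of the call that produces the model summary below, assuming the fpp2 package (which supplies the pigs series and ses()) is loaded:

library(fpp2)

# fit simple exponential smoothing and forecast four months ahead
fc <- ses(pigs, h = 4)
summary(fc)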
##
## Forecast method: Simple exponential smoothing
##
## Model Information:
## Simple exponential smoothing
##
## Call:
## ses(y = pigs, h = 4)
##
## Smoothing parameters:
## alpha = 0.2971
##
## Initial states:
## l = 77260.0561
##
## sigma: 10308.58
##
## AIC AICc BIC
## 4462.955 4463.086 4472.665
##
## Error measures:
## ME RMSE MAE MPE MAPE MASE ACF1
## Training set 385.8721 10253.6 7961.383 -0.922652 9.274016 0.7966249 0.01282239
##
## Forecasts:
## Point Forecast Lo 80 Hi 80 Lo 95 Hi 95
## Sep 1995 98816.41 85605.43 112027.4 78611.97 119020.8
## Oct 1995 98816.41 85034.52 112598.3 77738.83 119894.0
## Nov 1995 98816.41 84486.34 113146.5 76900.46 120732.4
## Dec 1995 98816.41 83958.37 113674.4 76092.99 121539.8
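The manual 95% interval can be reconstructed from the first point forecast and the standard deviation of the one-step training residuals, assuming normally distributed errors (a sketch; the original code is not shown):

s <- sd(fc$residuals)   # standard deviation of the training residuals
fc$mean[1] - 1.96 * s   # lower bound
fc$mean[1] + 1.96 * s   # upper bound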
## [1] 78679.97
## [1] 118952.8
The manually calculated 95% interval (78679.97 to 118952.8) is a little narrower than the one calculated by R (78611.97 to 119020.8).
Data set books contains the daily sales of paperback and hardcover books at the same store. The task is to forecast the next four days’ sales for paperback and hardcover books.
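The two series can be extracted from books and inspected with something like the following sketch (head() and tail() produce the output below):

paperback <- books[, "Paperback"]
hardcover <- books[, "Hardcover"]
head(books)
tail(books)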
## Time Series:
## Start = 1
## End = 6
## Frequency = 1
## Paperback Hardcover
## 1 199 139
## 2 172 128
## 3 111 172
## 4 209 139
## 5 161 191
## 6 119 168
## Time Series:
## Start = 25
## End = 30
## Frequency = 1
## Paperback Hardcover
## 25 190 214
## 26 182 200
## 27 222 201
## 28 217 283
## 29 188 220
## 30 247 259
We only have 30 days’ worth of data, so it’s hard to know whether there is any annual or monthly seasonality, and any weekly seasonality is difficult to see within such a short window. There does seem to be a pattern in the ACF plots, though, with almost all of the lags being positive.
Let’s convert the series to a weekly frequency and try plotting and decomposing them again.
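The conversion code is not shown, but the forecast time stamps later in the output (e.g. 5.285714 = 5 + 2/7) indicate the series were re-created with frequency = 7; a sketch:

# treat each block of 7 daily observations as one "week"
paperback <- ts(books[, "Paperback"], frequency = 7)
hardcover <- ts(books[, "Hardcover"], frequency = 7)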
paperback %>% decompose(type="multiplicative") %>%
  autoplot() + xlab("Week") +
  ggtitle("Classical multiplicative decomposition of Paperback books time series")

hardcover %>% decompose(type="multiplicative") %>%
  autoplot() + xlab("Week") +
  ggtitle("Classical multiplicative decomposition of Hardcover books time series")

The decompositions make it look like there is weekly seasonality in both paperback and hardcover book sales, but looking at the polar seasonal plots it’s hard to see that being true.
Use the ses() function to forecast each series, and plot the forecasts.
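A minimal sketch of the calls behind the summaries below:

sesPaperback <- ses(paperback, h = 4)
sesHardcover <- ses(hardcover, h = 4)
summary(sesPaperback)

##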
## Forecast method: Simple exponential smoothing
##
## Model Information:
## Simple exponential smoothing
##
## Call:
## ses(y = paperback, h = 4)
##
## Smoothing parameters:
## alpha = 0.1685
##
## Initial states:
## l = 170.8271
##
## sigma: 34.8183
##
## AIC AICc BIC
## 318.9747 319.8978 323.1783
##
## Error measures:
## ME RMSE MAE MPE MAPE MASE ACF1
## Training set 7.175981 33.63769 27.8431 0.4736071 15.57784 0.7632792 -0.2117522
##
## Forecasts:
## Point Forecast Lo 80 Hi 80 Lo 95 Hi 95
## 5.285714 207.1097 162.4882 251.7311 138.8670 275.3523
## 5.428571 207.1097 161.8589 252.3604 137.9046 276.3147
## 5.571429 207.1097 161.2382 252.9811 136.9554 277.2639
## 5.714286 207.1097 160.6259 253.5935 136.0188 278.2005
autoplot(sesPaperback, main = "Daily Paperback Book Sales") +
  autolayer(sesPaperback$fitted, series="Fitted") +
  ylab("Sales") + xlab("Day")

##
## Forecast method: Simple exponential smoothing
##
## Model Information:
## Simple exponential smoothing
##
## Call:
## ses(y = hardcover, h = 4)
##
## Smoothing parameters:
## alpha = 0.3283
##
## Initial states:
## l = 149.2861
##
## sigma: 33.0517
##
## AIC AICc BIC
## 315.8506 316.7737 320.0542
##
## Error measures:
## ME RMSE MAE MPE MAPE MASE ACF1
## Training set 9.166735 31.93101 26.77319 2.636189 13.39487 0.6997539 -0.1417763
##
## Forecasts:
## Point Forecast Lo 80 Hi 80 Lo 95 Hi 95
## 5.285714 239.5601 197.2026 281.9176 174.7799 304.3403
## 5.428571 239.5601 194.9788 284.1414 171.3788 307.7414
## 5.571429 239.5601 192.8607 286.2595 168.1396 310.9806
## 5.714286 239.5601 190.8347 288.2855 165.0410 314.0792
autoplot(sesHardcover, main = "Daily Hardcover Book Sales") +
  autolayer(sesHardcover$fitted, series="Fitted") +
  ylab("Sales") + xlab("Day")

kable(round(accuracy(sesPaperback),2)) %>%
  kable_styling(bootstrap_options = c("striped", "hover", "condensed"))

|              | ME   | RMSE  | MAE   | MPE  | MAPE  | MASE | ACF1  |
|--------------|------|-------|-------|------|-------|------|-------|
| Training set | 7.18 | 33.64 | 27.84 | 0.47 | 15.58 | 0.76 | -0.21 |

kable(round(accuracy(sesHardcover),2)) %>%
  kable_styling(bootstrap_options = c("striped", "hover", "condensed"))

|              | ME   | RMSE  | MAE   | MPE  | MAPE  | MASE | ACF1  |
|--------------|------|-------|-------|------|-------|------|-------|
| Training set | 9.17 | 31.93 | 26.77 | 2.64 | 13.39 | 0.7  | -0.14 |
We will continue with the daily sales of paperback and hardcover books in data set books.
Apply Holt’s linear method to the paperback and hardcover series and compute four-day forecasts in each case.
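A sketch of the calls, matching the Call lines printed in the output:

holtPaperback <- holt(paperback, h = 4)
holtHardcover <- holt(hardcover, h = 4)
summary(holtPaperback)

##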
## Forecast method: Holt's method
##
## Model Information:
## Holt's method
##
## Call:
## holt(y = paperback, h = 4)
##
## Smoothing parameters:
## alpha = 1e-04
## beta = 1e-04
##
## Initial states:
## l = 170.699
## b = 1.2621
##
## sigma: 33.4464
##
## AIC AICc BIC
## 318.3396 320.8396 325.3456
##
## Error measures:
## ME RMSE MAE MPE MAPE MASE
## Training set -3.717178 31.13692 26.18083 -5.508526 15.58354 0.7177104
## ACF1
## Training set -0.1750792
##
## Forecasts:
## Point Forecast Lo 80 Hi 80 Lo 95 Hi 95
## 5.285714 209.4668 166.6035 252.3301 143.9130 275.0205
## 5.428571 210.7177 167.8544 253.5811 145.1640 276.2715
## 5.571429 211.9687 169.1054 254.8320 146.4149 277.5225
## 5.714286 213.2197 170.3564 256.0830 147.6659 278.7735
##
## Forecast method: Holt's method
##
## Model Information:
## Holt's method
##
## Call:
## holt(y = hardcover, h = 4)
##
## Smoothing parameters:
## alpha = 1e-04
## beta = 1e-04
##
## Initial states:
## l = 147.7935
## b = 3.303
##
## sigma: 29.2106
##
## AIC AICc BIC
## 310.2148 312.7148 317.2208
##
## Error measures:
## ME RMSE MAE MPE MAPE MASE
## Training set -0.1357882 27.19358 23.15557 -2.114792 12.1626 0.6052024
## ACF1
## Training set -0.03245186
##
## Forecasts:
## Point Forecast Lo 80 Hi 80 Lo 95 Hi 95
## 5.285714 250.1739 212.7390 287.6087 192.9222 307.4256
## 5.428571 253.4765 216.0416 290.9113 196.2248 310.7282
## 5.571429 256.7791 219.3442 294.2140 199.5274 314.0308
## 5.714286 260.0817 222.6468 297.5166 202.8300 317.3334
kable(round(accuracy(holtPaperback),2)) %>%
  kable_styling(bootstrap_options = c("striped", "hover", "condensed"))

|              | ME    | RMSE  | MAE   | MPE   | MAPE  | MASE | ACF1  |
|--------------|-------|-------|-------|-------|-------|------|-------|
| Training set | -3.72 | 31.14 | 26.18 | -5.51 | 15.58 | 0.72 | -0.18 |

kable(round(accuracy(holtHardcover),2)) %>%
  kable_styling(bootstrap_options = c("striped", "hover", "condensed"))

|              | ME    | RMSE  | MAE   | MPE   | MAPE  | MASE | ACF1  |
|--------------|-------|-------|-------|-------|-------|------|-------|
| Training set | -0.14 | 27.19 | 23.16 | -2.11 | 12.16 | 0.61 | -0.03 |
holt_PB_RMSE <- round(accuracy(holtPaperback)[,"RMSE"],2)
holt_HC_RMSE <- round(accuracy(holtHardcover)[,"RMSE"],2)

The RMSE for the paperback sales data is 33.64 with simple exponential smoothing vs 31.14 with Holt’s method, a reduction of 2.5. For the hardcover sales data it is 31.93 vs 27.19, a reduction of 4.74. Holt’s method is the better predictor for the books series, since there is an upward trend in both paperback and hardcover sales that simple exponential smoothing cannot capture.
p1 <- autoplot(sesPaperback, main = "Daily Paperback Book Sales - SES") +
autolayer(sesPaperback$fitted, series="Fitted") +
ylab("Sales") + xlab("Day")
p2 <- autoplot(sesHardcover, main = "Daily Hardcover Book Sales - SES") +
autolayer(sesHardcover$fitted, series="Fitted") +
ylab("Sales") + xlab("Day")
p3 <- autoplot(holtPaperback, main = "Daily Paperback Book Sales - Holt") +
autolayer(holtPaperback$fitted, series="Fitted") +
ylab("Sales") + xlab("Day")
p4 <- autoplot(holtHardcover, main = "Daily Hardcover Book Sales - Holt") +
autolayer(holtHardcover$fitted, series="Fitted") +
ylab("Sales") + xlab("Day")
grid.arrange(p1, p2, p3, p4, nrow = 2)

Since neither method accounts for seasonality and both produce straight-line forecasts, Holt’s method, which captures the upward trend, is the better forecaster here compared with simple exponential smoothing.
Calculate a 95% prediction interval for the first forecast for each series, using the RMSE values and assuming normal errors. Compare your intervals with those produced using ses and holt.

conf_int <- function(fc, n){
  # approximate interval: nth point forecast ± 1.96 × training RMSE
  sd <- accuracy(fc)[,"RMSE"]
  c(fc$mean[n] - 1.96 * sd, fc$mean[n] + 1.96 * sd)
}
ses_pb_conf_int <- conf_int(sesPaperback, 1)
ses_hc_conf_int <- conf_int(sesHardcover, 1)
holt_pb_conf_int <- conf_int(holtPaperback, 1)
holt_hc_conf_int <- conf_int(holtHardcover, 1)
manual_calc <- data.frame(ses_pb_conf_int, ses_hc_conf_int, holt_pb_conf_int, holt_hc_conf_int, row.names = c("lower", "upper"))
kable(manual_calc) %>%
  kable_styling(bootstrap_options = c("striped", "hover", "condensed"))

|       | ses_pb_conf_int | ses_hc_conf_int | holt_pb_conf_int | holt_hc_conf_int |
|-------|-----------------|-----------------|------------------|------------------|
| lower | 141.1798        | 176.9753        | 148.4384         | 196.8745         |
| upper | 273.0395        | 302.1449        | 270.4951         | 303.4733         |
conf_int <- function(fc, n){
  # extract the nth lower/upper 95% bounds computed by the forecast object
  c(fc$lower[,"95%"][n], fc$upper[,"95%"][n])
}
ses_pb_conf_int_r <- conf_int(sesPaperback, 1)
ses_hc_conf_int_r <- conf_int(sesHardcover, 1)
holt_pb_conf_int_r <- conf_int(holtPaperback, 1)
holt_hc_conf_int_r <- conf_int(holtHardcover, 1)
r_calc <- data.frame(ses_pb_conf_int_r, ses_hc_conf_int_r, holt_pb_conf_int_r, holt_hc_conf_int_r, row.names = c("lower", "upper"))
kable(r_calc) %>%
  kable_styling(bootstrap_options = c("striped", "hover", "condensed"))

|       | ses_pb_conf_int_r | ses_hc_conf_int_r | holt_pb_conf_int_r | holt_hc_conf_int_r |
|-------|-------------------|-------------------|--------------------|--------------------|
| lower | 138.8670          | 174.7799          | 143.9130           | 192.9222           |
| upper | 275.3523          | 304.3403          | 275.0205           | 307.4256           |
kable(data.frame(r_calc - manual_calc)) %>%
  kable_styling(bootstrap_options = c("striped", "hover", "condensed"))

|       | ses_pb_conf_int_r | ses_hc_conf_int_r | holt_pb_conf_int_r | holt_hc_conf_int_r |
|-------|-------------------|-------------------|--------------------|--------------------|
| lower | -2.312775         | -2.195432         | -4.525411          | -3.952289          |
| upper | 2.312775          | 2.195432          | 4.525411           | 3.952289           |
Once again, the manually calculated prediction intervals are a little narrower than those produced by the ses and holt functions in R.
For this exercise use data set eggs, the price of a dozen eggs in the United States from 1900–1993. Experiment with the various options in the holt() function to see how much the forecasts change with damped trend, or with a Box-Cox transformation. Try to develop an intuition of what each argument is doing to the forecasts.
[Hint: use h=100 when calling holt() so you can clearly see the differences between the various options when plotting the forecasts.]
Which model gives the best RMSE?
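The four model summaries below come from calls like these (reconstructed from the printed Call lines):

fc1eggs <- holt(eggs, h = 100)                                        # plain Holt
fc2eggs <- holt(eggs, h = 100, damped = TRUE, phi = 0.9)              # damped trend
fc3eggs <- holt(eggs, h = 100, lambda = "auto")                       # Box-Cox transformed
fc4eggs <- holt(eggs, h = 100, damped = TRUE, phi = 0.9, lambda = "auto")  # both
fc1eggs$model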
## Holt's method
##
## Call:
## holt(y = eggs, h = 100)
##
## Smoothing parameters:
## alpha = 0.8124
## beta = 1e-04
##
## Initial states:
## l = 314.7232
## b = -2.7222
##
## sigma: 27.1665
##
## AIC AICc BIC
## 1053.755 1054.437 1066.472
|              | ME        | RMSE     | MAE      | MPE       | MAPE     | MASE      | ACF1     |
|--------------|-----------|----------|----------|-----------|----------|-----------|----------|
| Training set | 0.0449909 | 26.58219 | 19.18491 | -1.142201 | 9.653791 | 0.9463626 | 0.013482 |
## Damped Holt's method
##
## Call:
## holt(y = eggs, h = 100, damped = TRUE, phi = 0.9)
##
## Smoothing parameters:
## alpha = 0.8464
## beta = 1e-04
## phi = 0.9
##
## Initial states:
## l = 297.4547
## b = -3.1897
##
## sigma: 27.3608
##
## AIC AICc BIC
## 1054.045 1054.727 1066.761
|              | ME        | RMSE     | MAE      | MPE       | MAPE     | MASE      | ACF1       |
|--------------|-----------|----------|----------|-----------|----------|-----------|------------|
| Training set | -2.584906 | 26.62317 | 19.53231 | -2.832104 | 10.10674 | 0.9634993 | -0.0050198 |
## Holt's method
##
## Call:
## holt(y = eggs, h = 100, lambda = "auto")
##
## Box-Cox transformation: lambda= 0.3956
##
## Smoothing parameters:
## alpha = 0.809
## beta = 1e-04
##
## Initial states:
## l = 21.0322
## b = -0.1144
##
## sigma: 1.0549
##
## AIC AICc BIC
## 443.0310 443.7128 455.7475
|              | ME        | RMSE     | MAE      | MPE       | MAPE     | MASE      | ACF1      |
|--------------|-----------|----------|----------|-----------|----------|-----------|-----------|
| Training set | 0.7736844 | 26.39376 | 18.96387 | -1.072416 | 9.620095 | 0.9354593 | 0.0388715 |
## Damped Holt's method
##
## Call:
## holt(y = eggs, h = 100, damped = TRUE, phi = 0.9, lambda = "auto")
##
## Box-Cox transformation: lambda= 0.3956
##
## Smoothing parameters:
## alpha = 0.8468
## beta = 1e-04
## phi = 0.9
##
## Initial states:
## l = 20.873
## b = 0.125
##
## sigma: 1.0694
##
## AIC AICc BIC
## 444.5430 445.2248 457.2595
|              | ME        | RMSE     | MAE     | MPE       | MAPE     | MASE      | ACF1       |
|--------------|-----------|----------|---------|-----------|----------|-----------|------------|
| Training set | -2.979642 | 26.54716 | 19.2466 | -2.932978 | 10.00465 | 0.9494056 | -0.0019255 |
fc1_RMSE <- accuracy(fc1eggs)[,"RMSE"]
fc2_RMSE <- accuracy(fc2eggs)[,"RMSE"]
fc3_RMSE <- accuracy(fc3eggs)[,"RMSE"]
fc4_RMSE <- accuracy(fc4eggs)[,"RMSE"]
kable(data.frame(fc1_RMSE, fc2_RMSE, fc3_RMSE, fc4_RMSE)) %>%
  kable_styling(bootstrap_options = c("striped", "hover", "condensed"))

| fc1_RMSE | fc2_RMSE | fc3_RMSE | fc4_RMSE |
|----------|----------|----------|----------|
| 26.58219 | 26.62317 | 26.39376 | 26.54716 |
The Holt model with a Box-Cox transformation and no damping produces the most plausible forecasts and also has the best (lowest) RMSE. The first model, with neither damping nor a Box-Cox transformation, eventually forecasts negative prices, which is impossible. The damped, untransformed model levels out at its current price and stays there indefinitely. The Box-Cox transformed forecast slopes down at first but gradually flattens, approaching zero without going negative, and it has the narrowest prediction intervals. The Box-Cox transformed and damped forecast has prediction intervals that almost never dip below zero, so in that respect it may be the best model, but its point forecast levels out at the current price without change and its upper prediction limit is extremely wide.
Recall your retail time series data (from Exercise 3 in Section 2.10).
retaildata <- readxl::read_excel("retail.xlsx", skip=1)
retail <- ts(retaildata[,"A3349335T"],
frequency=12, start=c(1982,4))
autoplot(retail)

Multiplicative seasonality is needed because the plot above shows the seasonal variation increasing with the level of the series: as the trend rises, so does the size of the seasonal swings.
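The first model is an undamped Holt-Winters multiplicative fit, reconstructed here from the Call line in the output below:

fc1retail <- hw(retail, seasonal = "multiplicative", h = 100)
fc1retail$model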
## Holt-Winters' multiplicative method
##
## Call:
## hw(y = retail, h = 100, seasonal = "multiplicative")
##
## Smoothing parameters:
## alpha = 0.3253
## beta = 0.0129
## gamma = 0.0255
##
## Initial states:
## l = 304.256
## b = 2.1149
## s = 1.0213 0.9379 1.0298 1.1195 1.015 1.0226
## 0.9666 1.0035 0.9715 0.9468 0.9983 0.9673
##
## sigma: 0.0275
##
## AIC AICc BIC
## 4769.995 4771.681 4837.022
kable(accuracy(fc1retail)) %>%
  kable_styling(bootstrap_options = c("striped", "hover", "condensed"))

|              | ME        | RMSE     | MAE      | MPE       | MAPE     | MASE      | ACF1       |
|--------------|-----------|----------|----------|-----------|----------|-----------|------------|
| Training set | 0.9212824 | 25.20381 | 18.77683 | 0.0685623 | 1.979316 | 0.3016982 | -0.1217931 |
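The damped variant, again reconstructed from the printed Call:

fc2retail <- hw(retail, seasonal = "multiplicative", damped = TRUE, phi = 0.98, h = 100)
fc2retail$model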
## Damped Holt-Winters' multiplicative method
##
## Call:
## hw(y = retail, h = 100, seasonal = "multiplicative", damped = TRUE,
##     phi = 0.98)
##
## Smoothing parameters:
## alpha = 0.271
## beta = 0.0452
## gamma = 0.0117
## phi = 0.98
##
## Initial states:
## l = 304.0294
## b = 2.7058
## s = 1.0195 0.9319 1.0281 1.1258 1.0214 1.0186
## 0.9747 0.9976 0.9757 0.945 0.9897 0.9717
##
## sigma: 0.0276
##
## AIC AICc BIC
## 4771.963 4773.649 4838.991
kable(accuracy(fc2retail)) %>%
  kable_styling(bootstrap_options = c("striped", "hover", "condensed"))

|              | ME       | RMSE     | MAE      | MPE       | MAPE     | MASE      | ACF1       |
|--------------|----------|----------|----------|-----------|----------|-----------|------------|
| Training set | 2.351492 | 25.28197 | 18.70815 | 0.1821815 | 1.960331 | 0.3005947 | -0.0360754 |
The RMSE is slightly lower for the first, undamped Holt-Winters model, and its plot also looks like the more reasonable forecast; there is no reason to think sales will level off the way they do under the damped model.
fc1_retail_RMSE <- accuracy(fc1retail)[,"RMSE"]
fc2_retail_RMSE <- accuracy(fc2retail)[,"RMSE"]
kable(data.frame(fc1_retail_RMSE, fc2_retail_RMSE)) %>%
  kable_styling(bootstrap_options = c("striped", "hover", "condensed"))

| fc1_retail_RMSE | fc2_retail_RMSE |
|-----------------|-----------------|
| 25.20381        | 25.28197        |
So I prefer the first, undamped model.
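The Ljung-Box output below is what the forecast package’s checkresiduals() prints; presumably it was run on the preferred undamped model:

checkresiduals(fc1retail)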
##
## Ljung-Box test
##
## data: Residuals from Holt-Winters' multiplicative method
## Q* = 250.64, df = 8, p-value < 2.2e-16
##
## Model df: 16. Total lags used: 24
The residual plots from the preferred Holt-Winters model look roughly like white noise, but the Ljung-Box test strongly rejects that hypothesis (p < 2.2e-16), and the variance of the residuals appears to shrink over time, so there is some remaining pattern that might be a problem.
Just out of curiosity, let’s see what a Box-Cox transformation and additive seasonality do to our model.
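Reconstructed from the Call line in the output below:

fc3retail <- hw(retail, seasonal = "additive", lambda = "auto", h = 100)
fc3retail$model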
## Holt-Winters' additive method
##
## Call:
## hw(y = retail, h = 100, seasonal = "additive", lambda = "auto")
##
## Box-Cox transformation: lambda= 0.1939
##
## Smoothing parameters:
## alpha = 0.2674
## beta = 1e-04
## gamma = 0.1601
##
## Initial states:
## l = 10.5163
## b = 0.0189
## s = 0.1257 -0.1583 -0.0427 0.5178 0.0588 0.0416
## -0.1294 -0.0101 -0.0075 -0.1289 -0.0926 -0.1744
##
## sigma: 0.0925
##
## AIC AICc BIC
## 467.9192 469.6051 534.9467
kable(accuracy(fc3retail)) %>%
  kable_styling(bootstrap_options = c("striped", "hover", "condensed"))

|              | ME         | RMSE     | MAE      | MPE       | MAPE     | MASE      | ACF1       |
|--------------|------------|----------|----------|-----------|----------|-----------|------------|
| Training set | -0.3742618 | 25.96042 | 19.22513 | 0.0274891 | 1.918142 | 0.3089013 | -0.0138325 |
fc4retail <- hw(retail, damped=TRUE, phi=0.98, lambda="auto",
                seasonal="additive", h=100)
fc4retail$model

## Damped Holt-Winters' additive method
##
## Call:
## hw(y = retail, h = 100, seasonal = "additive", damped = TRUE,
##     phi = 0.98, lambda = "auto")
##
## Box-Cox transformation: lambda= 0.1939
##
## Smoothing parameters:
## alpha = 0.3182
## beta = 0.0222
## gamma = 0.1757
## phi = 0.98
##
## Initial states:
## l = 10.5024
## b = 0.0328
## s = 0.0096 -0.1515 -0.068 0.5575 0.0821 0.0351
## -0.0987 -0.0165 -0.0248 -0.1217 -0.0275 -0.1756
##
## sigma: 0.095
##
## AIC AICc BIC
## 487.1551 488.8410 554.1826
kable(accuracy(fc4retail)) %>%
  kable_styling(bootstrap_options = c("striped", "hover", "condensed"))

|              | ME       | RMSE     | MAE      | MPE       | MAPE    | MASE      | ACF1       |
|--------------|----------|----------|----------|-----------|---------|-----------|------------|
| Training set | 3.493937 | 26.69833 | 19.76242 | 0.2767205 | 1.96255 | 0.3175342 | -0.0550526 |
fc3_retail_RMSE <- accuracy(fc3retail)[,"RMSE"]
fc4_retail_RMSE <- accuracy(fc4retail)[,"RMSE"]
kable(data.frame(fc3_retail_RMSE, fc4_retail_RMSE)) %>%
  kable_styling(bootstrap_options = c("striped", "hover", "condensed"))

| fc3_retail_RMSE | fc4_retail_RMSE |
|-----------------|-----------------|
| 25.96042        | 26.69833        |
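The Ljung-Box test below presumably comes from checkresiduals() on the undamped additive model:

checkresiduals(fc3retail)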
##
## Ljung-Box test
##
## data: Residuals from Holt-Winters' additive method
## Q* = 270.76, df = 8, p-value < 2.2e-16
##
## Model df: 16. Total lags used: 24
The plots look better, and the residual plot looks more like white noise, with no discernible pattern or decrease in variability over time, even though the Ljung-Box test still formally rejects white noise (p < 2.2e-16). So although the RMSE is slightly higher for both Box-Cox models than for the first two, I would choose the undamped Holt-Winters model with additive seasonality and a Box-Cox transformation.
train <- window(retail, end=c(2010,12))
test <- window(retail, start=2011)
fc <- hw(train, seasonal="multiplicative", h=100)
kable(data.frame(accuracy(fc,test))) %>%
  kable_styling(bootstrap_options = c("striped", "hover", "condensed"))

|              | ME         | RMSE     | MAE      | MPE        | MAPE     | MASE      | ACF1       | Theil.s.U |
|--------------|------------|----------|----------|------------|----------|-----------|------------|-----------|
| Training set | 2.658058   | 25.00649 | 18.25149 | 0.2228273  | 1.965971 | 0.2958851 | -0.0067375 | NA        |
| Test set     | -63.670299 | 77.04807 | 65.91346 | -2.9161237 | 3.023249 | 1.0685599 | 0.4473749  | 0.5550438 |
The test-set RMSE for the Holt-Winters multiplicative model, 77.04807, is much better than the 109.62545 obtained with the naive approach in exercise 3.8.
For the same retail data, try an STL decomposition applied to the Box-Cox transformed series, followed by ETS on the seasonally adjusted data. How does that compare with your best previous forecasts on the test set?
lambda <- BoxCox.lambda(retail)
boxcox_STL_data <- retail %>%
BoxCox(lambda = lambda) %>%
mstl()
autoplot(boxcox_STL_data)

boxcox_STL_adj_data <- seasadj(boxcox_STL_data)
autoplot(retail) +
autolayer(inv_box_cox(boxcox_STL_adj_data, lambda = lambda),
series="seasonally adjusted data") +
ggtitle("Backtransformed seasonally adjusted data")train <- window(boxcox_STL_adj_data, end=c(2010,12))
test <- window(boxcox_STL_adj_data, start=2011)
fit <- ets(train)
fit

## ETS(A,A,N)
##
## Call:
## ets(y = train)
##
## Smoothing parameters:
## alpha = 0.2394
## beta = 1e-04
##
## Initial states:
## l = 10.444
## b = 0.0202
##
## sigma: 0.0823
##
## AIC AICc BIC
## 298.8909 299.0678 318.1086
fc <- forecast(fit, h=100)
autoplot(fc, lambda = lambda) +
autolayer(boxcox_STL_adj_data,
series="seasonally adjusted data")autoplot(inv_box_cox(fc$mean, lambda = lambda)) +
autolayer(inv_box_cox(boxcox_STL_adj_data, lambda = lambda),
series="seasonally adjusted data") +
ggtitle("Backtransformed seasonally adjusted data and point forecast from ETS(A,A,N)")kable(data.frame(accuracy(fc,test))) %>%
  kable_styling(bootstrap_options = c("striped", "hover", "condensed"))

|              | ME         | RMSE      | MAE       | MPE        | MAPE      | MASE      | ACF1       | Theil.s.U |
|--------------|------------|-----------|-----------|------------|-----------|-----------|------------|-----------|
| Training set | 0.0010537  | 0.0818318 | 0.0646293 | 0.0105154  | 0.4715236 | 0.2635096 | -0.1163447 | NA        |
| Test set     | -0.2236219 | 0.2536921 | 0.2251834 | -1.2619503 | 1.2708319 | 0.9181288 | 0.7350465  | 3.30654   |
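The residual diagnostics below presumably come from checkresiduals() on the fitted ETS model:

checkresiduals(fit)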
##
## Ljung-Box test
##
## data: Residuals from ETS(A,A,N)
## Q* = 370.11, df = 20, p-value < 2.2e-16
##
## Model df: 4. Total lags used: 24
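The unlabeled pair of numbers below matches the inverse Box-Cox transform of the training and test RMSEs, i.e. something like the following sketch (note this is an assumption, and applying inv_box_cox to an RMSE is not a statistically meaningful back-transformation):

inv_box_cox(accuracy(fc, test)[, "RMSE"], lambda = lambda)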
## Training set Test set
## 1.084576 1.281013
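For the second fit, the Call line in the output below shows ets() applied to the full seasonally adjusted series; a sketch:

fit <- ets(boxcox_STL_adj_data)
fit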
## ETS(A,A,N)
##
## Call:
## ets(y = boxcox_STL_adj_data)
##
## Smoothing parameters:
## alpha = 0.2571
## beta = 0.0025
##
## Initial states:
## l = 10.4664
## b = 0.0213
##
## sigma: 0.0809
##
## AIC AICc BIC
## 354.0372 354.1972 373.7512
fc <- forecast(fit, h=100)
autoplot(fc, lambda = lambda) +
autolayer(boxcox_STL_adj_data,
series="seasonally adjusted data")autoplot(inv_box_cox(fc$mean, lambda = lambda)) +
autolayer(inv_box_cox(boxcox_STL_adj_data, lambda = lambda),
series="seasonally adjusted data") +
ggtitle("Backtransformed seasonally adjusted data and point forecast from ETS(A,A,N)")##
## Ljung-Box test
##
## data: Residuals from ETS(A,A,N)
## Q* = 389.04, df = 20, p-value < 2.2e-16
##
## Model df: 4. Total lags used: 24
## [1] 1.08312
Whether we use the entire data set for training or only the data up to the end of 2010, the ets function produces an overly optimistic forecast for the retail data: a leveling off starts in about 2010, yet ets forecasts that retail sales will keep rising in an exponential growth pattern. The RMSE is hard to compare with the earlier models because it is measured on the Box-Cox transformed scale, and I’m not sure I back-transformed it correctly; the 1.28 figure looks far smaller than the RMSEs above 25 from the previous models, but the two scales are not directly comparable without back-transforming the forecasts themselves before computing accuracy.