pigs time series

autoplot(pigs) +
labs(title = "Monthly slaughter of pigs in Victoria",
subtitle = "Jan 1980 - Aug 1995",
x = "Year", y = "")
We use the ses function to fit a simple exponential smoothing model to the time series. The estimated parameter values are reported in the model summary below, which also includes forecast values and prediction intervals for the next four months.
fc <- ses(pigs, h = 4)
sumfc <- summary(fc)
##
## Forecast method: Simple exponential smoothing
##
## Model Information:
## Simple exponential smoothing
##
## Call:
## ses(y = pigs, h = 4)
##
## Smoothing parameters:
## alpha = 0.2971
##
## Initial states:
## l = 77260.0561
##
## sigma: 10308.58
##
## AIC AICc BIC
## 4462.955 4463.086 4472.665
##
## Error measures:
## ME RMSE MAE MPE MAPE MASE
## Training set 385.8721 10253.6 7961.383 -0.922652 9.274016 0.7966249
## ACF1
## Training set 0.01282239
##
## Forecasts:
## Point Forecast Lo 80 Hi 80 Lo 95 Hi 95
## Sep 1995 98816.41 85605.43 112027.4 78611.97 119020.8
## Oct 1995 98816.41 85034.52 112598.3 77738.83 119894.0
## Nov 1995 98816.41 84486.34 113146.5 76900.46 120732.4
## Dec 1995 98816.41 83958.37 113674.4 76092.99 121539.8
We plot the forecast and fitted values together with the original time series below.
autoplot(pigs) +
autolayer(fitted(fc), series = "Fitted values") +
autolayer(fc, series = "Forecast", PI = FALSE) +
labs(title = "Monthly slaughter of pigs in Victoria",
subtitle = "Jan 1980 - Aug 1995",
x = "Year", y = "")
First we compute the standard deviation of the residuals. From the text (equation 5.3), we know that this can be computed as:
\[\hat{\sigma}_e = \sqrt{\frac{1}{T-k-1} \sum_{t=1}^{T} e_t^2}\]
where \(k+1\) is the number of estimated parameters. In this case, there are 2 estimated parameters, so we use \(T-2\) in the denominator above. Then we can compute the residual standard error:
# manually compute rse
s1 <- sqrt(sum(residuals(fc)^2) / (length(residuals(fc)) - 2))
# get rse from model
s2 <- sqrt(fc$model$sigma2)
c("Calculated" = s1, "From model summary" = s2)
## Calculated From model summary
## 10308.58 10308.58
Note that this agrees with the residual standard error as output in the model summary above. Next we use \(\hat{y} \pm 1.96 s\) for the 95% prediction interval for the first month:
# 95% prediction interval
fc$mean[1] + c(Lower95 = -1, Upper95 = 1) * 1.96 * s1
## Lower95 Upper95
## 78611.6 119021.2
fc$mean[1] + c(Lower95 = -1, Upper95 = 1) * qnorm(0.975) * s1
## Lower95 Upper95
## 78611.97 119020.84
The first prediction interval is slightly different from the 95% prediction interval in the model summary above. However, once we use a more precise estimate for the critical value from the normal distribution (1.959964), the prediction interval is identical to the output in the model summary.
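For reference, the more precise critical value quoted above is just the 97.5% quantile of the standard normal distribution:

# two-sided 95% critical value from the standard normal
qnorm(0.975)
## [1] 1.959964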
books dataset

The paperback and hardcover time series are plotted below. The main feature of both series is a consistent upward trend, accompanied by substantial day-to-day fluctuations.
autoplot(books) +
labs(title = "Daily same-store sales of paperback & hardcover books",
x = "Day", y = "")
Although both time series show an upward trend, we start with a simple exponential smoothing model. The 4-day forecasts are shown in the model summaries and plotted below.
fc1 <- ses(books[ , "Paperback"], h = 4)
fc2 <- ses(books[ , "Hardcover"], h = 4)
summary(fc1)
##
## Forecast method: Simple exponential smoothing
##
## Model Information:
## Simple exponential smoothing
##
## Call:
## ses(y = books[, "Paperback"], h = 4)
##
## Smoothing parameters:
## alpha = 0.1685
##
## Initial states:
## l = 170.8271
##
## sigma: 34.8183
##
## AIC AICc BIC
## 318.9747 319.8978 323.1783
##
## Error measures:
## ME RMSE MAE MPE MAPE MASE
## Training set 7.175981 33.63769 27.8431 0.4736071 15.57784 0.7021303
## ACF1
## Training set -0.2117522
##
## Forecasts:
## Point Forecast Lo 80 Hi 80 Lo 95 Hi 95
## 31 207.1097 162.4882 251.7311 138.8670 275.3523
## 32 207.1097 161.8589 252.3604 137.9046 276.3147
## 33 207.1097 161.2382 252.9811 136.9554 277.2639
## 34 207.1097 160.6259 253.5935 136.0188 278.2005
summary(fc2)
##
## Forecast method: Simple exponential smoothing
##
## Model Information:
## Simple exponential smoothing
##
## Call:
## ses(y = books[, "Hardcover"], h = 4)
##
## Smoothing parameters:
## alpha = 0.3283
##
## Initial states:
## l = 149.2861
##
## sigma: 33.0517
##
## AIC AICc BIC
## 315.8506 316.7737 320.0542
##
## Error measures:
## ME RMSE MAE MPE MAPE MASE
## Training set 9.166735 31.93101 26.77319 2.636189 13.39487 0.7987887
## ACF1
## Training set -0.1417763
##
## Forecasts:
## Point Forecast Lo 80 Hi 80 Lo 95 Hi 95
## 31 239.5601 197.2026 281.9176 174.7799 304.3403
## 32 239.5601 194.9788 284.1414 171.3788 307.7414
## 33 239.5601 192.8607 286.2595 168.1396 310.9806
## 34 239.5601 190.8347 288.2855 165.0410 314.0792
autoplot(books) +
autolayer(fc1, PI = FALSE, series = "Paperback") +
autolayer(fc2, PI = FALSE, series = "Hardcover") +
labs(title = "Daily same-store sales of paperback & hardcover books",
subtitle = "4-day forecasts with simple exponential smoothing",
x = "Day", y = "")
We compute the RMSE for each series on the training set.
rmse1 <- sqrt(mean(residuals(fc1)^2))
rmse2 <- sqrt(mean(residuals(fc2)^2))
c(Paperback = rmse1, Hardcover = rmse2)
## Paperback Hardcover
## 33.63769 31.93101
We can verify these calculated values using the accuracy function, which shows that the results agree.
accuracy(fc1)
## ME RMSE MAE MPE MAPE MASE
## Training set 7.175981 33.63769 27.8431 0.4736071 15.57784 0.7021303
## ACF1
## Training set -0.2117522
accuracy(fc2)
## ME RMSE MAE MPE MAPE MASE
## Training set 9.166735 31.93101 26.77319 2.636189 13.39487 0.7987887
## ACF1
## Training set -0.1417763
books dataset

We use the holt function to fit linear trend models to both time series. The 4-day forecasts are shown in the model summaries and plotted below.
fc3 <- holt(books[ , "Paperback"], h = 4)
fc4 <- holt(books[ , "Hardcover"], h = 4)
summary(fc3)
##
## Forecast method: Holt's method
##
## Model Information:
## Holt's method
##
## Call:
## holt(y = books[, "Paperback"], h = 4)
##
## Smoothing parameters:
## alpha = 1e-04
## beta = 1e-04
##
## Initial states:
## l = 170.699
## b = 1.2621
##
## sigma: 33.4464
##
## AIC AICc BIC
## 318.3396 320.8396 325.3456
##
## Error measures:
## ME RMSE MAE MPE MAPE MASE
## Training set -3.717178 31.13692 26.18083 -5.508526 15.58354 0.6602122
## ACF1
## Training set -0.1750792
##
## Forecasts:
## Point Forecast Lo 80 Hi 80 Lo 95 Hi 95
## 31 209.4668 166.6035 252.3301 143.9130 275.0205
## 32 210.7177 167.8544 253.5811 145.1640 276.2715
## 33 211.9687 169.1054 254.8320 146.4149 277.5225
## 34 213.2197 170.3564 256.0830 147.6659 278.7735
summary(fc4)
##
## Forecast method: Holt's method
##
## Model Information:
## Holt's method
##
## Call:
## holt(y = books[, "Hardcover"], h = 4)
##
## Smoothing parameters:
## alpha = 1e-04
## beta = 1e-04
##
## Initial states:
## l = 147.7935
## b = 3.303
##
## sigma: 29.2106
##
## AIC AICc BIC
## 310.2148 312.7148 317.2208
##
## Error measures:
## ME RMSE MAE MPE MAPE MASE
## Training set -0.1357882 27.19358 23.15557 -2.114792 12.1626 0.6908555
## ACF1
## Training set -0.03245186
##
## Forecasts:
## Point Forecast Lo 80 Hi 80 Lo 95 Hi 95
## 31 250.1739 212.7390 287.6087 192.9222 307.4256
## 32 253.4765 216.0416 290.9113 196.2248 310.7282
## 33 256.7791 219.3442 294.2140 199.5274 314.0308
## 34 260.0817 222.6468 297.5166 202.8300 317.3334
autoplot(books) +
autolayer(fc1, PI = FALSE, series = "Paperback - Simple") +
autolayer(fc2, PI = FALSE, series = "Hardcover - Simple") +
autolayer(fc3, PI = FALSE, series = "Paperback - Holt") +
autolayer(fc4, PI = FALSE, series = "Hardcover - Holt") +
labs(title = "Daily same-store sales of paperback & hardcover books",
subtitle = "4-day forecasts with linear trend model & simple exponential smoothing",
x = "Day", y = "")
We calculate the RMSE for both models and check the results using the accuracy function.
rmse3 <- sqrt(mean(residuals(fc3)^2))
rmse4 <- sqrt(mean(residuals(fc4)^2))
c(Paperback = rmse3, Hardcover = rmse4)
## Paperback Hardcover
## 31.13692 27.19358
accuracy(fc3)
## ME RMSE MAE MPE MAPE MASE
## Training set -3.717178 31.13692 26.18083 -5.508526 15.58354 0.6602122
## ACF1
## Training set -0.1750792
accuracy(fc4)
## ME RMSE MAE MPE MAPE MASE
## Training set -0.1357882 27.19358 23.15557 -2.114792 12.1626 0.6908555
## ACF1
## Training set -0.03245186
Comparing the simple exponential smoothing and linear trend models, we see that the linear trend model produces lower training RMSEs for both time series. This is to be expected since both time series show a consistent upward trend, and the linear model includes a parameter to account for this trend. We also expect that the linear trend model would generate more accurate forecasts outside the training data.
rbind(Simple = c(rmse1, rmse2), Holt = c(rmse3, rmse4)) %>%
kable(caption = "RMSE comparison",
col.names= c("Paperback", "Hardcover"))
|  | Paperback | Hardcover |
|---|---|---|
| Simple | 33.63769 | 31.93101 |
| Holt | 31.13692 | 27.19358 |
The 4-day forecasts of the books time series using both methods are plotted in part (a) above, and are also summarized below. Holt's linear trend model is clearly preferable in this case, since both time series exhibit obvious upward trends.
cbind(Day = 1:4,
"Paper-Simple" = fc1$mean,
"Paper-Holt" = fc3$mean,
"Hard-Simple" = fc2$mean,
"Hard-Holt" = fc4$mean) %>%
kable(caption = "Forecast comparison")
| Day | Paper-Simple | Paper-Holt | Hard-Simple | Hard-Holt |
|---|---|---|---|---|
| 1 | 207.1097 | 209.4668 | 239.5601 | 250.1739 |
| 2 | 207.1097 | 210.7177 | 239.5601 | 253.4765 |
| 3 | 207.1097 | 211.9687 | 239.5601 | 256.7791 |
| 4 | 207.1097 | 213.2197 | 239.5601 | 260.0817 |
We compute 1-day prediction intervals using the RMSEs calculated above and assuming normally distributed errors (“FromRMSE”), and compare to the prediction intervals as reported in the model summaries (“FromModel”). We also compute the prediction intervals more precisely using the residual standard errors from the model outputs (“FromRSE”), which account for the degrees of freedom. The results are summarized below.
# simple approximation using rmse
approx1 <- fc1$mean[1] + c(-1, 1) * 1.96 * rmse1
approx2 <- fc2$mean[1] + c(-1, 1) * 1.96 * rmse2
approx3 <- fc3$mean[1] + c(-1, 1) * 1.96 * rmse3
approx4 <- fc4$mean[1] + c(-1, 1) * 1.96 * rmse4
# more precise estimate using rse
est1 <- fc1$mean[1] + c(-1, 1) * qnorm(0.975) * sqrt(fc1$model$sigma2)
est2 <- fc2$mean[1] + c(-1, 1) * qnorm(0.975) * sqrt(fc2$model$sigma2)
est3 <- fc3$mean[1] + c(-1, 1) * qnorm(0.975) * sqrt(fc3$model$sigma2)
est4 <- fc4$mean[1] + c(-1, 1) * qnorm(0.975) * sqrt(fc4$model$sigma2)
rbind("Simple-FromModel" = c(fc1$lower[1,2], fc1$upper[1,2], fc2$lower[1,2], fc2$upper[1,2]),
"Simple-FromRMSE" = c(approx1, approx2),
"Simple-FromRSE" = c(est1, est2),
"Holt-FromModel" = c(fc3$lower[1,2], fc3$upper[1,2], fc4$lower[1,2], fc4$upper[1,2]),
"Holt-FromRMSE" = c(approx3, approx4),
"Holt-FromRSE" = c(est3, est4)) %>%
kable(caption = "Comparison of 95% prediction intervals",
col.names= c("Paper-Lower", "Paper-Upper", "Hard-Lower", "Hard-Upper"))
|  | Paper-Lower | Paper-Upper | Hard-Lower | Hard-Upper |
|---|---|---|---|---|
| Simple-FromModel | 138.8670 | 275.3523 | 174.7799 | 304.3403 |
| Simple-FromRMSE | 141.1798 | 273.0395 | 176.9753 | 302.1449 |
| Simple-FromRSE | 138.8670 | 275.3523 | 174.7799 | 304.3403 |
| Holt-FromModel | 143.9130 | 275.0205 | 192.9222 | 307.4256 |
| Holt-FromRMSE | 148.4384 | 270.4951 | 196.8745 | 303.4733 |
| Holt-FromRSE | 143.9130 | 275.0205 | 192.9222 | 307.4256 |
As expected, there are slight differences between the model output and the estimates using the RMSE values; however, once we use the RSE instead of the RMSE, our prediction intervals are identical to those from the model output.
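To sketch why the two agree once degrees of freedom are accounted for: Holt's method estimates 4 parameters (\(\alpha\), \(\beta\), \(\ell_0\), \(b_0\)), so by equation 5.3 the residual standard error uses \(T-4\) in the denominator. We can verify this against the paperback Holt model (the name e3 and this check are ours, not part of the exercise); both values should equal the sigma of 33.4464 reported in the model summary.

# Holt's method has 4 estimated parameters, so use T - 4 in the denominator
e3 <- residuals(fc3)
c(Calculated = sqrt(sum(e3^2) / (length(e3) - 4)),
  "From model" = sqrt(fc3$model$sigma2))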
eggs time series

First we plot the time series. There is a strong downward trend, which we will account for using Holt's linear trend model. Since we will consider a Box-Cox transformation when we experiment with the linear trend models, we also show the transformed time series in the same figure. The optimal parameter for the Box-Cox transformation is \(\lambda = 0.40\).
BoxCox.lambda(eggs)
## [1] 0.3956183
cbind(Original = eggs, "Box-Cox" = BoxCox(eggs, lambda = "auto")) %>%
autoplot(facets = TRUE) +
labs(title = "U.S. price of one dozen eggs (constant dollars)",
subtitle = "1900-1993",
x = "Year", y = "")
We consider six options for the linear trend model:

- basic Holt
- Holt with a Box-Cox transformation
- Holt with a Box-Cox transformation and bias adjustment
- Holt with a damped trend
- Holt with a damped trend and a Box-Cox transformation
- Holt with a damped trend, a Box-Cox transformation, and bias adjustment

In all cases, the holt function estimates the optimal values of the parameters \(\alpha\) and \(\beta\) (for all models), as well as \(\phi\) (for the models with damped trend). We set \(h = 100\) in order to clearly see the differences in the forecasts produced by the six models. From the plot below, we see immediately that the basic Holt model shouldn't be used, since it projects prices that eventually go negative.
# basic holt
eggsfc1 <- holt(eggs, h = 100)
# holt with box-cox
eggsfc2 <- holt(eggs, h = 100,
lambda = "auto", biasadj = FALSE)
# holt with box-cox with bias-adjustment
eggsfc2a <- holt(eggs, h = 100,
lambda = "auto", biasadj = TRUE)
# holt with damped trend
eggsfc3 <- holt(eggs, h = 100,
damped = TRUE)
# holt with box-cox & damped trend
eggsfc4 <- holt(eggs, h = 100,
lambda = "auto", biasadj = FALSE,
damped = TRUE)
# holt with box-cox with bias-adjustment & damped trend
eggsfc4a <- holt(eggs, h = 100,
lambda = "auto", biasadj = TRUE,
damped = TRUE)
# plot all forecasts
autoplot(eggs) +
autolayer(eggsfc1, series = "Holt", PI = FALSE, lty = 1, lwd = 1.5) +
autolayer(eggsfc2, series = "BoxCox", PI = FALSE, lty = 2, lwd = 1.5) +
autolayer(eggsfc2a, series = "BC-bias", PI = FALSE, lty = 3, lwd = 1.5) +
autolayer(eggsfc3, series = "Damped", PI = FALSE, lty = 1, lwd = 1.5) +
autolayer(eggsfc4, series = "Damped-BC", PI = FALSE, lty = 2, lwd = 1.5) +
autolayer(eggsfc4a, series = "Damped-BC-bias", PI = FALSE, lty = 3, lwd = 1.5) +
labs(title = "U.S. price of one dozen eggs (constant dollars)",
subtitle = "1900-1993 with 100 year forecast",
x = "Year", y = "") +
geom_hline(yintercept = 0, lty = 3, lwd = 1)
Taking a closer look in the close-up plot below, we can see more clearly how the damping, the Box-Cox transformation, and the bias adjustment each alter the long-run trajectory of the forecasts.
# take a closer look
autoplot(eggs) +
autolayer(eggsfc1, series = "Holt", PI = FALSE, lty = 1, lwd = 1.5) +
autolayer(eggsfc2, series = "BoxCox", PI = FALSE, lty = 2, lwd = 1.5) +
autolayer(eggsfc2a, series = "BC-bias", PI = FALSE, lty = 3, lwd = 1.5) +
autolayer(eggsfc3, series = "Damped", PI = FALSE, lty = 1, lwd = 1.5) +
autolayer(eggsfc4, series = "Damped-BC", PI = FALSE, lty = 2, lwd = 1.5) +
autolayer(eggsfc4a, series = "Damped-BC-bias", PI = FALSE, lty = 3, lwd = 1.5) +
labs(title = "U.S. price of one dozen eggs (constant dollars)",
subtitle = "1960-1993 with 30 year forecast",
x = "Year", y = "") +
geom_hline(yintercept = 0, lty = 3, lwd = 1) +
coord_cartesian(xlim = c(1960, 2023),
ylim = c(-50, 200))
Based on the RMSE metrics below, the linear model with Box-Cox transformation and bias-adjustment performs the best, with RMSE = 26.39.
c(Holt = accuracy(eggsfc1)[2],
BoxCox = accuracy(eggsfc2)[2],
"BC-bias" = accuracy(eggsfc2a)[2],
Damped = accuracy(eggsfc3)[2],
"Damped-BC" = accuracy(eggsfc4)[2],
"Damped-BC-bias" = accuracy(eggsfc4a)[2])
## Holt BoxCox BC-bias Damped Damped-BC
## 26.58219 26.39376 26.38689 26.54019 26.53321
## Damped-BC-bias
## 26.58589
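A brief aside on what the bias adjustment does: with a Box-Cox transformation, the back-transformed point forecasts are medians of the forecast distribution unless biasadj = TRUE, in which case they are adjusted to be means. A minimal sketch of ours to see the shift in the eggs forecasts:

# biasadj = FALSE gives median forecasts; biasadj = TRUE gives mean forecasts
head(cbind(Median = eggsfc2$mean, Mean = eggsfc2a$mean))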
For the Australian retail data, I chose the 7th data column of the retail dataset (the 8th column of the spreadsheet, counting the leading date column), which contains "Turnover - New South Wales - Hardware, building and garden supplies retailing". Let's plot the time series.
retaildata <- readxl::read_excel("../Week2/retail.xlsx", skip=1)
colID <- colnames(retaildata)[8]
myts <- ts(retaildata[ , colID], frequency=12, start=c(1982,4))
autoplot(myts) +
labs(title = "Turnover - NSW - Hardware, building & garden supplies retailing",
subtitle = "Monthly 1982/04 - 2013/12",
x = "Year", y = "")
From the time series plot, it appears that the amplitude of the seasonal pattern scales with the level of the time series. This suggests that multiplicative seasonality will be more effective in modeling the time series.
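As a quick sanity check on this choice (a sketch of ours, not required by the exercise), we can fit both seasonal variants and compare their corrected AICs; a lower AICc for the multiplicative model would support the visual impression. The names fit.add and fit.mult are ours.

# compare additive vs multiplicative seasonality by AICc (lower is better)
fit.add <- hw(myts, seasonal = "additive")
fit.mult <- hw(myts, seasonal = "multiplicative")
c(Additive = fit.add$model$aicc, Multiplicative = fit.mult$model$aicc)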
We use the hw function to fit Holt-Winters seasonal models (with multiplicative seasonality), both with and without damping. We plot the original data along with 24-month forecasts from both models.
retailfc1 <- hw(myts, seasonal = "multiplicative")
retailfc2 <- hw(myts, seasonal = "multiplicative", damped = TRUE)
autoplot(myts) +
autolayer(retailfc1, series = "HW", PI = FALSE) +
autolayer(retailfc2, series = "Damped", PI = FALSE) +
labs(title = "Turnover - NSW - Hardware, building & garden supplies retailing",
subtitle = "Time series with 2-year forecast",
x = "Year", y = "")
Below we display the forecasts in close-up to see the differences between the two models. Overall, the two models produce forecasts that are very close to each other, and both forecasts do a good job of replicating the historical seasonality of the time series.
# plot close-up
autoplot(myts) +
autolayer(retailfc1, series = "HW", PI = FALSE) +
autolayer(retailfc2, series = "Damped", PI = FALSE) +
labs(title = "Turnover - NSW - Hardware, building & garden supplies retailing",
subtitle = "Close-up: 2011-2013 with 2-year forecast ",
x = "Year", y = "") +
geom_vline(xintercept = 2014, lty = 3, lwd = 1) +
coord_cartesian(xlim = c(2011, 2016),
ylim = c(250, 450))
The training RMSEs of the two models are very close, although the damped model's RMSE (13.15) is slightly lower than that of the basic Holt-Winters model (13.17).
c(HW = accuracy(retailfc1)[2], Damped = accuracy(retailfc2)[2])
## HW Damped
## 13.17456 13.15306
#summary(retailfc1)
#summary(retailfc2)
Let’s compare some other metrics between the two models, including the residual standard error, AIC, corrected AIC, and BIC. Across all these metrics, the damped Holt-Winters model is stronger than the basic Holt-Winters model.
# use names that don't mask the hw() function, which we call again later
hw.metrics <- c(accuracy(retailfc1)[2], sqrt(retailfc1$model$sigma2),
retailfc1$model$aic, retailfc1$model$aicc, retailfc1$model$bic)
damped.metrics <- c(accuracy(retailfc2)[2], sqrt(retailfc2$model$sigma2),
retailfc2$model$aic, retailfc2$model$aicc, retailfc2$model$bic)
metrics <- round(cbind(HW = hw.metrics, Damped = damped.metrics), 3)
rownames(metrics) <- c("RMSE", "RSE", "AIC", "AICc", "BIC")
kable(metrics,
caption = "Comparison of performance metrics")
|  | HW | Damped |
|---|---|---|
| RMSE | 13.175 | 13.153 |
| RSE | 0.086 | 0.080 |
| AIC | 4326.553 | 4266.723 |
| AICc | 4328.239 | 4268.612 |
| BIC | 4393.580 | 4337.693 |
We use the checkresiduals function to check the residuals of both Holt-Winters models. Unfortunately, the residuals do not resemble white noise: they exhibit strong autocorrelation at the 1-, 6-, and 12-month lags, and the Ljung-Box test confirms that they are unlikely to be white noise.
checkresiduals(retailfc1)
##
## Ljung-Box test
##
## data: Residuals from Holt-Winters' multiplicative method
## Q* = 143.08, df = 8, p-value < 2.2e-16
##
## Model df: 16. Total lags used: 24
checkresiduals(retailfc2)
##
## Ljung-Box test
##
## data: Residuals from Damped Holt-Winters' multiplicative method
## Q* = 150.7, df = 7, p-value < 2.2e-16
##
## Model df: 17. Total lags used: 24
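The same Ljung-Box statistic can be reproduced directly with Box.test, using the lag and model degrees of freedom shown in the checkresiduals output above (a cross-check sketch of ours):

# Ljung-Box test on the damped model's residuals: 24 lags, 17 model df
Box.test(residuals(retailfc2), lag = 24, type = "Ljung-Box", fitdf = 17)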
We train the model on the data through the end of 2010, and then evaluate the forecast on the timeframe 2011-2013. We compare the training and test RMSE of the damped Holt-Winters model against the seasonal naive model.
# training set: thru 2010/12
myts.train <- window(myts, end = c(2010, 12))
# HW model: forecast 36 months through 2013/12 (end of time series)
hw.fc <- hw(myts.train,
seasonal = "multiplicative",
damped = TRUE,
h = 36)
hw.rmse <- accuracy(hw.fc, myts)[ , 2]
# seasonal naive model
snaive.fc <- snaive(myts.train, h=36)
snaive.rmse <- accuracy(snaive.fc, myts)[ , 2]
# rmse
cbind("Damped HW" = hw.rmse, "Seas. naive" = snaive.rmse) %>%
kable(caption = "RMSE comparison")
|  | Damped HW | Seas. naive |
|---|---|---|
| Training set | 13.14634 | 26.30758 |
| Test set | 23.10869 | 20.91121 |
Based on the RMSE metrics above, the damped Holt-Winters model performs much better than the seasonal naive model on the training set (RMSE of 13.1 vs. 26.3) but worse on the test set (23.1 vs. 20.9). Let's inspect the forecasts against the data in the plots below. At full scale it's hard to distinguish the forecasts from the two models, so we also show a close-up.
# plot
autoplot(myts) +
autolayer(hw.fc, series = "Damped-HW", PI = FALSE) +
autolayer(snaive.fc, series = "Seas-Naive", PI = FALSE) +
labs(title = "Turnover - NSW - Hardware, building & garden supplies retailing",
subtitle = "1982-2013 with 3-year forecast",
x = "Year", y = "") +
geom_vline(xintercept = 2011, lty = 3, lwd = 1)
# close-up plot
autoplot(myts) +
autolayer(hw.fc, series = "Damped-HW", PI = FALSE) +
autolayer(snaive.fc, series = "Seas-Naive", PI = FALSE) +
autolayer(fitted(hw.fc), series = "DampedHW") +
autolayer(fitted(snaive.fc), series = "SeasNaive") +
labs(title = "Turnover - NSW - Hardware, building & garden supplies retailing",
subtitle = "Close-up: 2008-2013 with 3-year forecast",
x = "Year", y = "") +
geom_vline(xintercept = 2011, lty = 3, lwd = 1) +
coord_cartesian(xlim = c(2008, 2013),
ylim = c(225, 425))
It is apparent in the close-up plot that the seasonal naive model performs poorly (and much worse than the damped Holt-Winters) on the training set (before 2011). However, on the test set (after 2011), we can see that the damped Holt-Winters model generally under-predicts actuals and performs worse than the seasonal naive model.
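We can quantify the under-prediction with a short sketch of ours (the name myts.test is hypothetical): the mean error of the damped Holt-Winters forecasts over the test window is positive when the forecasts sit below the actuals on average.

# mean test-set error; a positive value indicates under-prediction
myts.test <- window(myts, start = c(2011, 1))
mean(myts.test - hw.fc$mean)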
In this exercise, we try a different approach on the Australian retail data from Exercise 7.8. The approach outlined in the question involves the following steps: apply a Box-Cox transformation, decompose the transformed series with STL, forecast the seasonally adjusted series with ETS, and then reseasonalize and back-transform the forecasts.

We can use the stlf function to accomplish all of these steps by setting its arguments as follows:

- h = 36 (test data from 2011/01 through 2013/12)
- s.window = 13 for the seasonal estimation window
- t.window = default value for the trend estimation window
- lambda = "auto" for the Box-Cox transformation
- method = "ets"
- robust = TRUE or FALSE for robust loess fitting (we'll try both)

# training set thru 2010/12, test set thru 2013/12
# stl fit on training set
stl.fc <- stlf(myts.train,
h = 36,
s.window = 13,
lambda = "auto",
method = "ets",
robust = FALSE)
# model summary
stl.fc$model
## ETS(A,N,N)
##
## Call:
## ets(y = x, model = etsmodel, allow.multiplicative.trend = allow.multiplicative.trend)
##
## Smoothing parameters:
## alpha = 0.7218
##
## Initial states:
## l = 68.0462
##
## sigma: 11.1992
##
## AIC AICc BIC
## 3686.948 3687.019 3698.479
# try with robust fitting
stl2.fc <- stlf(myts.train,
h = 36,
s.window = 13,
lambda = "auto",
method = "ets",
robust = TRUE)
Now we compare accuracy measures for this model (both with and without robust fitting) to the models considered in Exercise 7.8. It appears that the seasonal naive model again performs the best on the test set. Of the two STL with ETS models, the version without robust fitting performs better on both the training and test sets.
# rmse
stl.rmse <- accuracy(stl.fc, myts)[ , 2]
stl2.rmse <- accuracy(stl2.fc, myts)[ , 2]
# comparison
cbind("Damped HW" = hw.rmse,
"Seas. naive" = snaive.rmse,
"STL/ETS" = stl.rmse,
"STL/ETS/robust" = stl2.rmse) %>%
kable(caption = "RMSE comparison")
|  | Damped HW | Seas. naive | STL/ETS | STL/ETS/robust |
|---|---|---|---|---|
| Training set | 13.14634 | 26.30758 | 10.76922 | 12.03861 |
| Test set | 23.10869 | 20.91121 | 23.97219 | 27.87913 |
We can visualize the forecasts below, comparing the STL-with-ETS models (with and without robust fitting) to the seasonal naive model. The seasonal naive model appears more responsive to changes in the observations, although it has a tendency to overshoot on the downside.
# close-up plot
autoplot(myts, lwd = 1) +
autolayer(snaive.fc, series = "Seas-Naive", PI = FALSE) +
autolayer(stl.fc, series = "STL/ETS", PI = FALSE) +
autolayer(stl2.fc, series = "STL/ETS/robust", PI = FALSE) +
# autolayer(fitted(snaive.fc), series = "SeasNaive") +
# autolayer(fitted(stl.fc), series = "STLETS") +
# autolayer(fitted(stl2.fc), series = "STLETSrob") +
labs(title = "Turnover - NSW - Hardware, building & garden supplies retailing",
subtitle = "Close-up: 2008-2013 with 3-year forecast",
x = "Year", y = "") +
geom_vline(xintercept = 2011, lty = 3, lwd = 1) +
coord_cartesian(xlim = c(2008, 2013),
ylim = c(225, 425))