require(fpp2)
## Loading required package: fpp2
## Loading required package: ggplot2
## Loading required package: forecast
## Loading required package: fma
## Loading required package: expsmooth
The pigs series is the number of pigs slaughtered in Victoria each month.
data(pigs)
We use the ses() function in R to find the optimal values of \(\alpha\) and \(l_0\), and to generate forecasts for the next four months.
pc <- ses(pigs, h = 4)
summary(pc)
##
## Forecast method: Simple exponential smoothing
##
## Model Information:
## Simple exponential smoothing
##
## Call:
## ses(y = pigs, h = 4)
##
## Smoothing parameters:
## alpha = 0.2971
##
## Initial states:
## l = 77260.0561
##
## sigma: 10308.58
##
## AIC AICc BIC
## 4462.955 4463.086 4472.665
##
## Error measures:
## ME RMSE MAE MPE MAPE MASE
## Training set 385.8721 10253.6 7961.383 -0.922652 9.274016 0.7966249
## ACF1
## Training set 0.01282239
##
## Forecasts:
## Point Forecast Lo 80 Hi 80 Lo 95 Hi 95
## Sep 1995 98816.41 85605.43 112027.4 78611.97 119020.8
## Oct 1995 98816.41 85034.52 112598.3 77738.83 119894.0
## Nov 1995 98816.41 84486.34 113146.5 76900.46 120732.4
## Dec 1995 98816.41 83958.37 113674.4 76092.99 121539.8
We can see that ses() has estimated \(\alpha = 0.2971\) and \(l_0 = 77260.1\).
If we graph the forecasts, the prediction intervals appear wide enough to cover the large swings seen in the data preceding them. Moreover, the small \(\alpha\) means that the fitted values are much smoother than the original data.
autoplot(pc) +
autolayer(fitted(pc), series = "Fitted") +
ylab("Number of Pigs slaughtered")
s <- sd(pc$residuals)
s
## [1] 10273.69
Using just the sd() function on the residuals, we get s = 10273.7, a little larger than the RMSE of 10253.6 reported in pc.
p.low <- pc$mean - 1.96 * s
p.high <- pc$mean + 1.96 * s
p.low
## Sep Oct Nov Dec
## 1995 78679.97 78679.97 78679.97 78679.97
p.high
## Sep Oct Nov Dec
## 1995 118952.8 118952.8 118952.8 118952.8
pc
## Point Forecast Lo 80 Hi 80 Lo 95 Hi 95
## Sep 1995 98816.41 85605.43 112027.4 78611.97 119020.8
## Oct 1995 98816.41 85034.52 112598.3 77738.83 119894.0
## Nov 1995 98816.41 84486.34 113146.5 76900.46 120732.4
## Dec 1995 98816.41 83958.37 113674.4 76092.99 121539.8
We now complete the interval as \(\hat{y} \pm 1.96s\), using the point forecast as the centre. Our final interval runs from 78679.97 to 118952.8, which is slightly narrower than the 95% interval the ses function produced, at least for September.
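As a cross-check, here is a small sketch that pulls the 95% bounds for September straight out of the forecast object (its $lower and $upper components, whose columns are named "80%" and "95%") and lines them up against our manual interval.
# Sketch: manual interval (point forecast +/- 1.96 * sd of residuals) next to
# the 95% interval ses() reports for the first forecast.
manual <- c(low = pc$mean[1] - 1.96 * s, high = pc$mean[1] + 1.96 * s)
reported <- c(low = pc$lower[1, "95%"], high = pc$upper[1, "95%"])
rbind(manual, reported)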
The books data set contains the daily sales of paperback and hardcover books at the same store. The task is to forecast the next four days’ sales for paperback and hardcover books.
data(books)
autoplot(books) +
xlab("Day") + ylab("Sales")
There’s no discernible seasonality to the data, although there is a general upward trend to both Paperback and Hardcover as the days go on. Paperback suffers from huge variations in the beginning, while Hardcover has more periods of high sales before suffering a drop. This could be because hardcovers, especially new hardcovers, typically have aggressive placements in a bookstore and also benefit from price cuts.
We use the ses() function to forecast each series, and plot the forecasts.
pback <- ses(books[, 1])
hcover <- ses(books[, 2])
summary(pback)
##
## Forecast method: Simple exponential smoothing
##
## Model Information:
## Simple exponential smoothing
##
## Call:
## ses(y = books[, 1])
##
## Smoothing parameters:
## alpha = 0.1685
##
## Initial states:
## l = 170.8271
##
## sigma: 34.8183
##
## AIC AICc BIC
## 318.9747 319.8978 323.1783
##
## Error measures:
## ME RMSE MAE MPE MAPE MASE
## Training set 7.175981 33.63769 27.8431 0.4736071 15.57784 0.7021303
## ACF1
## Training set -0.2117522
##
## Forecasts:
## Point Forecast Lo 80 Hi 80 Lo 95 Hi 95
## 31 207.1097 162.4882 251.7311 138.8670 275.3523
## 32 207.1097 161.8589 252.3604 137.9046 276.3147
## 33 207.1097 161.2382 252.9811 136.9554 277.2639
## 34 207.1097 160.6259 253.5935 136.0188 278.2005
## 35 207.1097 160.0215 254.1979 135.0945 279.1249
## 36 207.1097 159.4247 254.7946 134.1818 280.0375
## 37 207.1097 158.8353 255.3840 133.2804 280.9389
## 38 207.1097 158.2531 255.9663 132.3899 281.8294
## 39 207.1097 157.6777 256.5417 131.5099 282.7094
## 40 207.1097 157.1089 257.1105 130.6400 283.5793
summary(hcover)
##
## Forecast method: Simple exponential smoothing
##
## Model Information:
## Simple exponential smoothing
##
## Call:
## ses(y = books[, 2])
##
## Smoothing parameters:
## alpha = 0.3283
##
## Initial states:
## l = 149.2861
##
## sigma: 33.0517
##
## AIC AICc BIC
## 315.8506 316.7737 320.0542
##
## Error measures:
## ME RMSE MAE MPE MAPE MASE
## Training set 9.166735 31.93101 26.77319 2.636189 13.39487 0.7987887
## ACF1
## Training set -0.1417763
##
## Forecasts:
## Point Forecast Lo 80 Hi 80 Lo 95 Hi 95
## 31 239.5601 197.2026 281.9176 174.7799 304.3403
## 32 239.5601 194.9788 284.1414 171.3788 307.7414
## 33 239.5601 192.8607 286.2595 168.1396 310.9806
## 34 239.5601 190.8347 288.2855 165.0410 314.0792
## 35 239.5601 188.8895 290.2306 162.0662 317.0540
## 36 239.5601 187.0164 292.1038 159.2014 319.9188
## 37 239.5601 185.2077 293.9124 156.4353 322.6848
## 38 239.5601 183.4574 295.6628 153.7584 325.3618
## 39 239.5601 181.7600 297.3602 151.1625 327.9577
## 40 239.5601 180.1111 299.0091 148.6406 330.4795
We can see the \(\alpha\) for Paperback is much smaller than the \(\alpha\) for Hardcover, so we should expect a smoother curve when plotted. Also, as we saw in the initial data, Hardcover has a much lower \(l_0\) than Paperback.
autoplot(pback) +
autolayer(fitted(pback), series = "Fitted Paperback") +
xlab("Days") + ylab("Sales") + ggtitle("Paperback Prediction")
autoplot(hcover) +
autolayer(fitted(hcover), series = "Fitted Hardcover") +
xlab("Days") + ylab("Sales") + ggtitle("Hardcover Prediction")
Indeed, the Paperback fit is much smoother, although in both cases the forecasts come with very wide 95% prediction intervals. Still, it looks as though Hardcover will continue its strong upward trend, while the model seems more uncertain about Paperback.
Although the ses function calculates the RMSE, we can double-check it with the formula \(RMSE = \sqrt{\frac{\sum(\text{residual}^2)}{n - 1}}\)
rmse <- function(l){
# number of residuals in the fitted model
len <- length(l$residuals)
# square root of the sum of squared residuals, divided by n - 1
ans <- sqrt((sum(l$residuals^2)/(len - 1)))
return(ans)
}
rmse(pback)
## [1] 34.21273
rmse(hcover)
## [1] 32.47688
Our calculation for RMSE Paperback is 34.21273, while ses reports 33.63769. For RMSE Hardcover we found 32.47688, while ses reports 31.93101. In both instances our calculation was slightly higher, but still relatively close.
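The gap is most likely just the denominator: the RMSE that ses() reports appears to use n rather than n - 1. A quick sketch to check that assumption:
# Sketch: dividing by n instead of n - 1 should reproduce the reported RMSE
# (assuming accuracy() takes the plain mean of the squared residuals).
sqrt(mean(pback$residuals^2))
sqrt(mean(hcover$residuals^2))
accuracy(pback)["Training set", "RMSE"]
accuracy(hcover)["Training set", "RMSE"]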
We now apply Holt’s linear method to the paperback and hardcover series and compute four-day forecasts in each case.
holt.pback <- holt(books[, 1], h = 4)
holt.hcover <- holt(books[, 2], h = 4)
summary(holt.pback)
##
## Forecast method: Holt's method
##
## Model Information:
## Holt's method
##
## Call:
## holt(y = books[, 1], h = 4)
##
## Smoothing parameters:
## alpha = 1e-04
## beta = 1e-04
##
## Initial states:
## l = 170.699
## b = 1.2621
##
## sigma: 33.4464
##
## AIC AICc BIC
## 318.3396 320.8396 325.3456
##
## Error measures:
## ME RMSE MAE MPE MAPE MASE
## Training set -3.717178 31.13692 26.18083 -5.508526 15.58354 0.6602122
## ACF1
## Training set -0.1750792
##
## Forecasts:
## Point Forecast Lo 80 Hi 80 Lo 95 Hi 95
## 31 209.4668 166.6035 252.3301 143.9130 275.0205
## 32 210.7177 167.8544 253.5811 145.1640 276.2715
## 33 211.9687 169.1054 254.8320 146.4149 277.5225
## 34 213.2197 170.3564 256.0830 147.6659 278.7735
summary(holt.hcover)
##
## Forecast method: Holt's method
##
## Model Information:
## Holt's method
##
## Call:
## holt(y = books[, 2], h = 4)
##
## Smoothing parameters:
## alpha = 1e-04
## beta = 1e-04
##
## Initial states:
## l = 147.7935
## b = 3.303
##
## sigma: 29.2106
##
## AIC AICc BIC
## 310.2148 312.7148 317.2208
##
## Error measures:
## ME RMSE MAE MPE MAPE MASE
## Training set -0.1357882 27.19358 23.15557 -2.114792 12.1626 0.6908555
## ACF1
## Training set -0.03245186
##
## Forecasts:
## Point Forecast Lo 80 Hi 80 Lo 95 Hi 95
## 31 250.1739 212.7390 287.6087 192.9222 307.4256
## 32 253.4765 216.0416 290.9113 196.2248 310.7282
## 33 256.7791 219.3442 294.2140 199.5274 314.0308
## 34 260.0817 222.6468 297.5166 202.8300 317.3334
As we see above, the RMSE for Holt’s method is 31.13692 for Paperback and 27.19358 for Hardcover. These are noticeably lower than the 33.63769 and 31.93101 reported by ses, respectively, which indicates that Holt’s method is a better fit than ses here.
autoplot(pback) +
autolayer(fitted(pback), series = "SES Paperback") +
xlab("Days") + ylab("Sales") + ggtitle("SES Paperback")
autoplot(holt.pback) +
autolayer(fitted(holt.pback), series = "Holt Paperback") +
xlab("Days") + ylab("Sales") + ggtitle("Holt's Method Paperback")
autoplot(hcover) +
autolayer(fitted(hcover), series = "SES Hardcover") +
xlab("Days") + ylab("Sales") + ggtitle("Hardcover Prediction")
autoplot(holt.hcover) +
autolayer(fitted(holt.hcover), series = "Holt Hardcover") +
xlab("Days") + ylab("Sales") + ggtitle("Holt's Method Hardcover")
Holt’s method smooths away whatever variation and seasonality is in the data and reduces it to a simple linear trend. This captures the overall direction well, though not the finer structure of the data itself. Even so, it gives a much tighter prediction interval, which would probably be far more useful to the person running the forecast. The ses interval, while perhaps more likely to contain the actual value, is so broad that it is almost uninformative by comparison.
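To see the two interval widths on the same axes, here is a sketch (not in the original write-up) that overlays the ses and Holt forecasts for Paperback on one plot.
# Sketch: overlay the SES and Holt forecasts, with prediction intervals, on the
# original Paperback series so the interval widths can be compared directly.
autoplot(books[, 1]) +
  autolayer(pback, series = "SES", PI = TRUE) +
  autolayer(holt.pback, series = "Holt", PI = TRUE) +
  xlab("Days") + ylab("Sales") + ggtitle("Paperback: SES vs Holt")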
Finally, we compare the 95% prediction intervals for the first forecast of each series using ses and holt. We’ll do what we did above, although instead of calculating our own standard deviation, we’ll use the RMSE values as outlined in the question.
hpci.low <- holt.pback$mean[1] - 1.96 * 31.13692
hpci.high <- holt.pback$mean[1] + 1.96 * 31.13692
hhci.low <- holt.hcover$mean[1] - 1.96 * 27.19358
hhci.high <- holt.hcover$mean[1] + 1.96 * 27.19358
spci.low <- pback$mean[1] - 1.96 * 33.63769
spci.high <- pback$mean[1] + 1.96 * 33.63769
shci.low <- hcover$mean[1] - 1.96 * 31.93101
shci.high <- hcover$mean[1] + 1.96 * 31.93101
hpci <- c(hpci.low, hpci.high)
hhci <- c(hhci.low, hhci.high)
spci <- c(spci.low, spci.high)
shci <- c(shci.low, shci.high)
con.int <- data.frame(hpci, spci, hhci, shci)
con.int
## hpci spci hhci shci
## 1 148.4384 141.1798 196.8745 176.9753
## 2 270.4951 273.0395 303.4733 302.1449
As seen from the forecast plots, the prediction interval for Holt’s method is tighter than for ses, although not by as much as we’d originally thought. The linearity of Holt’s forecasts, and the fact that the intervals slant upward with them, shapes how we read the plots. It should also be noted that our ses calls used the default horizon of 10 periods rather than 4, although that does not change the first-step interval compared here. Regardless, Holt’s method is an improvement over ses.
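To make "tighter" concrete, here is a quick sketch comparing the widths of the four intervals built above.
# Sketch: interval width = upper bound minus lower bound for each method/series.
c(holt.paperback = diff(hpci),
  ses.paperback  = diff(spci),
  holt.hardcover = diff(hhci),
  ses.hardcover  = diff(shci))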
Next is eggs, the price of a dozen eggs in the United States from 1900-1993. The exercise asks us to experiment with the various options in the holt() function to see how much the forecasts change with a damped trend or with a Box-Cox transformation, and to develop an intuition of what each argument is doing to the forecasts. [Hint: use h = 100 when calling holt() so you can clearly see the differences between the various options when plotting the forecasts.] Which model gives the best RMSE?
def <- holt(eggs, h = 100)
autoplot(eggs) +
autolayer(def, series = "Default Holt") +
ggtitle("Default Holt Eggs") + xlab("Years") + ylab("Egg Prices")
def.damp <- holt(eggs, h = 100, damped = TRUE)
autoplot(eggs) +
autolayer(def.damp, series = "Damped Holt") +
ggtitle("Default Damped Holt Eggs") + xlab("Years") + ylab("Egg Prices")
Well, between the two default options, the damped version performs slightly better. For one, it keeps the forecast trend line from dipping below zero, which the default does not guard against. For another, it has an RMSE of 26.54019, which is barely better than the default’s 26.58219.
def.lam <- holt(eggs, lambda = "auto", h = 100)
autoplot(eggs) +
autolayer(def.lam, series = "BoxCox Holt") +
ggtitle("Box Cox Transform Holt Eggs") + xlab("Years") + ylab("Egg Prices")
damp.lam <- holt(eggs, lambda = "auto", damped = TRUE, h = 100)
autoplot(eggs) +
autolayer(damp.lam, series = "BoxCox Holt Damped") +
ggtitle("Box Cox Transform Holt Damped Eggs") + xlab("Years") + ylab("Egg Prices")
Again, we see that the damped option does better when we add the Box-Cox transformation. With the plain Box-Cox model the forecast doesn’t go negative, but the trend line does seem to hit zero, meaning free eggs for everyone at some point. The damped Box-Cox model keeps the price from ever dropping below a certain level above zero, and does a much better job of keeping even the prediction interval from reaching below zero.
The RMSE for the plain Box-Cox is 26.39376, and 26.53321 for the damped Box-Cox. Though the plain Box-Cox has the lowest RMSE score, and thus the best fit, the damped Box-Cox has the better looking graph.
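The RMSE values quoted above come from the summary() output of each model; here is a small sketch that gathers them in one place via accuracy() (these are all training RMSEs, since nothing was held out).
# Sketch: training RMSE for each of the four Holt variants fitted to eggs,
# sorted from best to worst.
eggs.rmse <- c(holt          = accuracy(def)["Training set", "RMSE"],
               holt.damped   = accuracy(def.damp)["Training set", "RMSE"],
               boxcox        = accuracy(def.lam)["Training set", "RMSE"],
               boxcox.damped = accuracy(damp.lam)["Training set", "RMSE"])
sort(eggs.rmse)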
This is the Australian retail data, from which we need to pick one column to use as a time series.
retaildata <- readxl::read_excel("retail.xlsx", skip = 1)
myts <- ts(retaildata[, "A3349337W"], frequency = 12, start = c(1982, 4))
As the book notes, “Multiplicative decompositions are common with economic time series”, and since this is retail data, that fits. Also, given that we’re working with money, treating the seasonal effect as a percentage of the current level is more appropriate than treating it as a fixed amount added on top of the trend.
fit1 <- hw(myts, seasonal = "multiplicative")
fit2 <- hw(myts, damped = TRUE, seasonal = "multiplicative")
autoplot(myts) +
autolayer(fit1, series = "HW Multiplicative", PI = FALSE) +
autolayer(fit2, series = "HW Multiplicative Damped", PI = FALSE) +
xlab("Year") + ylab("Money") + guides(colour=guide_legend(title="Forecast"))
The graph may be slightly smushed, but there doesn’t seem to be much of a difference between the two, unlike before. The predictions are more or less the same.
The RMSE for the plain multiplicative model is 13.17456, and the RMSE for the damped multiplicative model is 13.15306. The damped score is lower, meaning it is the slightly better fit. And although the graph doesn’t show a drastic difference, as with the linear Holt example our preference would be for the damped Holt-Winters’ method.
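As a sanity check on the multiplicative choice (this extra fit is not part of the original write-up), we can also fit an additive Holt-Winters’ model and compare training RMSEs; if the seasonal swings really do scale with the level, the additive fit shouldn’t do any better.
# Sketch: additive Holt-Winters' fit for comparison against the two
# multiplicative models, using accuracy() to pull out the training RMSE.
fit.add <- hw(myts, seasonal = "additive")
c(additive       = accuracy(fit.add)["Training set", "RMSE"],
  multiplicative = accuracy(fit1)["Training set", "RMSE"],
  mult.damped    = accuracy(fit2)["Training set", "RMSE"])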
ggAcf(fit2$residuals)
Unfortunately the residuals don’t resemble white noise: twelve of the spikes lie outside the bounds, which is nearly half of them and far more than the roughly 5% we would expect from white noise.
ggAcf(fit1$residuals)
This holds true even for the regular multiplicative model. It suggests that, at least for the column we chose, the model is leaving autocorrelation in the residuals.
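Rather than eyeballing the ACF alone, we could lean on checkresiduals() from the forecast package, which plots the residuals and their ACF and runs a Ljung-Box test.
# Sketch: a formal residual check; a small Ljung-Box p-value would confirm the
# autocorrelation we are seeing in the ACF plot.
checkresiduals(fit2)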
For a refresher, let’s redo the seasonal naive approach.
myts.train <- window(myts, end = c(2010, 12))
myts.test <- window(myts, start = 2011)
snfc <- snaive(myts.train, h = 36)
accuracy(snfc, myts.test)
## ME RMSE MAE MPE MAPE MASE
## Training set 9.460661 26.30758 21.23363 4.655690 12.762886 1.0000000
## Test set 14.094444 20.91121 17.30556 3.802915 4.911945 0.8150068
## ACF1 Theil's U
## Training set 0.8070166 NA
## Test set 0.5265917 0.6200865
The seasonal naive test RMSE of 20.91121 looks high compared to the training RMSE of either multiplicative method fitted earlier, but those numbers aren’t directly comparable, so let’s refit on the training window and check.
hwdfc <- hw(myts.train, damped = TRUE, seasonal = "multiplicative")
accuracy(hwdfc, myts.test)
## ME RMSE MAE MPE MAPE MASE
## Training set 0.1340865 13.14634 9.755892 0.05575759 6.322517 0.4594546
## Test set 18.6615873 25.62828 20.779845 5.02059888 5.706599 0.9786288
## ACF1 Theil's U
## Training set 0.09808033 NA
## Test set 0.66318982 0.8263487
Well, that’s interesting. On the test set, the damped Holt-Winters’ model scored worse than the seasonal naive, with an RMSE of 25.62828 versus 20.91121!
autoplot(window(myts, start = 2011))
autoplot(window(myts, start = 2011)) +
autolayer(snfc, series = "Seasonal Naive")
autoplot(window(myts, start = 2011)) +
autolayer(hwdfc, series = "HW Damped")
Looking at the graphs, we see that the seasonal naive method tracks the actual data more closely than the damped model, which smooths the trend too much. So what if we used the regular multiplicative model?
hwfc <- hw(myts.train, seasonal = "multiplicative")
accuracy(hwfc, myts.test)
## ME RMSE MAE MPE MAPE MASE
## Training set -0.8547669 13.33918 9.850521 -0.6888481 6.402940 0.4639112
## Test set 1.8107543 15.54315 11.449206 0.1862269 3.243521 0.5392014
## ACF1 Theil's U
## Training set 0.07325471 NA
## Test set 0.64813006 0.5034011
So the regular Holt-Winters’ method does outperform the seasonal naive with an RMSE of 15.54315.
autoplot(window(myts, start = 2011)) +
autolayer(hwfc, series = "HW")
This still seems a little strange, as the plot looks about as smooth as the damped one, all things considered. It could be that the damped method overfit the training data.
fit3 <- mstl(myts.train, lambda = "auto")
fit3 <- forecast(fit3)
accuracy(fit3, myts.test)
## ME RMSE MAE MPE MAPE MASE
## Training set -3.322853 11.65468 8.582123 -1.6792818 5.387082 0.4041759
## Test set 4.691740 21.22116 16.596886 0.7924293 4.704230 0.7816319
## ACF1 Theil's U
## Training set 0.07328851 NA
## Test set 0.70821370 0.6896223
autoplot(window(myts, start = 2011)) +
autolayer(fit3, series = "Box-Cox ETS")
Our basic multiplicative Holt-Winters’ method does beat out the Box-Cox transformed STL decomposition with ETS forecasts. Interestingly, even the seasonal naive has a better test RMSE. It could come down to the choices made in the decomposition: mstl() does everything automatically, and if we instead spent more time fine-tuning things with stl(), some version could end up beating Holt-Winters’.
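As a sketch of what that fine-tuning might look like (the window values here are arbitrary examples, not tuned results), stlf() exposes the STL seasonal and trend windows directly.
# Sketch: hand-pick the STL windows instead of relying on mstl()'s automatic
# choices, then score the result on the held-out test set.
fit.stl <- stlf(myts.train, s.window = 13, t.window = 21,
                lambda = "auto", h = 36)
accuracy(fit.stl, myts.test)["Test set", "RMSE"]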