Consider the pigs series — the number of pigs slaughtered in Victoria each month. Use the ses() function in R to find the optimal values of \(\alpha\) and \(\ell_0\), and generate forecasts for the next four months.
##
## Call:
## ses(y = pigs, h = 4)
##
## Smoothing parameters:
## alpha = 0.2971
##
## Initial states:
## l = 77260.0561
##
## sigma: 10308.58
##
## AIC AICc BIC
## 4462.955 4463.086 4472.665
## Point Forecast Lo 80 Hi 80 Lo 95 Hi 95
## Sep 1995 98816.41 85605.43 112027.4 78611.97 119020.8
## Oct 1995 98816.41 85034.52 112598.3 77738.83 119894.0
## Nov 1995 98816.41 84486.34 113146.5 76900.46 120732.4
## Dec 1995 98816.41 83958.37 113674.4 76092.99 121539.8
The optimal values are \(\alpha = 0.2971\) and \(\ell_0 = 77260.0561\).
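Next, compute a 95% prediction interval for the first forecast using \(\hat{y} \pm 1.96s\), where \(s\) is the standard deviation of the residuals, and compare it with the interval produced by R. A sketch of the computation behind the four values printed below, reusing fc from above:

s <- sd(fc$residuals)   # standard deviation of the residuals
fc$mean[1] - 1.96 * s   # manual lower bound
fc$mean[1] + 1.96 * s   # manual upper bound
fc$lower[1, "95%"]      # lower bound produced by R
fc$upper[1, "95%"]      # upper bound produced by R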
## [1] 78679.97
## [1] 118952.8
## [1] 78611.97
## [1] 119020.8
The two intervals are close but not identical: R's interval is slightly wider, because R uses its own estimate of the forecast standard deviation rather than the simple standard deviation of the residuals.
Data set books
contains the daily sales of paperback and hardcover books at the same store. The task is to forecast the next four days’ sales for paperback and hardcover books.
autoplot(books) +
ylab("Books") + xlab("Time") +
ggtitle("Daily Sales of Paperback and Hardcover Books")
Overall, hardcover books sell more than paperbacks, and both series show a positive trend. The peaks and troughs occur at irregular intervals, so there is no reason to conclude that seasonality is present.
paperback <- ses(books[,"Paperback"], h=4)
hardcover <- ses(books[, "Hardcover"], h=4)
autoplot(books) +
autolayer(paperback) +
autolayer(hardcover) +
ylab("Books)") + xlab("Time")
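The training accuracy shown below is not produced by the plotting code above; a sketch of the likely calls (paperback first, then hardcover), rounded to two decimals to match the output:

round(accuracy(paperback), 2)
round(accuracy(hardcover), 2)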
Paperback
## ME RMSE MAE MPE MAPE MASE ACF1
## Training set 7.18 33.64 27.84 0.47 15.58 0.7 -0.21
Hardcover
## ME RMSE MAE MPE MAPE MASE ACF1
## Training set 9.17 31.93 26.77 2.64 13.39 0.8 -0.14
We will continue with the daily sales of paperback and hardcover books in data set books, this time applying Holt's linear method and producing four-day forecasts for each series.
paperback2 <- holt(books[,"Paperback"], h=4)
hardcover2 <- holt(books[, "Hardcover"], h=4)
paperback2
## Point Forecast Lo 80 Hi 80 Lo 95 Hi 95
## 31 209.4668 166.6035 252.3301 143.9130 275.0205
## 32 210.7177 167.8544 253.5811 145.1640 276.2715
## 33 211.9687 169.1054 254.8320 146.4149 277.5225
## 34 213.2197 170.3564 256.0830 147.6659 278.7735
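Printing hardcover2 gives the corresponding forecasts:

hardcover2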
## Point Forecast Lo 80 Hi 80 Lo 95 Hi 95
## 31 250.1739 212.7390 287.6087 192.9222 307.4256
## 32 253.4765 216.0416 290.9113 196.2248 310.7282
## 33 256.7791 219.3442 294.2140 199.5274 314.0308
## 34 260.0817 222.6468 297.5166 202.8300 317.3334
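As before, the training accuracy below can presumably be reproduced with (paperback first, then hardcover):

round(accuracy(paperback2), 2)
round(accuracy(hardcover2), 2)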
Paperback
## ME RMSE MAE MPE MAPE MASE ACF1
## Training set -3.72 31.14 26.18 -5.51 15.58 0.66 -0.18
Hardcover
## ME RMSE MAE MPE MAPE MASE ACF1
## Training set -0.14 27.19 23.16 -2.11 12.16 0.69 -0.03
The RMSE scores are lower with Holt's method for both series (31.14 vs 33.64 for paperback, 27.19 vs 31.93 for hardcover), so Holt's method fits the training data better. This is expected, since both series have a clear trend that SES cannot capture.
Paperback
s_p <- round(accuracy(paperback2), 2)[2] # [2] picks out the RMSE
cat( "RMSE:", # manual 95% interval: point forecast +/- 1.96 * RMSE
paperback2$mean[1]-(1.96*s_p),
paperback2$mean[1]+(1.96*s_p))
## RMSE: 148.4324 270.5012
cat("\nSES:",
ses(books[, "Paperback"], level = 95, h = 4)$lower[1],
ses(books[, "Paperback"], level = 95, h = 4)$upper[1])
##
## SES: 138.867 275.3523
cat("\nHOLT:",
holt(books[, "Paperback"], level = 95, h = 4)$lower[1],
holt(books[, "Paperback"], level = 95, h = 4)$upper[1])
##
## HOLT: 143.913 275.0205
Hardcover
s_h <- round(accuracy(hardcover2), 2)[2] # RMSE of the hardcover fit (the original mistakenly reused the paperback RMSE here)
cat( "RMSE:", # manual 95% interval: point forecast +/- 1.96 * RMSE
hardcover2$mean[1]-(1.96*s_h),
hardcover2$mean[1]+(1.96*s_h))
## RMSE: 196.8815 303.4663
cat("\nSES:",
ses(books[, "Hardcover"], level = 95, h = 4)$lower[1],
ses(books[, "Hardcover"], level = 95, h = 4)$upper[1])
##
## SES: 174.7799 304.3403
cat("\nHOLT:",
holt(books[, "Hardcover"], level = 95, h = 4)$lower[1],
holt(books[, "Hardcover"], level = 95, h = 4)$upper[1])
##
## HOLT: 192.9222 307.4256
In both cases the manually computed intervals are closer to the ones produced by holt() than to the ses() intervals, which is expected since they are built from the Holt point forecasts and residuals.
For this exercise use data set eggs
, the price of a dozen eggs in the United States from 1900–1993. Experiment with the various options in the holt()
function to see how much the forecasts change with damped trend, or with a Box-Cox transformation. Try to develop an intuition of what each argument is doing to the forecasts.
[Hint: use h=100
when calling holt() so you can clearly see the differences between the various options when plotting the forecasts.]
Damped trend
e1 <- holt(eggs, h=100)
e2 <- holt(eggs, damped=TRUE, h=100)
e3 <- holt(eggs, lambda = "auto", h=100) #-- lambda= 0.3956
autoplot(eggs) +
autolayer(e1, series="Holt's method", PI=FALSE) +
autolayer(e2, series="Damped Holt's method", PI=FALSE) +
autolayer(e3, series="Box-Cox", PI=FALSE)+
ggtitle("Forecasts from Holt's method") + xlab("Year") +
ylab("Eggs") +
guides(colour=guide_legend(title="Forecast"))
Which model gives the best RMSE?
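The code for the accuracy tables below is not shown; presumably (Holt's method, damped Holt's method, Box-Cox, in that order):

accuracy(e1)
accuracy(e2)
accuracy(e3)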
Holt's method
## ME RMSE MAE MPE MAPE MASE
## Training set 0.04499087 26.58219 19.18491 -1.142201 9.653791 0.9463626
## ACF1
## Training set 0.01348202
Damped Holt's method
## ME RMSE MAE MPE MAPE MASE
## Training set -2.891496 26.54019 19.2795 -2.907633 10.01894 0.9510287
## ACF1
## Training set -0.003195358
Box-Cox
## ME RMSE MAE MPE MAPE MASE
## Training set 0.7736844 26.39376 18.96387 -1.072416 9.620095 0.9354593
## ACF1
## Training set 0.03887152
Based on these accuracy scores, the model with the Box-Cox transformation gives the best RMSE (26.39 vs 26.58 and 26.54), although the differences are small.
Recall your retail time series data (from Exercise 3 in Section 2.10).
retaildata <- readxl::read_excel("retail.xlsx", skip=1)
myts <- ts(retaildata[,"A3349399C"], frequency=12, start=c(1982,4)) #clothing
autoplot(myts)
The multiplicative method is preferred when the seasonal variation changes in proportion to the level of the series, which is what happens here.
fit1 <- hw(myts, seasonal="multiplicative")
fit2 <- hw(myts, damped = TRUE, seasonal="multiplicative")
autoplot(myts) +
autolayer(fit1, series="HW multiplicative forecasts", PI=FALSE) +
autolayer(fit2, series="HW multi damped",
PI=FALSE) +
xlab("Year") +
ylab("Clothing") +
ggtitle("Retail: Turnover") +
guides(colour=guide_legend(title="Forecast"))
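The training accuracy below is presumably produced by:

accuracy(fit1)
accuracy(fit2)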
HW multiplicative
## ME RMSE MAE MPE MAPE MASE
## Training set 0.1416191 12.83016 9.730449 -0.3289084 4.447693 0.540727
## ACF1
## Training set 0.09791563
HW multiplicative damped
## ME RMSE MAE MPE MAPE MASE ACF1
## Training set 1.091016 12.49991 9.503303 0.2091735 4.405131 0.5281044 0.05756364
The damped multiplicative method does slightly better on RMSE (12.50 vs 12.83), so we prefer it here.
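To check whether the residuals of the preferred model are white noise, we can use a Ljung-Box test; a sketch, with lag = 24 matching the degrees of freedom in the output:

Box.test(fit2$residuals, lag = 24, type = "Ljung-Box")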
##
## Box-Ljung test
##
## data: fit2$residuals
## X-squared = 91.149, df = 24, p-value = 9.296e-10
The p-value is far below 0.05, so the residuals are not white noise; the model is leaving some information in the residuals.
myts.train <- window(myts, end=c(2010,12))
myts.test <- window(myts, start=2011)
fc1 <- hw(myts.train, seasonal="multiplicative")
accuracy(fc1, myts.test)
## ME RMSE MAE MPE MAPE MASE
## Training set -0.4181467 12.62076 9.483714 -0.5602421 4.615519 0.5717011
## Test set -5.6475034 14.47224 10.911929 -1.8801266 3.191901 0.6577973
## ACF1 Theil's U
## Training set 0.06436742 NA
## Test set 0.54835802 0.2212264
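The second accuracy table below is the seasonal naive benchmark (the training MASE of exactly 1 is the giveaway, since MASE is scaled by the in-sample seasonal naive MAE); a sketch, with fc2 a hypothetical name:

fc2 <- snaive(myts.train)   # hypothetical object name; seasonal naive benchmark
accuracy(fc2, myts.test)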
## ME RMSE MAE MPE MAPE MASE ACF1
## Training set 9.007207 21.13832 16.58859 4.224080 7.494415 1.000000 0.5277855
## Test set 10.362500 21.50499 18.99583 2.771495 5.493632 1.145115 0.7420700
## Theil's U
## Training set NA
## Test set 0.3223094
Yes, the Holt-Winters multiplicative method does much better than the seasonal naive benchmark on the test set (RMSE 14.47 vs 21.50).
For the same retail data, try an STL decomposition applied to the Box-Cox transformed series, followed by ETS on the seasonally adjusted data. How does that compare with your best previous forecasts on the test set?
lambda <- BoxCox.lambda(myts.train)
stl_bc_myts.train <- stlf(myts.train, lambda = lambda)
autoplot(myts.train, series = "train") +
autolayer(stl_bc_myts.train, series = "STL")
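The test-set accuracy of the STL + Box-Cox forecasts is presumably computed with:

accuracy(stl_bc_myts.train, myts.test)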
## ME RMSE MAE MPE MAPE MASE
## Training set -0.0869377 10.96624 8.08034 -0.1557821 3.912306 0.4871023
## Test set -13.1422085 17.30462 13.62739 -3.9151171 4.049466 0.8214921
## ACF1 Theil's U
## Training set 0.05392764 NA
## Test set 0.46703067 0.2568401
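Next, fit an ETS model to the seasonally adjusted data. The Call line in the output below shows the exact call; ets_myts.train is the object name used in the plotting code further down:

ets_myts.train <- ets(seasadj(decompose(myts.train, "multiplicative")))
summary(ets_myts.train)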
## ETS(M,A,N)
##
## Call:
## ets(y = seasadj(decompose(myts.train, "multiplicative")))
##
## Smoothing parameters:
## alpha = 0.3018
## beta = 1e-04
##
## Initial states:
## l = 99.4918
## b = 0.7645
##
## sigma: 0.0581
##
## AIC AICc BIC
## 3733.387 3733.564 3752.605
##
## Training set error measures:
## ME RMSE MAE MPE MAPE MASE
## Training set -0.09743706 12.71705 9.377922 -0.3826849 4.445302 0.5612921
## ACF1
## Training set 0.1039097
autoplot(myts.train, series = "Train") +
autolayer(forecast(ets_myts.train, h = 24, PI = FALSE), series = "Forecast")
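And its test-set accuracy, presumably over the same 24-month horizon used in the plot:

accuracy(forecast(ets_myts.train, h = 24), myts.test)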
## ME RMSE MAE MPE MAPE MASE
## Training set -0.09743706 12.71705 9.377922 -0.3826849 4.445302 0.5612921
## Test set -6.43858962 67.52464 43.387035 -4.6908003 11.580734 2.5968227
## ACF1 Theil's U
## Training set 0.1039097 NA
## Test set 0.1414676 0.9691795
The STL + Box-Cox forecasts are clearly better, with a much lower test RMSE (17.30 vs 67.52). This is expected: the ETS model here was fitted to the seasonally adjusted series, so its forecasts carry no seasonal pattern and miss the strongly seasonal test data badly.