Consider the pigs
series — the number of pigs slaughtered in Victoria each month.
str(pigs)
## Time-Series [1:188] from 1980 to 1996: 76378 71947 33873 96428 105084 ...
Use the ses()
function in R to find the optimal values of \(\alpha\) and \(\ell_0\), and generate forecasts for the next four months.
# Using ses for pigs
pigs_ses <- ses(pigs, h=4)

# summary
summary(pigs_ses)
##
## Forecast method: Simple exponential smoothing
##
## Model Information:
## Simple exponential smoothing
##
## Call:
## ses(y = pigs, h = 4)
##
## Smoothing parameters:
## alpha = 0.2971
##
## Initial states:
## l = 77260.0561
##
## sigma: 10308.58
##
## AIC AICc BIC
## 4462.955 4463.086 4472.665
##
## Error measures:
## ME RMSE MAE MPE MAPE MASE ACF1
## Training set 385.8721 10253.6 7961.383 -0.922652 9.274016 0.7966249 0.01282239
##
## Forecasts:
## Point Forecast Lo 80 Hi 80 Lo 95 Hi 95
## Sep 1995 98816.41 85605.43 112027.4 78611.97 119020.8
## Oct 1995 98816.41 85034.52 112598.3 77738.83 119894.0
## Nov 1995 98816.41 84486.34 113146.5 76900.46 120732.4
## Dec 1995 98816.41 83958.37 113674.4 76092.99 121539.8
The summary above shows that the optimal values of \(\alpha\) and \(\ell_0\) are 0.2971 and 77260.0561, respectively. Using these values, forecasts are generated for the next four months.
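For reference, simple exponential smoothing in component form keeps a single smoothed level, and every forecast equals the final level estimate:
\[
\hat{y}_{t+h|t} = \ell_t, \qquad \ell_t = \alpha y_t + (1-\alpha)\ell_{t-1},
\]
so ses() chooses \(\alpha\) and \(\ell_0\) by minimising the sum of squared one-step-ahead errors.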
The next plot shows the forecasts from simple exponential smoothing, with the one-step-ahead fitted values plotted alongside the data over the whole period.
autoplot(pigs_ses) +
autolayer(fitted(pigs_ses), series="Fitted") +
ylab("Number of pigs slaughtered in Victoria")
Compute a 95% prediction interval for the first forecast using \(\hat{y} \pm 1.96 \sigma\) where \(\sigma\) is the standard deviation of the residuals. Compare your interval with the interval produced by R.
# 95% prediction interval for the first forecast
sd <- sd(residuals(pigs_ses))
ci95 <- c(lower = pigs_ses$mean[1] - 1.96*sd, upper = pigs_ses$mean[1] + 1.96*sd)
ci95
## lower upper
## 78679.97 118952.84
# By R
ci95_R <- c(pigs_ses$lower[1, "95%"], pigs_ses$upper[1, "95%"])
names(ci95_R) <- c("lower", "upper")
ci95_R
## lower upper
## 78611.97 119020.84
The 95% prediction interval calculated by R is a little wider than the one given by the formula. The difference arises because R bases its interval on the model's estimated standard deviation (sigma = 10308.58 in the summary above) rather than the sample standard deviation of the residuals.
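As a quick check, a minimal sketch (using the sigma value reported in the summary above, not part of the original workflow) essentially reproduces R's bounds:

# Reproduce R's interval from the model's sigma rather than sd(residuals)
sigma_model <- 10308.58
pigs_ses$mean[1] + c(lower = -1, upper = 1) * qnorm(0.975) * sigma_model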
Data set books
contains the daily sales of paperback and hardcover books at the same store. The task is to forecast the next four days’ sales for paperback and hardcover books.
head(books)
## Time Series:
## Start = 1
## End = 6
## Frequency = 1
## Paperback Hardcover
## 1 199 139
## 2 172 128
## 3 111 172
## 4 209 139
## 5 161 191
## 6 119 168
Plot the series and discuss the main features of the data.
# plot series
autoplot(books) +
labs(title = "Daily Sales of Paperback and Hardcover Books")
Both series show an upward trend, but no seasonality or cyclicity is visible in the plot. With only 30 days of data, it is difficult to say much about seasonality in any case. Another observation is that hardcover sales appear somewhat higher than paperback sales.
Use the ses()
function to forecast each series, and plot the forecasts.
# Using ses for books
pb_ses <- ses(books[, 'Paperback'], h=4)
hc_ses <- ses(books[, 'Hardcover'], h=4)
autoplot(books) +
autolayer(pb_ses, series="Paperback", PI=FALSE) +
autolayer(hc_ses, series="Hardcover", PI=FALSE) +
labs(title = "Daily Sales of Paperback and Hardcover Books (ses)")
The simple exponential smoothing forecasts above are flat (every horizon equals the final level estimate \(\ell_T\)) and do not capture the upward trend.
Compute the RMSE values for the training data in each case.
# RMSE for paperback
round(accuracy(pb_ses)[2], 2)
## [1] 33.64
# RMSE for hardcover
round(accuracy(hc_ses)[2], 2)
## [1] 31.93
The training RMSE for hardcover (31.93) is slightly better than that for paperback (33.64).
We will continue with the daily sales of paperback and hardcover books in data set books.
Apply Holt’s linear method to the paperback
and hardback
series and compute four-day forecasts in each case.
pb_holt <- holt(books[, 'Paperback'], h=4)
hc_holt <- holt(books[, 'Hardcover'], h=4)
autoplot(books) +
autolayer(pb_holt, series="Paperback", PI=FALSE) +
autolayer(hc_holt, series="Hardcover", PI=FALSE) +
labs(title = "Daily Sales of Paperback and Hardcover Books (holt)")
Holt’s linear forecasts look better, as the method is able to capture the upward trend of both series.
Compare the RMSE measures of Holt’s method for the two series to those of simple exponential smoothing in the previous question. (Remember that Holt’s method is using one more parameter than SES.) Discuss the merits of the two forecasting methods for these data sets.
# RMSE for paperback - holt
round(accuracy(pb_holt)[2], 2)
## [1] 31.14
# RMSE for hardcover - holt
round(accuracy(hc_holt)[2], 2)
## [1] 27.19
The RMSEs for both paperback and hardcover sales improve under Holt’s linear method compared with simple exponential smoothing. This is expected: Holt extended simple exponential smoothing with a trend component, and the plot above shows the forecasts following the upward trend.
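For reference, Holt’s linear method adds a trend equation to the SES recursion, which is why it can follow the slope; the extra smoothing parameter \(\beta^*\) is the additional parameter the exercise alludes to:
\[
\hat{y}_{t+h|t} = \ell_t + h b_t, \qquad
\ell_t = \alpha y_t + (1-\alpha)(\ell_{t-1} + b_{t-1}), \qquad
b_t = \beta^*(\ell_t - \ell_{t-1}) + (1-\beta^*) b_{t-1}.
\]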
Compare the forecasts for the two series using both methods. Which do you think is best?
# ses and holt comparison for paperback
s1 <- autoplot(pb_ses) +
  ylab("paperback book sales") +
  labs(title = "ses forecast - paperback")
h1 <- autoplot(pb_holt) +
  ylab("paperback book sales") +
  labs(title = "holt forecast - paperback")
# ses and holt comparison for hardcover
s2 <- autoplot(hc_ses) +
  ylab("hardcover book sales") +
  labs(title = "ses forecast - hardcover")
h2 <- autoplot(hc_holt) +
  ylab("hardcover book sales") +
  labs(title = "holt forecast - hardcover")
gridExtra::grid.arrange(s1, h1, s2, h2, nrow=2, ncol=2)
Comparing the forecasts for the two series under both methods, Holt’s method is the better choice: it extends simple exponential smoothing to allow forecasting of data with a trend, and its RMSE is smaller for both series.
Calculate a 95% prediction interval for the first forecast for each series, using the RMSE values and assuming normal errors. Compare your intervals with those produced using ses
and holt
.
# 95% prediction interval for the first forecast
df <- data.frame(
  Pred_Int = c("Paperback-SES", "Paperback-Holt", "Hardcover-SES", "Hardcover-Holt"),
  lower = c(pb_ses$mean[1] - 1.96*accuracy(pb_ses)[2],
            pb_holt$mean[1] - 1.96*accuracy(pb_holt)[2],
            hc_ses$mean[1] - 1.96*accuracy(hc_ses)[2],
            hc_holt$mean[1] - 1.96*accuracy(hc_holt)[2]),
  upper = c(pb_ses$mean[1] + 1.96*accuracy(pb_ses)[2],
            pb_holt$mean[1] + 1.96*accuracy(pb_holt)[2],
            hc_ses$mean[1] + 1.96*accuracy(hc_ses)[2],
            hc_holt$mean[1] + 1.96*accuracy(hc_holt)[2])
)
df
## Pred_Int lower upper
## 1 Paperback-SES 141.1798 273.0395
## 2 Paperback-Holt 148.4384 270.4951
## 3 Hardcover-SES 176.9753 302.1449
## 4 Hardcover-Holt 196.8745 303.4733
df2 <- data.frame(
  Pred_Int = c("Paperback-SES", "Paperback-Holt", "Hardcover-SES", "Hardcover-Holt"),
  lower = c(pb_ses$lower[1, "95%"],
            pb_holt$lower[1, "95%"],
            hc_ses$lower[1, "95%"],
            hc_holt$lower[1, "95%"]),
  upper = c(pb_ses$upper[1, "95%"],
            pb_holt$upper[1, "95%"],
            hc_ses$upper[1, "95%"],
            hc_holt$upper[1, "95%"])
)
df2
## Pred_Int lower upper
## 1 Paperback-SES 138.8670 275.3523
## 2 Paperback-Holt 143.9130 275.0205
## 3 Hardcover-SES 174.7799 304.3403
## 4 Hardcover-Holt 192.9222 307.4256
From the intervals above, those calculated using the RMSE are slightly narrower than those produced by R for both the ses and holt methods, for the same reason as with the pigs series: R bases its intervals on the model's estimated sigma rather than on the training RMSE.
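To see this directly, here is a minimal sketch, assuming the fitted model object exposes its residual variance as sigma2 (as ets model objects in the forecast package do):

# Training RMSE vs the model sigma used by ses()/holt() for intervals
c(RMSE = accuracy(pb_ses)[2], sigma = sqrt(pb_ses$model$sigma2))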
For this exercise use data set eggs
, the price of a dozen eggs in the United States from 1900–1993. Experiment with the various options in the holt()
function to see how much the forecasts change with the damped trend, or with a Box-Cox transformation. Try to develop an intuition of what each argument is doing to the forecasts.
[Hint: use h = 100
when calling holt()
so you can clearly see the differences between the various options when plotting the forecasts.]
Which model gives the best RMSE?
head(eggs)
## Time Series:
## Start = 1900
## End = 1905
## Frequency = 1
## [1] 276.79 315.42 314.87 321.25 314.54 317.92
autoplot(eggs) +
labs(title = "Price of a dozen eggs in the United States from 1900–1993")
The time series shows a downward trend. Its frequency is 1, indicating annual observations. We will produce forecasts using the methods below:
h <- 100

# Holt's
eggs_holt <- holt(eggs, h=h)
# Holt's with damped trend
eggs_damped <- holt(eggs, h=h, damped = TRUE)
# Box-Cox transformation
eggs_boxcox <- holt(eggs, h=h, lambda = "auto")
# Box-Cox with damped trend
eggs_boxcox_d <- holt(eggs, h=h, lambda = "auto", damped = TRUE)
# exponential trend
eggs_exp <- holt(eggs, h=h, exponential = TRUE)
# exponential with damped trend
eggs_exp_d <- holt(eggs, h=h, exponential = TRUE, damped = TRUE)
# Forecast from Holt's
autoplot(eggs_holt)
# Forecast from damped Holt's
autoplot(eggs_damped)
# forecast from boxcox transformation
autoplot(eggs_boxcox) +
labs(title = "Forecast using BoxCox Transformation")
# forecast from damped boxcox transformation
autoplot(eggs_boxcox_d) +
labs(title = "Forecast using Damped BoxCox Transformation")
# Forecast using exponential trend
autoplot(eggs_exp)
# Forecast using exponential trend and damped
autoplot(eggs_exp_d)
rmse_eggdf <- data.frame(
  Method = c("Holt's", "Damped Holt's", "BoxCox", "Damped BoxCox", "Exponential", "Damped Exponential"),
  RMSE = c(accuracy(eggs_holt)[2],
           accuracy(eggs_damped)[2],
           accuracy(eggs_boxcox)[2],
           accuracy(eggs_boxcox_d)[2],
           accuracy(eggs_exp)[2],
           accuracy(eggs_exp_d)[2])
)
rmse_eggdf
## Method RMSE
## 1 Holt's 26.58219
## 2 Damped Holt's 26.54019
## 3 BoxCox 26.39376
## 4 Damped BoxCox 26.53321
## 5 Exponential 26.49795
## 6 Damped Exponential 26.59113
Comparing the RMSEs of all six methods, the Box-Cox transformed Holt model has the lowest (26.39376). The six plots above also show that the Box-Cox forecast captures the declining trend and is the most plausible of the group.
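For reference, the transformation applied when lambda = "auto" is the Box-Cox transform
\[
w_t =
\begin{cases}
\log(y_t) & \text{if } \lambda = 0,\\
(y_t^{\lambda} - 1)/\lambda & \text{otherwise},
\end{cases}
\]
and the automatically chosen \(\lambda\) can be inspected with BoxCox.lambda(), the helper the forecast package relies on for lambda = "auto":

# Inspect the automatically selected Box-Cox parameter for the eggs series
BoxCox.lambda(eggs)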
Recall your retail time series data (from Exercise 3 in Section 2.10).
retaildata <- readxl::read_excel("retail.xlsx", skip=1)
myts <- ts(retaildata[,"A3349627V"], frequency=12, start=c(1982,4))
Why is multiplicative seasonality necessary for this series?
autoplot(myts) +
labs(title="Retail Sales")
The multiplicative method is preferred when the seasonal variations change proportionally to the level of the series. The plot above shows that the variability of the series increases over the years, so multiplicative seasonality is necessary here.
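For reference, Holt-Winters' multiplicative method divides by the seasonal component rather than subtracting it, so the seasonal swings scale with the level of the series:
\[
\hat{y}_{t+h|t} = (\ell_t + h b_t)\, s_{t+h-m(k+1)},
\]
\[
\ell_t = \alpha \frac{y_t}{s_{t-m}} + (1-\alpha)(\ell_{t-1} + b_{t-1}), \qquad
b_t = \beta^*(\ell_t - \ell_{t-1}) + (1-\beta^*) b_{t-1}, \qquad
s_t = \gamma \frac{y_t}{\ell_{t-1} + b_{t-1}} + (1-\gamma) s_{t-m},
\]
where \(m = 12\) for monthly data and \(k\) is the integer part of \((h-1)/m\).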
Apply Holt-Winters’ multiplicative method to the data. Experiment with making the trend damped.
# multiplicative
myts_hw <- hw(myts, seasonal = "multiplicative")
summary(myts_hw)
##
## Forecast method: Holt-Winters' multiplicative method
##
## Model Information:
## Holt-Winters' multiplicative method
##
## Call:
## hw(y = myts, seasonal = "multiplicative")
##
## Smoothing parameters:
## alpha = 0.5253
## beta = 0.0038
## gamma = 0.0743
##
## Initial states:
## l = 44.3692
## b = 0.1934
## s = 0.9684 0.8895 0.9594 1.5454 1.1037 1.0445
## 0.9269 0.923 0.9101 0.8944 0.9305 0.9043
##
## sigma: 0.0542
##
## AIC AICc BIC
## 3620.609 3622.295 3687.637
##
## Error measures:
## ME RMSE MAE MPE MAPE MASE
## Training set 0.3076124 5.916807 4.201477 0.07243871 3.779909 0.4485563
## ACF1
## Training set -0.03122619
##
## Forecasts:
## Point Forecast Lo 80 Hi 80 Lo 95 Hi 95
## Jan 2014 264.8837 246.4755 283.2918 236.7308 293.0365
## Feb 2014 233.7293 215.3569 252.1016 205.6312 261.8274
## Mar 2014 259.0735 236.5752 281.5718 224.6653 293.4817
## Apr 2014 251.3886 227.6492 275.1279 215.0823 287.6948
## May 2014 245.7384 220.7899 270.6869 207.5830 283.8939
## Jun 2014 238.1008 212.3348 263.8669 198.6950 277.5067
## Jul 2014 240.9197 213.3164 268.5229 198.7042 283.1352
## Aug 2014 246.5606 216.8117 276.3095 201.0636 292.0576
## Sep 2014 251.6342 219.8029 283.4655 202.9524 300.3160
## Oct 2014 268.4657 232.9920 303.9393 214.2134 322.7179
## Nov 2014 286.0274 246.6729 325.3820 225.8398 346.2151
## Dec 2014 409.4041 350.9061 467.9020 319.9392 498.8689
## Jan 2015 273.2441 232.4002 314.0881 210.7787 335.7096
## Feb 2015 241.0872 203.8497 278.3247 184.1374 298.0371
## Mar 2015 267.2081 224.6357 309.7805 202.0992 332.3170
## Apr 2015 259.2615 216.7199 301.8030 194.1997 324.3232
## May 2015 253.4145 210.6486 296.1804 188.0097 318.8193
## Jun 2015 245.5192 202.9598 288.0785 180.4303 310.6081
## Jul 2015 248.4065 204.2272 292.5859 180.8401 315.9730
## Aug 2015 254.2031 207.8659 300.5404 183.3364 325.0699
## Sep 2015 259.4141 210.9944 307.8338 185.3626 333.4656
## Oct 2015 276.7448 223.8995 329.5900 195.9250 357.5646
## Nov 2015 294.8257 237.2766 352.3748 206.8119 382.8394
## Dec 2015 421.9655 337.8314 506.0995 293.2935 550.6374
# multiplicative damped
myts_hwd <- hw(myts, seasonal = "multiplicative", damped = T)
summary(myts_hwd)
##
## Forecast method: Damped Holt-Winters' multiplicative method
##
## Model Information:
## Damped Holt-Winters' multiplicative method
##
## Call:
## hw(y = myts, seasonal = "multiplicative", damped = T)
##
## Smoothing parameters:
## alpha = 0.5562
## beta = 0.0185
## gamma = 0.0079
## phi = 0.98
##
## Initial states:
## l = 43.9263
## b = 0.0809
## s = 0.9818 0.8807 1.0092 1.5184 1.061 0.9992
## 0.9525 0.9305 0.9055 0.8993 0.9149 0.947
##
## sigma: 0.0526
##
## AIC AICc BIC
## 3596.970 3598.860 3667.941
##
## Error measures:
## ME RMSE MAE MPE MAPE MASE
## Training set 0.3511596 5.628031 4.018041 0.1776603 3.685373 0.4289724
## ACF1
## Training set -0.05517684
##
## Forecasts:
## Point Forecast Lo 80 Hi 80 Lo 95 Hi 95
## Jan 2014 264.5113 246.6953 282.3274 237.2640 291.7586
## Feb 2014 231.7667 213.7561 249.7774 204.2218 259.3116
## Mar 2014 256.6838 234.2502 279.1173 222.3746 290.9929
## Apr 2014 248.5371 224.5225 272.5517 211.8100 285.2642
## May 2014 240.6016 215.2147 265.9886 201.7756 279.4276
## Jun 2014 234.6576 207.8710 261.4442 193.6910 275.6242
## Jul 2014 236.8817 207.8416 265.9217 192.4687 281.2946
## Aug 2014 243.1918 211.3633 275.0203 194.5143 291.8693
## Sep 2014 248.9366 214.3251 283.5481 196.0029 301.8703
## Oct 2014 262.3951 223.7986 300.9916 203.3669 321.4233
## Nov 2014 277.6314 234.5812 320.6817 211.7918 343.4711
## Dec 2014 395.9630 331.4370 460.4890 297.2790 494.6469
## Jan 2015 264.7371 219.4790 309.9952 195.5208 333.9535
## Feb 2015 231.9607 190.5023 273.4191 168.5556 295.3658
## Mar 2015 256.8944 208.9939 304.7948 183.6370 330.1518
## Apr 2015 248.7370 200.4449 297.0290 174.8807 322.5933
## May 2015 240.7913 192.1990 289.3836 166.4758 315.1068
## Jun 2015 234.8390 185.6580 284.0199 159.6232 310.0548
## Jul 2015 237.0611 185.6149 288.5074 158.3808 315.7415
## Aug 2015 243.3724 188.7151 298.0298 159.7812 326.9637
## Sep 2015 249.1179 191.2914 306.9444 160.6798 337.5559
## Oct 2015 262.5825 199.6560 325.5089 166.3448 358.8201
## Nov 2015 277.8258 209.1632 346.4884 172.8154 382.8361
## Dec 2015 396.2347 295.3451 497.1243 241.9374 550.5321
autoplot(myts) +
autolayer(myts_hw, PI=F, series='Multiplicative') +
autolayer(myts_hwd, PI=F, series='Multiplicative with damped trend') +
theme(legend.position = "top") +
ylab("Retail Sales")
Looking at the forecasts, the damped multiplicative method projects a trend that increases more slowly than the plain multiplicative method.
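For reference, the damping parameter \(\phi\) (estimated as 0.98 above) replaces the \(h b_t\) term in the forecast equation with a converging sum:
\[
\hat{y}_{t+h|t} = \left(\ell_t + (\phi + \phi^2 + \cdots + \phi^h)\, b_t\right) s_{t+h-m(k+1)},
\]
so the forecast trend flattens out as \(h\) grows instead of increasing linearly.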
Compare the RMSE of the one-step forecasts from the two methods. Which do you prefer?
rmse_retdf <- data.frame(
  Method = c("Multiplicative", "Damped Multiplicative"),
  RMSE = c(accuracy(myts_hw)[2],
           accuracy(myts_hwd)[2])
)
rmse_retdf
## Method RMSE
## 1 Multiplicative 5.916807
## 2 Damped Multiplicative 5.628031
Comparing the RMSEs of the two methods, the damped multiplicative method performs better than the plain multiplicative one.
Check that the residuals from the best method look like white noise.
checkresiduals(myts_hw)
##
## Ljung-Box test
##
## data: Residuals from Holt-Winters' multiplicative method
## Q* = 86.468, df = 8, p-value = 2.442e-15
##
## Model df: 16. Total lags used: 24
For a white noise series, we expect each autocorrelation to be close to zero. If one or more large spikes fall outside the bounds, or if substantially more than 5% of spikes do, the series is probably not white noise. The Ljung-Box test result (p-value near zero) and the ACF plot both show that the residuals are not white noise.
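For reference, the Ljung-Box statistic reported above is
\[
Q^* = T(T+2) \sum_{k=1}^{\ell} \frac{r_k^2}{T-k},
\]
where \(T\) is the sample size, \(r_k\) is the lag-\(k\) residual autocorrelation, and \(\ell\) is the number of lags tested. It is compared to a \(\chi^2\) distribution with degrees of freedom equal to the lags used minus the model degrees of freedom, here \(24 - 16 = 8\).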
Now find the test set RMSE while training the model to the end of 2010. Can you beat the seasonal naive approach from Exercise 8 in Section 3.7?
myts_train <- window(myts, end=c(2010, 12))
myts_test <- window(myts, start = 2011)

myts_train_hw <- hw(myts_train, h=36, seasonal = "multiplicative")
myts_train_hwd <- hw(myts_train, h=36, seasonal = "multiplicative", damped = T)
myts_train_sn <- snaive(myts_train, h=36)
ap <- autoplot(myts_train) +
  autolayer(myts_train_hw, PI=FALSE, series='Multiplicative') +
  autolayer(myts_train_hwd, PI=FALSE, series='Multiplicative with damped trend') +
  autolayer(myts_train_sn, PI=FALSE, series='Seasonal Naive') +
  autolayer(myts_test, PI=FALSE, series='Test set') +
  theme(legend.position = "top") +
  ylab("Retail Sales")
## Warning: Ignoring unknown parameters: PI
ap
ap + labs(title = "Zoom in - 2011 to 2014") + xlim(c(2011,2014))
## Scale for 'x' is already present. Adding another scale for 'x', which will
## replace the existing scale.
# RMSE- Holt-Winters’ multiplicative method
accuracy(myts_train_hw, myts_test)[,2]
## Training set Test set
## 5.36735 11.45707
# RMSE- Holt-Winters’ multiplicative method with damped trend
accuracy(myts_train_hwd, myts_test)[,2]
## Training set Test set
## 5.390419 9.566560
# RMSE- Seasonal Naive
accuracy(myts_train_sn, myts_test)[,2]
## Training set Test set
## 12.27525 32.04685
Comparing the training and test set RMSEs, Holt-Winters’ multiplicative method with a damped trend fits this series best. Both Holt-Winters’ variants comfortably beat the seasonal naive approach.
For the same retail data, try an STL decomposition applied to the Box-Cox transformed series, followed by ETS on the seasonally adjusted data. How does that compare with your best previous forecasts on the test set?
The stlf() function is used to build the model: it takes the time series, applies a Box-Cox transformation, performs an STL decomposition, and fits an ETS model to the seasonally adjusted series.
# STL and ETS
# method="ets": use ETS to forecast the seasonally adjusted series
# etsmodel="ZZN": error and trend chosen automatically (Z), no seasonal term (N)
# lambda="auto": Box-Cox transformation parameter chosen automatically
myts_t_stlets <- stlf(myts_train,
                      h=36,
                      method="ets",
                      etsmodel="ZZN",
                      lambda = "auto",
                      allow.multiplicative.trend = TRUE)
ap_stl <- autoplot(myts_train) +
  autolayer(myts_t_stlets, PI=FALSE, series='STL and ETS') +
  autolayer(myts_train_hw, PI=FALSE, series='Multiplicative') +
  autolayer(myts_train_hwd, PI=FALSE, series='Multiplicative with damped trend') +
  autolayer(myts_test, PI=FALSE, series='Test set') +
  theme(legend.position = "top") +
  ylab("Retail Sales")
## Warning: Ignoring unknown parameters: PI
ap_stl
# zoom in
ap_stl + labs(title = "Zoom in - 2011 to 2014") + xlim(c(2011,2014))
## Scale for 'x' is already present. Adding another scale for 'x', which will
## replace the existing scale.
# RMSE- stl ets
accuracy(myts_t_stlets, myts_test)[,2]
## Training set Test set
## 4.955934 17.936879
# RMSE- Holt-Winters’ multiplicative method with damped trend
accuracy(myts_train_hwd, myts_test)[,2]
## Training set Test set
## 5.390419 9.566560
# RMSE- Holt-Winters’ multiplicative method
accuracy(myts_train_hw, myts_test)[,2]
## Training set Test set
## 5.36735 11.45707
From these results, Holt-Winters’ multiplicative method with a damped trend still fits the test set better than the STL decomposition applied to the Box-Cox transformed series, followed by ETS on the seasonally adjusted data: its test RMSE (9.57) is well below the STL+ETS test RMSE (17.94), even though STL+ETS achieves the lower training RMSE (4.96).