library(kableExtra)
library(corrplot)
library(fpp2)
library(plotly)
library(gridExtra)
library(readxl)
Consider the pigs series — the number of pigs slaughtered in Victoria each month.
Monthly total number of pigs slaughtered in Victoria, Australia (Jan 1980 – Aug 1995).
#using library(fpp2)
#help("pigs")
head(pigs)
## Jan Feb Mar Apr May Jun
## 1980 76378 71947 33873 96428 105084 95741
summary(pigs)
## Min. 1st Qu. Median Mean 3rd Qu. Max.
## 33873 79080 91662 90640 101493 120184
#tsdisplay(pigs)
paste0("Frequency: ", frequency(pigs))
## [1] "Frequency: 12"
autoplot(pigs)
ggseasonplot(pigs)
ggsubseriesplot(pigs)
gglagplot(pigs)
Acf(pigs, lag.max=150)
While there is some evidence of cyclic behaviour, the series does not appear to have any clear seasonal or trend components.
Use the ses() function in R to find the optimal values of α and ℓ0 , and generate forecasts for the next four months.
Using the ses() (simple exponential smoothing) function to explore exponential smoothing on the pigs dataset:
ses_pigs_4 <- ses(pigs, h = 4)
summary(ses_pigs_4)
##
## Forecast method: Simple exponential smoothing
##
## Model Information:
## Simple exponential smoothing
##
## Call:
## ses(y = pigs, h = 4)
##
## Smoothing parameters:
## alpha = 0.2971
##
## Initial states:
## l = 77260.0561
##
## sigma: 10308.58
##
## AIC AICc BIC
## 4462.955 4463.086 4472.665
##
## Error measures:
## ME RMSE MAE MPE MAPE MASE ACF1
## Training set 385.8721 10253.6 7961.383 -0.922652 9.274016 0.7966249 0.01282239
##
## Forecasts:
## Point Forecast Lo 80 Hi 80 Lo 95 Hi 95
## Sep 1995 98816.41 85605.43 112027.4 78611.97 119020.8
## Oct 1995 98816.41 85034.52 112598.3 77738.83 119894.0
## Nov 1995 98816.41 84486.34 113146.5 76900.46 120732.4
## Dec 1995 98816.41 83958.37 113674.4 76092.99 121539.8
The ses() fit reports the optimal values directly: smoothing parameter \(\alpha\) = 0.2971 and initial level \(\ell_{0}\) = 77260.0561.
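As a sketch, both values can also be read directly off the fitted model object (the par vector that the underlying ets model stores):
# optimal smoothing parameter and initial level from the fitted model
ses_pigs_4$model$par["alpha"]
ses_pigs_4$model$par["l"]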
Compute a 95% prediction interval for the first forecast using \(\hat{y}\) ± 1.96s where s is the standard deviation of the residuals. Compare your interval with the interval produced by R.
ses_pigs <- ses(pigs)
autoplot(ses_pigs) +
autolayer(fitted(ses_pigs), series="Fitted")
s <- sd(ses_pigs$residuals)
ub <- round((ses_pigs$mean[1] + 1.96*s), 2)
lb <- round((ses_pigs$mean[1] - 1.96*s), 2)
Interval Bounds:
print(paste0('Lower bound CI : ', lb))
## [1] "Lower bound CI : 78679.97"
print(paste0('Upper bound CI : ', ub))
## [1] "Upper bound CI : 118952.84"
# R computes these bounds directly (cf. the Lo 95 / Hi 95 columns above):
# R_lb <- round(ses(pigs, h = 4, level = 95)$lower[1], 2)  # 78611.97
# R_ub <- round(ses(pigs, h = 4, level = 95)$upper[1], 2)  # 119020.8
The manually computed interval (78679.97, 118952.84) is slightly narrower than the interval produced by ses() (78611.97, 119020.8). The difference arises because R bases its interval on the model's estimated standard deviation (sigma = 10308.58) rather than on the sample standard deviation of the residuals.
Data set books contains the daily sales of paperback and hardcover books at the same store. The task is to forecast the next four days’ sales for paperback and hardcover books.
Daily sales of paperback and hardcover books at the same store.
#help("books")
data("books")
summary(books)
## Paperback Hardcover
## Min. :111.0 Min. :128.0
## 1st Qu.:167.2 1st Qu.:170.5
## Median :189.0 Median :200.5
## Mean :186.4 Mean :198.8
## 3rd Qu.:207.2 3rd Qu.:222.0
## Max. :247.0 Max. :283.0
head(books)
## Time Series:
## Start = 1
## End = 6
## Frequency = 1
## Paperback Hardcover
## 1 199 139
## 2 172 128
## 3 111 172
## 4 209 139
## 5 161 191
## 6 119 168
Plot the series and discuss the main features of the data
autoplot(books) + ggtitle("Daily Book Sales") + xlab("Day") + ylab("Book Sales")
books[,"Paperback"]
## Time Series:
## Start = 1
## End = 30
## Frequency = 1
## [1] 199 172 111 209 161 119 195 195 131 183 143 141 168 201 155 243 225 167 237
## [20] 202 186 176 232 195 190 182 222 217 188 247
autoplot(books[,"Paperback"]) + ggtitle("Plot of Paperback")
gglagplot(books[,"Paperback"])
Acf(books[,"Paperback"], lag.max=150)
Paperback sales do not appear to have cyclic or seasonal components, but they do seem to trend upward. The ACF spikes stay within the dashed blue significance bounds, suggesting little evidence of autocorrelation.
books[,"Hardcover"]
## Time Series:
## Start = 1
## End = 30
## Frequency = 1
## [1] 139 128 172 139 191 168 170 145 184 135 218 198 230 222 206 240 189 222 158
## [20] 178 217 261 238 240 214 200 201 283 220 259
autoplot(books[,"Hardcover"]) + ggtitle("Hardcover Plot")
gglagplot(books[,"Hardcover"])
Acf(books[,"Hardcover"], lag.max=150)
Hardcover sales do not show any seasonal or cyclic components either, but they also appear to trend upward.
Use the ses() function to forecast each series, and plot the forecasts.
ses.paperback <- ses(books[, "Paperback"])
ses.hardcover <- ses(books[, "Hardcover"])
autoplot(books[, "Paperback"], series = "Paperback") +
autolayer(ses.paperback, series = "Paperback") +
autolayer(books[, "Hardcover"], series = "Hardcover") +
autolayer(ses.hardcover, series = "Hardcover", PI = FALSE) +
ylab("Sales Amount") +
ggtitle("SES Books Sales")
Compute the RMSE values for the training data in each case
# RMSE for paperback
ses.paperback.rmse <- sqrt(mean(ses.paperback$residuals^2))
ses.paperback.rmse
## [1] 33.63769
# RMSE for hardcover
ses.hardcover.rmse <- sqrt(mean(ses.hardcover$residuals^2))
ses.hardcover.rmse
## [1] 31.93101
RMSE is lower for hardcover sales than for paperback sales; equivalently, the residual variance is greater for paperbacks than for hardcovers.
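As a cross-check, the same training RMSEs can be read from accuracy() rather than computed by hand (a sketch using the fits above):
accuracy(ses.paperback)["Training set", "RMSE"]
accuracy(ses.hardcover)["Training set", "RMSE"]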
Using the same books dataset as in the question above:
Now apply Holt’s linear method to the paperback and hardback series and compute four-day forecasts in each case.
holt.paperback <- holt(books[, "Paperback"], h = 4)
holt.hardcover <- holt(books[, "Hardcover"], h = 4)
autoplot(books[, "Paperback"], series = "Paperback") +
autolayer(holt.paperback, series = "Paperback") +
autolayer(books[, "Hardcover"], series = "Hardcover") +
autolayer(holt.hardcover, series = "Hardcover", PI = FALSE) +
ylab("Sales Amount") +
ggtitle("Holt: Books Sales")
Compare the RMSE measures of Holt’s method for the two series to those of simple exponential smoothing in the previous question. (Remember that Holt’s method is using one more parameter than SES.) Discuss the merits of the two forecasting methods for these data sets.
holt.paperback.rmse <- sqrt(mean(holt.paperback$residuals^2))
holt.paperback.rmse
## [1] 31.13692
holt.hardcover.rmse <- sqrt(mean(holt.hardcover$residuals^2))
holt.hardcover.rmse
## [1] 27.19358
Holt's RMSE values are lower than those from the ses() method. Because Holt's method adds a trend component (at the cost of one more parameter than SES), it is better suited to these trending series.
Compare the forecasts for the two series using both methods. Which do you think is best?
Holt's method appears to be the better choice for these trending series, given its lower training RMSE. Training error alone can flatter the more flexible model, though, so for a fairer comparison the methods should be evaluated on held-out data, for instance with time series cross-validation, as in the sketch below.
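A minimal sketch of that comparison using one-step time series cross-validation with tsCV(), shown here for the paperback series only:
e.ses <- tsCV(books[, "Paperback"], ses, h = 1)
e.holt <- tsCV(books[, "Paperback"], holt, h = 1)
# cross-validated one-step RMSE for each method
sqrt(mean(e.ses^2, na.rm = TRUE))
sqrt(mean(e.holt^2, na.rm = TRUE))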
Calculate a 95% prediction interval for the first forecast for each series, using the RMSE values and assuming normal errors. Compare your intervals with those produced using ses and holt.
forecast(ses.paperback, h=1)
## Point Forecast Lo 80 Hi 80 Lo 95 Hi 95
## 31 207.1097 162.4882 251.7311 138.867 275.3523
forecast(holt.paperback, h=1)
## Point Forecast Lo 80 Hi 80 Lo 95 Hi 95
## 31 209.4668 166.6035 252.3301 143.913 275.0205
The 95% prediction interval for SES on paperback: 138.87 to 275.35.
The 95% prediction interval for Holt on paperback: 143.91 to 275.02.
The 95% prediction interval is narrower for the Holt method than for SES.
forecast(ses.hardcover, h=1)
## Point Forecast Lo 80 Hi 80 Lo 95 Hi 95
## 31 239.5601 197.2026 281.9176 174.7799 304.3403
forecast(holt.hardcover, h=1)
## Point Forecast Lo 80 Hi 80 Lo 95 Hi 95
## 31 250.1739 212.739 287.6087 192.9222 307.4256
The 95% prediction interval for SES on hardcover: 174.78 to 304.34.
The 95% prediction interval for Holt on hardcover: 192.92 to 307.43.
Again, the 95% prediction interval is narrower for the Holt method than for SES.
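The manual intervals the question asks for, computed as point forecast ± 1.96 × training RMSE (a sketch assuming normal errors and reusing the RMSE objects from earlier):
# first-forecast intervals for the paperback series
ses.paperback$mean[1] + c(-1.96, 1.96) * ses.paperback.rmse
holt.paperback$mean[1] + c(-1.96, 1.96) * holt.paperback.rmse
# first-forecast intervals for the hardcover series
ses.hardcover$mean[1] + c(-1.96, 1.96) * ses.hardcover.rmse
holt.hardcover$mean[1] + c(-1.96, 1.96) * holt.hardcover.rmse
These come out slightly narrower than the intervals reported by ses() and holt(), because R bases its intervals on the model's estimated sigma rather than on the training RMSE.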
#SES method
print(paste0("SES method of interval: ", "(", ses(books[, "Hardcover"], level = 95, h = 4)$lower[1], ", ", ses(books[, "Hardcover"], level = 95, h = 4)$upper[1], ")"))
## [1] "SES method of interval: (174.779870851487, 304.340313152036)"
#Holt's method
print(paste0("Holt method of interval: ", "(", holt(books[, "Hardcover"], level = 95, h = 4)$lower[1], ", ", holt(books[, "Hardcover"], level = 95, h = 4)$upper[1], ")"))
## [1] "Holt method of interval: (192.922170771941, 307.42557365277)"
For this exercise use data set eggs, the price of a dozen eggs in the United States from 1900–1993. Experiment with the various options in the holt() function to see how much the forecasts change with damped trend, or with a Box-Cox transformation. Try to develop an intuition of what each argument is doing to the forecasts. [Hint: use h=100 when calling holt() so you can clearly see the differences between the various options when plotting the forecasts.]
data("eggs")
## Warning in data("eggs"): data set 'eggs' not found
head(eggs)
## Time Series:
## Start = 1900
## End = 1905
## Frequency = 1
## [1] 276.79 315.42 314.87 321.25 314.54 317.92
summary(eggs)
## Min. 1st Qu. Median Mean 3rd Qu. Max.
## 62.27 148.87 209.15 206.15 276.71 358.78
autoplot(eggs)
gglagplot(eggs)
Acf(eggs)
# Holt's linear method
holt.eggs <- holt(eggs, h = 100)
autoplot(holt.eggs) + autolayer(holt.eggs$fitted)
# Holt with damped trend
holt.damped.eggs <- holt(eggs, damped = TRUE, h = 100)
autoplot(holt.damped.eggs) + autolayer(holt.damped.eggs$fitted)
# Holt with Box-Cox transformation
holt.BoxCox.eggs <- holt(eggs, lambda = BoxCox.lambda(eggs), h = 100)
autoplot(holt.BoxCox.eggs) + autolayer(holt.BoxCox.eggs$fitted)
# Holt with damped trend and Box-Cox transformation
holt.BoxCox.damped.eggs <- holt(eggs, damped = TRUE, lambda = BoxCox.lambda(eggs), h = 100)
autoplot(holt.BoxCox.damped.eggs) + autolayer(holt.BoxCox.damped.eggs$fitted)
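To see the four options side by side, the forecasts can be arranged in one panel (a sketch using grid.arrange() from gridExtra, loaded at the top):
grid.arrange(
  autoplot(holt.eggs) + ggtitle("Holt"),
  autoplot(holt.damped.eggs) + ggtitle("Holt, damped"),
  autoplot(holt.BoxCox.eggs) + ggtitle("Holt, Box-Cox"),
  autoplot(holt.BoxCox.damped.eggs) + ggtitle("Holt, damped + Box-Cox"),
  nrow = 2
)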
paste0("RMSE Holt function = ", sqrt(mean(holt.eggs$residuals^2)))
## [1] "RMSE Holt function = 26.5821881569903"
paste0("RMSE Holt function with damped option = ", sqrt(mean(holt.damped.eggs$residuals^2)))
## [1] "RMSE Holt function with damped option = 26.5401852798325"
paste0("RMSE Holt function with Box-Cox transformation = ", sqrt(mean(holt.BoxCox.eggs$residuals^2)))
## [1] "RMSE Holt function with Box-Cox transformation = 1.03221710764543"
paste0("RMSE Holt function with damped option and Box-Cox transformation = ", sqrt(mean(holt.BoxCox.damped.eggs$residuals^2)))
## [1] "RMSE Holt function with damped option and Box-Cox transformation = 1.03918723939767"
Recall your retail time series data (from Exercise 3 in Section 2.10).
Why is multiplicative seasonality necessary for this series?
retaildata <- readxl::read_excel("retail.xlsx", skip = 1)
myts <- ts(retaildata[, 15], frequency = 12, start = c(1982, 4))
autoplot(myts) + ggtitle("Retail Data Time Series")
autoplot(decompose(myts, type = "multiplicative"))
Multiplicative seasonality is necessary because the seasonal fluctuations grow with the level of the series: the seasonal swings are far larger in recent years than at the start. An additive model assumes constant-size seasonal effects, which clearly does not hold here.
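As a sketch, placing the additive and multiplicative decompositions side by side makes the level-dependent seasonal swings easier to see (grid.arrange() comes from gridExtra, loaded at the top):
grid.arrange(
  autoplot(decompose(myts, type = "additive")) + ggtitle("Additive decomposition"),
  autoplot(decompose(myts, type = "multiplicative")) + ggtitle("Multiplicative decomposition"),
  ncol = 2
)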
Apply Holt-Winters’ multiplicative method to the data. Experiment with making the trend damped.
holt.fit <- hw(myts, seasonal = "multiplicative", damped = FALSE)
summary(holt.fit)
##
## Forecast method: Holt-Winters' multiplicative method
##
## Model Information:
## Holt-Winters' multiplicative method
##
## Call:
## hw(y = myts, seasonal = "multiplicative", damped = FALSE)
##
## Smoothing parameters:
## alpha = 0.4775
## beta = 1e-04
## gamma = 0.3913
##
## Initial states:
## l = 23.21
## b = 0.3815
## s = 0.8539 0.7454 0.7508 1.3732 1.0659 1.0346
## 1.0921 1.0709 1.067 1.0163 1.0322 0.8977
##
## sigma: 0.0621
##
## AIC AICc BIC
## 3271.564 3273.250 3338.591
##
## Error measures:
## ME RMSE MAE MPE MAPE MASE
## Training set -0.09292851 4.512297 3.198521 -0.7408265 4.915305 0.5801485
## ACF1
## Training set 0.08651428
##
## Forecasts:
## Point Forecast Lo 80 Hi 80 Lo 95 Hi 95
## Jan 2014 132.5906 122.04284 143.1383 116.45922 148.7219
## Feb 2014 101.4648 92.52133 110.4082 87.78695 115.1426
## Mar 2014 109.2479 98.76688 119.7289 93.21858 125.2772
## Apr 2014 118.6365 106.40320 130.8698 99.92728 137.3457
## May 2014 107.8318 95.99198 119.6715 89.72438 125.9391
## Jun 2014 105.3891 93.15637 117.6218 86.68076 124.0974
## Jul 2014 115.2966 101.23049 129.3628 93.78432 136.8090
## Aug 2014 119.9092 104.60524 135.2132 96.50380 143.3147
## Sep 2014 125.8771 109.13540 142.6189 100.27287 151.4814
## Oct 2014 138.2408 119.14392 157.3378 109.03463 167.4471
## Nov 2014 161.6416 138.51387 184.7693 126.27080 197.0123
## Dec 2014 243.6270 207.61046 279.6435 188.54447 298.7095
## Jan 2015 137.1147 114.77929 159.4500 102.95567 171.2736
## Feb 2015 104.9174 87.38951 122.4453 78.11079 131.7241
## Mar 2015 112.9553 93.62694 132.2837 83.39511 142.5155
## Apr 2015 122.6517 101.18051 144.1228 89.81438 155.4889
## May 2015 111.4714 91.52927 131.4135 80.97254 141.9703
## Jun 2015 108.9368 89.03994 128.8336 78.50720 139.3663
## Jul 2015 119.1674 96.96616 141.3686 85.21353 153.1213
## Aug 2015 123.9241 100.39379 147.4544 87.93759 159.9106
## Sep 2015 130.0806 104.92668 155.2345 91.61101 168.5502
## Oct 2015 142.8449 114.73416 170.9557 99.85323 185.8366
## Nov 2015 167.0107 133.58489 200.4366 115.89032 218.1311
## Dec 2015 251.6980 200.49641 302.8996 173.39193 330.0041
autoplot(holt.fit) + ggtitle("Holt-Winters Multiplicative Fit")
holt.fit.damped <- hw(myts, seasonal = "multiplicative", damped = TRUE)
summary(holt.fit.damped)
##
## Forecast method: Damped Holt-Winters' multiplicative method
##
## Model Information:
## Damped Holt-Winters' multiplicative method
##
## Call:
## hw(y = myts, seasonal = "multiplicative", damped = TRUE)
##
## Smoothing parameters:
## alpha = 0.508
## beta = 0.004
## gamma = 0.3974
## phi = 0.98
##
## Initial states:
## l = 22.937
## b = -0.0448
## s = 0.867 0.7254 0.7038 1.3907 1.0724 1.0083
## 1.0667 1.0868 1.0984 1.0545 0.9898 0.9363
##
## sigma: 0.0651
##
## AIC AICc BIC
## 3299.520 3301.410 3370.491
##
## Error measures:
## ME RMSE MAE MPE MAPE MASE ACF1
## Training set 0.4807647 4.561898 3.222506 0.4081738 4.91363 0.584499 0.06759754
##
## Forecasts:
## Point Forecast Lo 80 Hi 80 Lo 95 Hi 95
## Jan 2014 132.9199 121.83740 144.0025 115.97067 149.8692
## Feb 2014 101.1642 91.68679 110.6417 86.66973 115.6587
## Mar 2014 108.6758 97.47294 119.8786 91.54253 125.8090
## Apr 2014 118.2424 105.02197 131.4629 98.02347 138.4614
## May 2014 107.2524 94.38195 120.1227 87.56878 126.9359
## Jun 2014 104.7407 91.35856 118.1229 84.27447 125.2070
## Jul 2014 114.7360 99.22687 130.2450 91.01685 138.4551
## Aug 2014 119.3381 102.35889 136.3173 93.37064 145.3055
## Sep 2014 124.8242 106.20970 143.4386 96.35581 153.2925
## Oct 2014 137.0401 115.69649 158.3837 104.39786 169.6823
## Nov 2014 159.9474 134.00861 185.8861 120.27745 199.6173
## Dec 2014 239.8288 199.43737 280.2202 178.05545 301.6021
## Jan 2015 135.0787 109.94932 160.2080 96.64665 173.5107
## Feb 2015 102.7741 83.07855 122.4696 72.65237 132.8958
## Mar 2015 110.3703 88.61183 132.1289 77.09357 143.6471
## Apr 2015 120.0492 95.73342 144.3650 82.86142 157.2370
## May 2015 108.8584 86.23014 131.4866 74.25146 143.4653
## Jun 2015 106.2778 83.62922 128.9264 71.63977 140.9159
## Jul 2015 116.3862 90.98255 141.7899 77.53465 155.2378
## Aug 2015 121.0205 93.98905 148.0519 79.67947 162.3615
## Sep 2015 126.5490 97.64680 155.4513 82.34688 170.7512
## Oct 2015 138.8964 106.48491 171.3078 89.32732 188.4654
## Nov 2015 162.0713 123.45728 200.6852 103.01627 221.1262
## Dec 2015 242.9508 183.88942 302.0122 152.62422 333.2774
autoplot(holt.fit.damped) + ggtitle("Holt-Winters Multiplicative Fit with Damped Trend")
Comparing RMSE, AIC, and BIC, the two fits are very close, and if anything the non-damped model scores slightly better on all three (training RMSE 4.51 vs 4.56, AIC 3271.6 vs 3299.5), so damping offers no in-sample improvement here.
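For a direct side-by-side, the training RMSEs can be pulled out of accuracy() (a sketch using the two fits above):
c(HW = accuracy(holt.fit)["Training set", "RMSE"],
  HW.damped = accuracy(holt.fit.damped)["Training set", "RMSE"])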
Compare the RMSE of the one-step forecasts from the two methods. Which do you prefer?
error.retail <- tsCV(myts, hw, h = 1, seasonal = "multiplicative")
error.retail.damped <- tsCV(myts, hw, h = 1, seasonal = "multiplicative", damped = TRUE)
sqrt(mean(error.retail^2, na.rm = TRUE))
## [1] 4.805684
sqrt(mean(error.retail.damped^2, na.rm = TRUE))
## [1] 4.965358
The one-step cross-validated RMSE values are almost the same (4.81 vs 4.97, slightly favouring the non-damped model). The damped model may still be preferable for long-horizon forecasting, since it keeps the sales forecast from increasing without bound; see the sketch below.
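The damping effect is easiest to see at long horizons; a sketch (the 100-month horizon is an arbitrary choice for illustration):
# the damped trend levels off, the linear trend does not
grid.arrange(
  autoplot(hw(myts, seasonal = "multiplicative", damped = FALSE, h = 100)),
  autoplot(hw(myts, seasonal = "multiplicative", damped = TRUE, h = 100)),
  ncol = 1
)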
Check that the residuals from the best method look like white noise.
checkresiduals(holt.fit.damped)
##
## Ljung-Box test
##
## data: Residuals from Damped Holt-Winters' multiplicative method
## Q* = 75.911, df = 7, p-value = 9.37e-14
##
## Model df: 17. Total lags used: 24
The Ljung-Box test (p-value ≈ 9.4e-14) and the ACF plot show that the residuals are not white noise; significant autocorrelation remains.
Now find the test set RMSE, while training the model to the end of 2010. Can you beat the seasonal naïve approach from Exercise 8 in Section 3.7?
myts.train <- window(myts, end = c(2010, 12))
myts.test <- window(myts, start = 2011)
myts.train.damped <- hw(myts.train, h = 36, seasonal = "multiplicative", damped = TRUE)
autoplot(myts.train.damped)
accuracy(myts.train.damped, myts.test)
## ME RMSE MAE MPE MAPE MASE
## Training set 0.3426235 3.788172 2.760286 0.1965206 4.619317 0.5636346
## Test set 3.3249030 12.332173 9.224458 1.8191543 8.033174 1.8835814
## ACF1 Theil's U
## Training set 0.1406912 NA
## Test set 0.4340217 0.4901811
myts.train.nd <- hw(myts.train, h = 36, seasonal = "multiplicative")
autoplot(myts.train.nd)
accuracy(myts.train.nd, myts.test)
## ME RMSE MAE MPE MAPE MASE
## Training set -0.08201523 3.744601 2.729380 -0.5747145 4.609611 0.5573237
## Test set -1.26833947 10.870610 8.061112 -2.1884614 7.449230 1.6460328
## ACF1 Theil's U
## Training set 0.1427398 NA
## Test set 0.2723417 0.4721114
Neither model beats the seasonal naive approach from Exercise 8 of Section 3.7. Note that on the test set the non-damped Holt-Winters model is actually the more accurate of the two (RMSE 10.87 vs 12.33 for the damped model).
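For reference, the seasonal naive benchmark can be recomputed on the same train/test split (a sketch; sn.retail is a hypothetical name):
sn.retail <- snaive(myts.train, h = 36)
accuracy(sn.retail, myts.test)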
For the same retail data, try an STL decomposition applied to the Box-Cox transformed series, followed by ETS on the seasonally adjusted data. How does that compare with your best previous forecasts on the test set?
# Look for lambda
lambda <- BoxCox.lambda(myts.train)
# STL decomposition applied to the Box-Cox transformed series, followed by
# forecasts of the seasonally adjusted data (stlf handles both steps)
stlf_boxcox_myts.train <- stlf(myts.train, lambda = lambda)
# ETS fitted to the seasonally adjusted data
ets_myts.train <- ets(seasadj(decompose(myts.train, "multiplicative")))
Comparing the two on the test set:
# Box-Cox-STLF transformation
accuracy(stlf_boxcox_myts.train, myts.test)
## ME RMSE MAE MPE MAPE MASE
## Training set -0.029466 2.993223 2.212575 -0.107248 3.677062 0.4517952
## Test set -10.727124 16.213436 11.932970 -10.811284 11.844987 2.4366439
## ACF1 Theil's U
## Training set 0.08742471 NA
## Test set 0.25644648 0.7557907
# ETS on the seasonally adjusted data: forecast, then restore the seasonal
# component before scoring against the raw test set (a sketch; seas.idx
# repeats the last year of seasonal indices over the 36 forecast months)
ets.fc <- forecast(ets_myts.train, h = 36)
seas.idx <- rep(tail(decompose(myts.train, "multiplicative")$seasonal, 12), 3)
accuracy(ets.fc$mean * seas.idx, myts.test)
On the training data the STLF model with a Box-Cox transformation has the lowest RMSE so far (2.99), but its test-set RMSE (16.21) is worse than that of the Holt-Winters models above, so it does not improve on the best previous forecasts.
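A compact way to line up the test-set RMSEs of the candidates fitted above (a sketch reusing those model objects):
c(HW = accuracy(myts.train.nd, myts.test)["Test set", "RMSE"],
  HW.damped = accuracy(myts.train.damped, myts.test)["Test set", "RMSE"],
  STLF.BoxCox = accuracy(stlf_boxcox_myts.train, myts.test)["Test set", "RMSE"])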