#Loading necessary libraries:

library(fpp)
library(forecast)
library(ggplot2)
library(ggfortify)
library(pander)
library(zoo)

Chapter 7: Exponential Smoothing

For this assignment I will complete the assigned problems from Chapter 7, Exponential Smoothing, of Forecasting: Principles and Practice, the open-access online textbook by Rob J Hyndman and George Athanasopoulos.

Problem 1. Data set books contains the daily sales of paperback and hardcover books at the same store. The task is to forecast the next four days’ sales for paperback and hardcover books (data set books).

First I will begin by looking at the data for books from the \(fpp\) package.

str(books)
##  Time-Series [1:30, 1:2] from 1 to 30: 199 172 111 209 161 119 195 195 131 183 ...
##  - attr(*, "dimnames")=List of 2
##   ..$ : NULL
##   ..$ : chr [1:2] "Paperback" "Hardcover"
summary(books)
##    Paperback       Hardcover    
##  Min.   :111.0   Min.   :128.0  
##  1st Qu.:167.2   1st Qu.:170.5  
##  Median :189.0   Median :200.5  
##  Mean   :186.4   Mean   :198.8  
##  3rd Qu.:207.2   3rd Qu.:222.0  
##  Max.   :247.0   Max.   :283.0
head(books)
## Time Series:
## Start = 1 
## End = 6 
## Frequency = 1 
##   Paperback Hardcover
## 1       199       139
## 2       172       128
## 3       111       172
## 4       209       139
## 5       161       191
## 6       119       168
tail(books)
## Time Series:
## Start = 25 
## End = 30 
## Frequency = 1 
##    Paperback Hardcover
## 25       190       214
## 26       182       200
## 27       222       201
## 28       217       283
## 29       188       220
## 30       247       259
frequency(books)
## [1] 1
#Create two time series, one for paperback books and one for hardcover books
paper <- ts(books[, 1])
hard <- ts(books[, 2])

(a) Plot the series and discuss the main features of the data.

Figure 1. Daily Sales for Paperback Books and Hardcover Books

autoplot(books) +xlab("Day") + ylab("Daily Sales ($)")

Figure 1 plots the books dataset, which holds 30 days of daily sales for both paperback and hardcover books. Both series appear to trend upwards over the month. Paperback sales look somewhat more volatile at the beginning of the month, although this is hard to judge from the graph. Seasonality may exist, but it is not easily discernible from this figure.

Because the books series has a frequency of 1, we cannot use the \(decompose()\) function on it directly. However, since sales are recorded daily, we can set frequency = 7 so that each week is treated as one seasonal cycle. With 30 days of data, this gives a little over 4 full cycles (weeks). We will plot the decomposition for both paperback and hardcover.

Figure 2. Decomposition of Additive Time Series: Daily Sales for Paperback Books (weekly cycles)

books2 <- ts(books, frequency=7)
plot(decompose(books2[ , 1]))

Figure 3. Decomposition of Additive Time Series: Daily Sales for Hardcover Books (weekly cycles)

books2 <- ts(books, frequency=7)
plot(decompose(books2[ , 2]))

Figures 2 and 3 rest on very little data, only about four weekly cycles. Within that limit, both series appear to contain an upward trend and possibly a weekly seasonal pattern. We would need more data to be confident about what trend and seasonality (if any) exist in the daily sales of books.

(b) Use simple exponential smoothing with the ses function (setting initial=“simple”) and explore different values of \(\alpha\) for the paperback series. Record the within-sample SSE for the one-step forecasts. Plot SSE against \(\alpha\) and find which value of \(\alpha\) works best. What is the effect of \(\alpha\) on the forecasts?

Figure 4. Plotting Alphas (\(\alpha\)) against SSE Using Simple Exponential Smoothing for Paperback Books

alphas <- numeric()
SSEs <- numeric()

for (i in seq(0,1,0.05)) {
  alphas <- c(alphas, i)
  s <- ses(paper, initial="simple", alpha=i, h=4)
  SSEs <- c(SSEs, s$model$SSE)
}

paperDF <- data.frame(alphas, SSEs)
pander(paperDF)
alphas SSEs
0 41270
0.05 39245
0.1 37785
0.15 36738
0.2 36329
0.25 36438
0.3 36931
0.35 37716
0.4 38738
0.45 39967
0.5 41384
0.55 42977
0.6 44743
0.65 46675
0.7 48774
0.75 51035
0.8 53456
0.85 56035
0.9 58769
0.95 61655
1 64690
plot(paperDF)

The SSE falls until \(\alpha\)=0.2 and then rises steadily, so among the values tried \(\alpha\)=0.2 is the best choice because it yields the lowest SSE. \(\alpha\) controls how quickly the weights on past observations decay: the larger \(\alpha\), the more each forecast is adjusted towards the most recent observation. With \(\alpha\)=0.2 the adjustment is small, so the series of one-step within-sample forecasts is smoother than it would be with a larger \(\alpha\).
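
To make the role of \(\alpha\) concrete, the recursion behind simple exponential smoothing can be written out by hand. The sketch below is a minimal base-R illustration, not the forecast package's implementation: the helper name ses_fitted is hypothetical, the level is assumed to start at the first observation (as the l = 199 reported for initial="simple" suggests), and the data are only the first six paperback values shown by head(books).

```r
# One-step SES fitted values.
# Assumption: the level is initialised at the first observation.
ses_fitted <- function(y, alpha) {
  level <- y[1]
  fitted <- numeric(length(y))
  for (t in seq_along(y)) {
    fitted[t] <- level                       # forecast for time t, made at t - 1
    level <- level + alpha * (y[t] - level)  # move the level towards y[t]
  }
  fitted
}

y <- c(199, 172, 111, 209, 161, 119)  # first six paperback values from head(books)

smooth_fit <- ses_fitted(y, alpha = 0.2)  # small alpha: slow, smooth adjustment
rough_fit  <- ses_fitted(y, alpha = 0.9)  # large alpha: closely tracks the last observation

print(round(smooth_fit, 1))
print(round(rough_fit, 1))
```

At the extremes, \(\alpha\)=0 leaves every fitted value at the initial level, while \(\alpha\)=1 makes each fitted value equal to the previous observation (a naive forecast).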

(c) Now let ses select the optimal value of \(\alpha\). Use this value to generate forecasts for the next four days. Compare your results with 2.

optAlphaPaper <- ses(paper, initial="simple", h=4)
summary(optAlphaPaper)
## 
## Forecast method: Simple exponential smoothing
## 
## Model Information:
## Simple exponential smoothing 
## 
## Call:
##  ses(y = paper, h = 4, initial = "simple") 
## 
##   Smoothing parameters:
##     alpha = 0.2125 
## 
##   Initial states:
##     l = 199 
## 
##   sigma:  34.7918
## Error measures:
##                    ME     RMSE      MAE       MPE     MAPE      MASE
## Training set 1.749509 34.79175 28.64424 -2.770157 16.56938 0.7223331
##                    ACF1
## Training set -0.1268119
## 
## Forecasts:
##    Point Forecast    Lo 80    Hi 80    Lo 95    Hi 95
## 31       210.1537 165.5663 254.7411 141.9631 278.3443
## 32       210.1537 164.5706 255.7368 140.4404 279.8671
## 33       210.1537 163.5962 256.7112 138.9501 281.3573
## 34       210.1537 162.6418 257.6657 137.4905 282.8170

When we let ses select the optimal value, it returns \(\alpha\)=0.2125, close to the 0.2 found by the grid search in (b). The 0.05 step was simply too coarse to locate the optimum more precisely. The gain from the finer value is small: the SSE falls from 36,329 at \(\alpha\)=0.2 to 36,313.98.
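
A grid with 0.05 steps can only locate \(\alpha\) to within the step size. One way to refine the search, sketched here under stated assumptions, is to hand the SSE to base R's optimize(), which searches a single parameter over an interval. The helper ses_sse is hypothetical, the level is assumed to start at the first observation, and the six-value series is only illustrative, so the \(\alpha\) it returns is not comparable to the 0.2125 fitted on all 30 days.

```r
# Within-sample SSE of one-step SES forecasts.
# Assumption: the level is initialised at the first observation.
ses_sse <- function(alpha, y) {
  level <- y[1]
  sse <- 0
  for (t in seq_along(y)) {
    err <- y[t] - level          # one-step forecast error
    sse <- sse + err^2
    level <- level + alpha * err # update the level
  }
  sse
}

y <- c(199, 172, 111, 209, 161, 119)  # first six paperback values from head(books)

# Coarse grid, built with sapply() instead of growing vectors in a loop
grid <- seq(0, 1, by = 0.05)
grid_sse <- sapply(grid, ses_sse, y = y)
best_grid <- grid[which.min(grid_sse)]

# Continuous one-dimensional search over the same interval
best_opt <- optimize(ses_sse, interval = c(0, 1), y = y)

print(best_grid)
print(best_opt$minimum)
```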

(d) Repeat but with initial=“optimal”. How much difference does an optimal initial level make?

optAlphaPaper2 <- ses(paper, initial="optimal", h=4)
summary(optAlphaPaper2)
## 
## Forecast method: Simple exponential smoothing
## 
## Model Information:
## Simple exponential smoothing 
## 
## Call:
##  ses(y = paper, h = 4, initial = "optimal") 
## 
##   Smoothing parameters:
##     alpha = 0.1685 
## 
##   Initial states:
##     l = 170.8257 
## 
##   sigma:  33.6377
## 
##      AIC     AICc      BIC 
## 318.9747 319.8978 323.1783 
## 
## Error measures:
##                    ME     RMSE     MAE       MPE     MAPE      MASE
## Training set 7.176212 33.63769 27.8431 0.4737524 15.57782 0.7021303
##                    ACF1
## Training set -0.2117579
## 
## Forecasts:
##    Point Forecast    Lo 80    Hi 80    Lo 95    Hi 95
## 31       207.1098 164.0013 250.2182 141.1811 273.0384
## 32       207.1098 163.3934 250.8261 140.2513 273.9682
## 33       207.1098 162.7937 251.4258 139.3342 274.8853
## 34       207.1098 162.2021 252.0174 138.4294 275.7901

With initial=“optimal”, ses estimates the initial level together with \(\alpha\). It returns \(\alpha\)=0.1685 and an initial level of ~170.83, instead of the first observation, 199. The point forecast changes only modestly, from ~210.15 to ~207.11. The in-sample fit improves slightly: the RMSE falls from ~34.79 to ~33.64 and the MAPE from ~16.57 to ~15.58. For the paperback series, the optimal initialisation therefore fits somewhat better than the simple one.
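
One way to see why the initial level matters for a series this short: in simple exponential smoothing, the weight that the level after \(T\) observations places on the initial level is \((1-\alpha)^T\). The short calculation below plugs in the two fitted \(\alpha\) values reported above; it is only a back-of-the-envelope check, and the variable names are illustrative.

```r
n_obs <- 30            # days in the books series
alpha_simple  <- 0.2125  # fitted alpha with initial = "simple"
alpha_optimal <- 0.1685  # fitted alpha with initial = "optimal"

# Weight of the initial level in the final level (after all 30 days)
w_final_simple  <- (1 - alpha_simple)^n_obs
w_final_optimal <- (1 - alpha_optimal)^n_obs

# Weight of the initial level in the level after only five days
w_early_optimal <- (1 - alpha_optimal)^5

print(c(w_final_simple, w_final_optimal, w_early_optimal))
```

Both final weights are well under 1%, while after five days the initial level still carries roughly 40% of the weight. This is consistent with the early fitted values (and hence the SSE) being more sensitive to the starting level than the four-day forecasts are.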

(e) Repeat steps (b)-(d) with the hardcover series.

Figure 5. Plotting Alphas (\(\alpha\)) against SSE Using Simple Exponential Smoothing for Hardcover Books

alphas <- numeric()
SSEs <- numeric()

for (i in seq(0,1,0.05)) {
  alphas <- c(alphas, i)
  s <- ses(hard, initial="simple", alpha=i, h=4)
  SSEs <- c(SSEs, s$model$SSE)
}

hardDF <- data.frame(alphas, SSEs)
pander(hardDF)
alphas SSEs
0 154503
0.05 70483
0.1 45715
0.15 36814
0.2 33148
0.25 31554
0.3 30910
0.35 30758
0.4 30895
0.45 31224
0.5 31703
0.55 32314
0.6 33060
0.65 33948
0.7 34994
0.75 36217
0.8 37642
0.85 39295
0.9 41210
0.95 43423
1 45982
plot(hardDF)

The SSE falls steeply until about \(\alpha\)=0.3, reaches its minimum at \(\alpha\)=0.35, and then rises gradually. As with the paperback series, a larger \(\alpha\) would adjust each forecast more strongly towards the most recent observation, and a smaller \(\alpha\) less. The hardcover series calls for a larger \(\alpha\) than the paperback series (0.35 versus 0.2), which indicates that the level changes more rapidly for hardcover books than for paperback books.

optAlphaHard <- ses(hard, initial="simple", h=4)
summary(optAlphaHard)
## 
## Forecast method: Simple exponential smoothing
## 
## Model Information:
## Simple exponential smoothing 
## 
## Call:
##  ses(y = hard, h = 4, initial = "simple") 
## 
##   Smoothing parameters:
##     alpha = 0.3473 
## 
##   Initial states:
##     l = 139 
## 
##   sigma:  32.0198
## Error measures:
##                   ME     RMSE      MAE      MPE     MAPE      MASE
## Training set 9.72952 32.01982 26.34467 3.104211 13.05063 0.7860035
##                    ACF1
## Training set -0.1629042
## 
## Forecasts:
##    Point Forecast    Lo 80    Hi 80    Lo 95    Hi 95
## 31       240.3808 199.3457 281.4158 177.6231 303.1385
## 32       240.3808 196.9410 283.8206 173.9453 306.8162
## 33       240.3808 194.6625 286.0990 170.4608 310.3008
## 34       240.3808 192.4924 288.2691 167.1418 313.6197

When we let ses select the optimal value, it returns \(\alpha\)=0.3473, close to the 0.35 found by the grid search. As with the paperback series, the 0.05 step was too coarse to locate the optimum exactly, although here the grid value is so close that the SSE is essentially unchanged (30,758 at \(\alpha\)=0.35 versus 30,758.07).

optAlphaHard2 <- ses(hard, initial="optimal", h=4)
summary(optAlphaHard2)
## 
## Forecast method: Simple exponential smoothing
## 
## Model Information:
## Simple exponential smoothing 
## 
## Call:
##  ses(y = hard, h = 4, initial = "optimal") 
## 
##   Smoothing parameters:
##     alpha = 0.3283 
## 
##   Initial states:
##     l = 149.2836 
## 
##   sigma:  31.931
## 
##      AIC     AICc      BIC 
## 315.8506 316.7737 320.0542 
## 
## Error measures:
##                    ME     RMSE     MAE      MPE     MAPE      MASE
## Training set 9.166918 31.93101 26.7731 2.636328 13.39479 0.7987858
##                    ACF1
## Training set -0.1417817
## 
## Forecasts:
##    Point Forecast    Lo 80    Hi 80    Lo 95    Hi 95
## 31       239.5602 198.6390 280.4815 176.9766 302.1439
## 32       239.5602 196.4905 282.6299 173.6908 305.4297
## 33       239.5602 194.4443 284.6762 170.5613 308.5591
## 34       239.5602 192.4869 286.6336 167.5677 311.5527

With initial=“optimal”, ses returns \(\alpha\)=0.3283 and an initial level of ~149.28 instead of 139. For the hardcover series the change makes little difference: the RMSE falls slightly from ~32.02 to ~31.93, the MAPE increases from ~13.05 to ~13.39, and the point forecast moves from ~240.38 to ~239.56.

Problem 2. Apply Holt’s linear method to the paperback and hardback series and compute four-day forecasts in each case.

paperH <- holt(paper, initial="simple", h=4)
summary(paperH)
## 
## Forecast method: Holt's method
## 
## Model Information:
## Holt's method 
## 
## Call:
##  holt(y = paper, h = 4, initial = "simple") 
## 
##   Smoothing parameters:
##     alpha = 0.2984 
##     beta  = 0.4984 
## 
##   Initial states:
##     l = 199 
##     b = -27 
## 
##   sigma:  39.5463
## Error measures:
##                    ME     RMSE     MAE      MPE     MAPE      MASE
## Training set 7.769844 39.54634 33.5377 1.633306 18.19621 0.8457332
##                    ACF1
## Training set -0.1088681
## 
## Forecasts:
##    Point Forecast    Lo 80    Hi 80     Lo 95    Hi 95
## 31       222.0201 171.3394 272.7007 144.51068 299.5295
## 32       229.6904 164.8872 294.4935 130.58245 328.7983
## 33       237.3606 145.1175 329.6038  96.28696 378.4343
## 34       245.0309 115.5211 374.5407  46.96280 443.0991
hardH <- holt(hard, initial="simple", h=4)
summary(hardH)
## 
## Forecast method: Holt's method
## 
## Model Information:
## Holt's method 
## 
## Call:
##  holt(y = hard, h = 4, initial = "simple") 
## 
##   Smoothing parameters:
##     alpha = 0.439 
##     beta  = 0.1574 
## 
##   Initial states:
##     l = 139 
##     b = -11 
## 
##   sigma:  35.0438
## Error measures:
##                    ME     RMSE      MAE      MPE     MAPE      MASE
## Training set 7.193267 35.04383 27.99174 2.423793 14.18241 0.8351445
##                     ACF1
## Training set -0.07743714
## 
## Forecasts:
##    Point Forecast    Lo 80    Hi 80    Lo 95    Hi 95
## 31       250.7889 205.8784 295.6993 182.1042 319.4735
## 32       254.7003 202.4087 306.9918 174.7273 334.6733
## 33       258.6117 196.3181 320.9052 163.3419 353.8815
## 34       262.5231 187.9903 337.0558 148.5350 376.5111
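
Unlike the flat SES forecasts, Holt's point forecasts lie on a straight line: the final level plus \(h\) times the final slope. As a quick check, the differences between consecutive point forecasts should be constant. The vectors below are typed in from the two summaries above, and their names are illustrative.

```r
# Point forecasts for days 31-34, copied from summary(paperH) and summary(hardH)
paper_holt <- c(222.0201, 229.6904, 237.3606, 245.0309)
hard_holt  <- c(250.7889, 254.7003, 258.6117, 262.5231)

# Day-to-day increments; constant increments mean a linear forecast path
paper_slope <- diff(paper_holt)
hard_slope  <- diff(hard_holt)

print(round(paper_slope, 2))
print(round(hard_slope, 2))
```

The implied final slopes are about 7.67 per day for paperback and 3.91 per day for hardcover.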

(a) Compare the SSE measures of Holt’s method for the two series to those of simple exponential smoothing in the previous question. Discuss the merits of the two forecasting methods for these data sets.

Paperback Books

SSE from Holt’s method for paperback books.

sum(residuals(paperH)^2)
## [1] 46917.39

SSE from SES using initial=“simple” for paperback books.

optAlphaPaper$model$SSE
## [1] 36313.98
sum(residuals(optAlphaPaper)^2)
## [1] 36313.98

SSE from SES using initial=“optimal” for paperback books.

sum(residuals(optAlphaPaper2)^2)
## [1] 33944.82

Comparing Holt’s method with SES using initial=“simple” and initial=“optimal”, the lowest SSE for paperback books comes from SES with initial=“optimal”, at 33,944.82.

Hardcover Books

SSE from Holt’s method for hardcover books.

sum(residuals(hardH)^2)
## [1] 36842.1

SSE from SES using initial=“simple” for hardcover books.

optAlphaHard$model$SSE
## [1] 30758.07
sum(residuals(optAlphaHard)^2)
## [1] 30758.07

SSE from SES using initial=“optimal” for hardcover books.

sum(residuals(optAlphaHard2)^2)
## [1] 30587.69

Comparing Holt’s method with SES using initial=“simple” and initial=“optimal”, the lowest SSE for hardcover books comes, once again, from SES with initial=“optimal”, at 30,587.69.
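
Because all six models are fitted to the same 30 observations, SSE and RMSE carry the same information: RMSE = sqrt(SSE / n). The snippet below, with the SSE values typed in from the output above and illustrative object names, recovers the RMSE figures that summary() reported and orders the models within each series.

```r
n_obs <- 30  # observations in each series

# SSE values copied from the output above
sse <- data.frame(
  series = rep(c("Paperback", "Hardcover"), each = 3),
  method = rep(c("Holt", "SES simple", "SES optimal"), times = 2),
  SSE    = c(46917.39, 36313.98, 33944.82, 36842.10, 30758.07, 30587.69)
)

# RMSE follows directly from SSE when n is fixed
sse$RMSE <- sqrt(sse$SSE / n_obs)

# Lowest SSE (and therefore lowest RMSE) first within each series
print(sse[order(sse$series, sse$SSE), ])
```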

(b) Compare the forecasts for the two series using both methods. Which do you think is best?

Paperback Books

Figure 6. Plotting SES Simple vs. SES Optimal vs. Holt’s for Paperback Books

autoplot(paper) + xlab("Day") + autolayer(optAlphaPaper, PI=FALSE, series = "SES Simple") + autolayer(optAlphaPaper2, PI=FALSE, series = "SES Optimal") + autolayer(paperH, series = "Holt", PI=FALSE)

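Next, we check the accuracy measures of each model for paperback books. Because the fitted values are passed to accuracy() as if they were forecasts, the rows are labelled “Test set”, but these are in-sample (training) measures.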

#SES Simple
accuracy(optAlphaPaper$fitted, paper)
##                ME     RMSE      MAE       MPE     MAPE       ACF1
## Test set 1.749509 34.79175 28.64424 -2.770157 16.56938 -0.1268119
##          Theil's U
## Test set 0.6807692
#SES Optimal
accuracy(optAlphaPaper2$fitted, paper)
##                ME     RMSE     MAE       MPE     MAPE       ACF1 Theil's U
## Test set 7.176212 33.63769 27.8431 0.4737524 15.57782 -0.2117579 0.6685721
#Holt's Method
accuracy(paperH$fitted, paper)
##                ME     RMSE     MAE      MPE     MAPE       ACF1 Theil's U
## Test set 7.769844 39.54634 33.5377 1.633306 18.19621 -0.1088681 0.8763663

As with the SSE, SES with initial=“optimal” gives the lowest MAPE (~15.58) for paperback books, and it also has the lowest RMSE and MAE. However, there are probably better ways to forecast the paperback series that would yield even lower error measures.

Hardcover Books

Figure 7. Plotting SES Simple vs. SES Optimal vs. Holt’s for Hardcover Books

autoplot(hard) + xlab("Day") + autolayer(optAlphaHard, PI=FALSE, series = "SES Simple") + autolayer(optAlphaHard2, PI=FALSE, series = "SES Optimal") + autolayer(hardH, series = "Holt", PI=FALSE)

Next, we check the accuracy measures of each model for hardcover books.

#SES Simple
accuracy(optAlphaHard$fitted, hard)
##               ME     RMSE      MAE      MPE     MAPE       ACF1 Theil's U
## Test set 9.72952 32.01982 26.34467 3.104211 13.05063 -0.1629042 0.8142204
#SES Optimal
accuracy(optAlphaHard2$fitted, hard)
##                ME     RMSE     MAE      MPE     MAPE       ACF1 Theil's U
## Test set 9.166918 31.93101 26.7731 2.636328 13.39479 -0.1417817 0.8059213
#Holt's Method
accuracy(hardH$fitted, hard)
##                ME     RMSE      MAE      MPE     MAPE        ACF1
## Test set 7.193267 35.04383 27.99174 2.423793 14.18241 -0.07743714
##          Theil's U
## Test set 0.9150588

We can see from Figure 7 that the SES with parameter initial=“simple” and SES with parameter initial=“optimal” are providing very similar forecasts. Looking at their performance metrics, we can see that they both are better models to use with the hardcover book series than Holt’s method. The MAPE value for the SES Simple (~13.05) is slightly lower than the SES Optimal (~13.39). However, the RMSE is slightly higher for the SES Simple at ~32.02 vs. the SES Optimal RMSE which is ~31.93. The MAE for the SES Simple is slightly lower than the SES Optimal as well. Additional forecast models should be developed to generate more accurate forecasts.

(c) Calculate a 95% prediction interval for the first forecast for each series using both methods, assuming normal errors. Compare your forecasts with those produced by R.

We will set h=1 since we are looking for the first forecast for each series using both the SES with parameter initial=“optimal” and Holt’s methods. We will assume normal errors and compare the forecasts for both paperback books and hardcover books.

#SES of Paperback series with initial="optimal"

paperSES_Opt <- ses(paper, initial="optimal", h=1)
summary(paperSES_Opt)
## 
## Forecast method: Simple exponential smoothing
## 
## Model Information:
## Simple exponential smoothing 
## 
## Call:
##  ses(y = paper, h = 1, initial = "optimal") 
## 
##   Smoothing parameters:
##     alpha = 0.1685 
## 
##   Initial states:
##     l = 170.8257 
## 
##   sigma:  33.6377
## 
##      AIC     AICc      BIC 
## 318.9747 319.8978 323.1783 
## 
## Error measures:
##                    ME     RMSE     MAE       MPE     MAPE      MASE
## Training set 7.176212 33.63769 27.8431 0.4737524 15.57782 0.7021303
##                    ACF1
## Training set -0.2117579
## 
## Forecasts:
##    Point Forecast    Lo 80    Hi 80    Lo 95    Hi 95
## 31       207.1098 164.0013 250.2182 141.1811 273.0384

Unsurprisingly, for the paperback series the point forecast for day 31, the first forecast in the series, is the same as previously calculated, ~207.11. The 95% prediction interval for this forecast is ~141.18 to ~273.04.

#Holt's Method of Paperback series 
paperHolt <-holt(paper, h=1, initial="simple")
paperHolt
##    Point Forecast    Lo 80    Hi 80    Lo 95    Hi 95
## 31       222.0201 171.3394 272.7007 144.5107 299.5295

The prediction interval for Holt’s method for paperback books is ~144.51 to ~299.53.

#SES of Hardcover books series with initial="optimal"

hardSES_Opt <- ses(hard, initial="optimal", h=1)
summary(hardSES_Opt)
## 
## Forecast method: Simple exponential smoothing
## 
## Model Information:
## Simple exponential smoothing 
## 
## Call:
##  ses(y = hard, h = 1, initial = "optimal") 
## 
##   Smoothing parameters:
##     alpha = 0.3283 
## 
##   Initial states:
##     l = 149.2836 
## 
##   sigma:  31.931
## 
##      AIC     AICc      BIC 
## 315.8506 316.7737 320.0542 
## 
## Error measures:
##                    ME     RMSE     MAE      MPE     MAPE      MASE
## Training set 9.166918 31.93101 26.7731 2.636328 13.39479 0.7987858
##                    ACF1
## Training set -0.1417817
## 
## Forecasts:
##    Point Forecast   Lo 80    Hi 80    Lo 95    Hi 95
## 31       239.5602 198.639 280.4815 176.9766 302.1439

Unsurprisingly, for the hardcover series the point forecast for day 31, the first forecast in the series, is the same as previously calculated, ~239.56. The 95% prediction interval for this forecast is ~176.98 to ~302.14.

#Holt's Method of Hardcover books series 
hardHolt <-holt(hard, h=1, initial="simple")
hardHolt
##    Point Forecast    Lo 80    Hi 80    Lo 95    Hi 95
## 31       250.7889 205.8784 295.6993 182.1042 319.4735

The prediction interval for Holt’s method for hardcover books is ~182.10 to ~319.47.
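
The intervals above were read from R's output. For the manual calculation the question asks for, a one-step-ahead 95% interval under normal errors is the point forecast plus or minus \(z\) times the residual standard deviation, with \(z\) = qnorm(0.975), roughly 1.96. The sketch below uses the point forecasts and sigma values printed in the earlier summaries; the object names are illustrative.

```r
z <- qnorm(0.975)  # two-sided 95% normal quantile

# Point forecasts for day 31 and sigma values copied from the summaries above
pi_manual <- data.frame(
  model = c("Paperback SES", "Paperback Holt", "Hardcover SES", "Hardcover Holt"),
  point = c(207.1098, 222.0201, 239.5602, 250.7889),
  sigma = c(33.6377, 39.5463, 31.9310, 35.0438)
)

# One-step-ahead interval: point forecast +/- z * sigma
pi_manual$lo95 <- pi_manual$point - z * pi_manual$sigma
pi_manual$hi95 <- pi_manual$point + z * pi_manual$sigma

print(pi_manual)
```

These reproduce R's intervals to within rounding (for example, about 141.18 to 273.04 for paperback SES), which is expected because the one-step forecast variance for both methods is simply the residual variance.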