Question 1

Data set books contains the daily sales of paperback and hardcover books at the same store. The task is to forecast the next four days’ sales for paperback and hardcover books.

A

Plot the series and discuss the main features of the data.

paper_ts <- ts(books[,1], frequency = 7)
hard_ts <- ts(books[,2], frequency = 7)
autoplot(books, facets = TRUE, xlab = "Days") + theme_minimal()

Let’s start with the three basic elements a time series can contain: level, trend, and seasonality. Every time series has a level, and the hardcover and paperback series are no exception. Both also appear to trend upward. The decompose function gives a closer look at these components.

autoplot(decompose(paper_ts)) + theme_few()

Our initial observation of an upward trend appears to be correct: the third facet of the plot above shows a moderately strong upward trend. We initialized the time series with a frequency of 7 to capture any weekly seasonality. A frequency of 7 was chosen because we do not have enough observations to establish a monthly period, and the default frequency of 1 would not allow any decomposition at all.
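To illustrate why the frequency matters, here is a minimal base R sketch on synthetic data (the `toy` vector is made up for illustration and is not the books data). It assumes 30 daily points, and shows that decompose() works with a weekly period but refuses a series with the default frequency of 1.

```r
# Synthetic daily series: trend plus a repeating 7-day pattern plus noise (illustrative only)
set.seed(1)
toy <- 100 + 2 * (1:30) +
  rep(c(-10, -5, 0, 5, 10, 20, -20), length.out = 30) + rnorm(30)

weekly <- ts(toy, frequency = 7)   # 7 observations per seasonal cycle
parts  <- decompose(weekly)        # additive decomposition: trend, seasonal, random

length(parts$figure)               # one seasonal index per position in the week

# With the default frequency of 1 there is no seasonal period to estimate,
# so decompose() stops with an error; we capture its message instead of failing
no_season <- tryCatch(decompose(ts(toy)),
                      error = function(e) conditionMessage(e))
no_season
```

A monthly period would fail for a similar reason: decompose() needs at least two full periods of data.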

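The decomposition suggests that weekly seasonality exists within the limited data collected. Four discernible periods of relatively similar patterns are evident in the second facet of the decomposed plot. Note that decompose() estimates a single weekly pattern and repeats it, so the repetition itself is by construction; with only about four weeks of data, this is suggestive rather than conclusive.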

autoplot(decompose(hard_ts)) + theme_few()

The same observations hold for the hardcover decomposition above. Again we see a little over four discernible weekly cycles in the seasonal component and evidence of a strong, consistent upward trend.

B

Use simple exponential smoothing with the ses function (setting initial = “simple”) and explore different values of \(\alpha\) for the paperback series. Record the within-sample SSE for the one-step forecasts. Plot SSE against \(\alpha\) and find which value of \(\alpha\) works best. What is the effect of \(\alpha\) on the forecasts?

To evaluate the SSEs associated with different \(\alpha\) values, we loop through \(\alpha\) from 0.01 to 0.99 in steps of 0.01 and plot and tabulate the results.

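# Start of the grid; alpha is stepped upward in increments of 0.01
t_alpha <- .01
ses_table <- data.frame(alpha = numeric(), SSE = numeric())

# Floating-point accumulation pushes t_alpha just past 1 on the last step,
# so the final value evaluated is 0.99 (as the table below shows)
while (t_alpha <= 1) {
  # Fit SES with a fixed alpha and record the within-sample SSE
  t_ses <- ses(paper_ts, initial = "simple", alpha = t_alpha, h = 4)
  t_SSE <- t_ses$model$SSE
  ses_table <- rbind(ses_table, c(t_alpha, t_SSE))

  t_alpha <- t_alpha + .01
}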

colnames(ses_table) <- c("alpha", "SSE")
ggplot(ses_table, aes(alpha, SSE)) + geom_point() + theme_igray()

kable(ses_table)
alpha SSE
0.01 40522.05
0.02 40095.14
0.03 39791.99
0.04 39520.69
0.05 39244.90
0.06 38955.93
0.07 38657.42
0.08 38357.36
0.09 38064.21
0.10 37785.20
0.11 37525.81
0.12 37289.76
0.13 37079.28
0.14 36895.45
0.15 36738.44
0.16 36607.82
0.17 36502.75
0.18 36422.14
0.19 36364.76
0.20 36329.34
0.21 36314.59
0.22 36319.26
0.23 36342.18
0.24 36382.24
0.25 36438.42
0.26 36509.76
0.27 36595.42
0.28 36694.60
0.29 36806.59
0.30 36930.75
0.31 37066.49
0.32 37213.30
0.33 37370.70
0.34 37538.27
0.35 37715.63
0.36 37902.43
0.37 38098.36
0.38 38303.17
0.39 38516.58
0.40 38738.40
0.41 38968.41
0.42 39206.45
0.43 39452.35
0.44 39705.97
0.45 39967.19
0.46 40235.90
0.47 40511.99
0.48 40795.37
0.49 41085.96
0.50 41383.70
0.51 41688.51
0.52 42000.35
0.53 42319.15
0.54 42644.88
0.55 42977.49
0.56 43316.95
0.57 43663.22
0.58 44016.27
0.59 44376.08
0.60 44742.62
0.61 45115.86
0.62 45495.79
0.63 45882.38
0.64 46275.62
0.65 46675.48
0.66 47081.95
0.67 47495.01
0.68 47914.65
0.69 48340.83
0.70 48773.56
0.71 49212.80
0.72 49658.54
0.73 50110.77
0.74 50569.45
0.75 51034.58
0.76 51506.14
0.77 51984.09
0.78 52468.42
0.79 52959.12
0.80 53456.14
0.81 53959.48
0.82 54469.10
0.83 54984.99
0.84 55507.11
0.85 56035.45
0.86 56569.98
0.87 57110.66
0.88 57657.49
0.89 58210.42
0.90 58769.45
0.91 59334.53
0.92 59905.66
0.93 60482.81
0.94 61065.96
0.95 61655.09
0.96 62250.19
0.97 62851.23
0.98 63458.22
0.99 64071.15

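The optimal \(\alpha\) appears to be about 0.21, where the SSE reaches its minimum of 36314.59. As for the effect of \(\alpha\) on the forecasts: small values smooth heavily and react slowly to changes in level, while large values track the most recent observations (and their noise) more closely, which is why the SSE climbs steadily above 0.21.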
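The quantity tabulated above can be sketched by hand. The function below is a minimal base R version of the one-step SSE for simple exponential smoothing, assuming the "simple" start sets the initial level to the first observation (consistent with the `l = 199` reported later, but an assumption about the internals). The name `ses_sse` and the data vector are hypothetical.

```r
# One-step SSE for simple exponential smoothing, initial level = first observation (assumed)
ses_sse <- function(y, alpha) {
  level <- y[1]
  sse <- 0
  for (t in seq_along(y)) {
    err   <- y[t] - level          # one-step-ahead forecast error
    sse   <- sse + err^2
    level <- level + alpha * err   # same as alpha * y[t] + (1 - alpha) * level
  }
  sse
}

# Made-up sales figures, used only to exercise the function
y <- c(20, 22, 19, 25, 24, 27, 23, 28, 30, 29)

alphas <- seq(0.01, 0.99, by = 0.01)     # same grid as the table above
sse    <- sapply(alphas, ses_sse, y = y) # SSE for each alpha on the grid
alphas[which.min(sse)]                   # grid value with the smallest SSE
```

Using seq() rather than repeated addition also sidesteps the floating-point drift that ends the while loop at 0.99.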

C

Now let ses select the optimal value of \(\alpha\). Use this value to generate forecasts for the next four days. Compare your results with 2.

paper_opt_ses1 <- ses(paper_ts, initial = "simple", h = 4)
summary(paper_opt_ses1)
## 
## Forecast method: Simple exponential smoothing
## 
## Model Information:
## Simple exponential smoothing 
## 
## Call:
##  ses(y = paper_ts, h = 4, initial = "simple") 
## 
##   Smoothing parameters:
##     alpha = 0.2125 
## 
##   Initial states:
##     l = 199 
## 
##   sigma:  34.7918
## Error measures:
##                    ME     RMSE      MAE       MPE     MAPE      MASE
## Training set 1.749509 34.79175 28.64424 -2.770157 16.56938 0.7852414
##                    ACF1
## Training set -0.1268119
## 
## Forecasts:
##          Point Forecast    Lo 80    Hi 80    Lo 95    Hi 95
## 5.285714       210.1537 165.5663 254.7411 141.9631 278.3443
## 5.428571       210.1537 164.5706 255.7368 140.4404 279.8671
## 5.571429       210.1537 163.5962 256.7112 138.9501 281.3573
## 5.714286       210.1537 162.6418 257.6657 137.4905 282.8170

Our estimate of 0.21 from minimizing the SSE over the grid was quite close: the \(\alpha\) chosen by ses is 0.2125, nearly identical to what we discerned by eye in Part B. Its SSE of 36313.98 is only marginally below the grid minimum of 36314.59.
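The step from a grid value to a refined estimate can be illustrated with base R's optimize(). This is only a sketch of the idea, not necessarily the optimizer ses() uses internally; `ses_sse` and the data are hypothetical, and the initial level is again assumed to be the first observation.

```r
# SSE of one-step SES forecasts, starting the level at the first observation (assumed)
ses_sse <- function(alpha, y) {
  level <- y[1]
  sse <- 0
  for (obs in y) {
    err   <- obs - level
    sse   <- sse + err^2
    level <- level + alpha * err
  }
  sse
}

y <- c(20, 22, 19, 25, 24, 27, 23, 28, 30, 29)   # made-up data for illustration

# One-dimensional search over alpha in (0, 1); extra arguments are passed on to ses_sse
best <- optimize(ses_sse, interval = c(0, 1), y = y)
best$minimum     # alpha with the smallest SSE found by the search
best$objective   # the SSE at that alpha
```

A continuous search like this can land between grid points, which is how a value such as 0.2125 arises when the grid only offers 0.21 and 0.22.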

D

Repeat but with initial=“optimal”. How much difference does an optimal initial level make?

paper_opt_ses2 <- ses(paper_ts, initial = "optimal", h = 4)
summary(paper_opt_ses2)
## 
## Forecast method: Simple exponential smoothing
## 
## Model Information:
## Simple exponential smoothing 
## 
## Call:
##  ses(y = paper_ts, h = 4, initial = "optimal") 
## 
##   Smoothing parameters:
##     alpha = 0.1685 
## 
##   Initial states:
##     l = 170.8257 
## 
##   sigma:  33.6377
## 
##      AIC     AICc      BIC 
## 318.9747 319.8978 323.1783 
## 
## Error measures:
##                    ME     RMSE     MAE       MPE     MAPE      MASE
## Training set 7.176212 33.63769 27.8431 0.4737524 15.57782 0.7632792
##                    ACF1
## Training set -0.2117579
## 
## Forecasts:
##          Point Forecast    Lo 80    Hi 80    Lo 95    Hi 95
## 5.285714       207.1098 164.0013 250.2182 141.1811 273.0384
## 5.428571       207.1098 163.3934 250.8261 140.2513 273.9682
## 5.571429       207.1098 162.7937 251.4258 139.3342 274.8853
## 5.714286       207.1098 162.2021 252.0174 138.4294 275.7901

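Optimizing the initial level has a noticeable effect on the chosen \(\alpha\): the model now chooses 0.1685, roughly 20% lower than the \(\alpha\) in Part C, with an initial level of 170.83 instead of 199. The effect on the forecasts is modest: the point forecast drops from 210.15 to 207.11 and the RMSE from 34.79 to 33.64.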
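To show what estimating the initial level alongside \(\alpha\) involves, here is a minimal base R sketch that minimizes the SSE over both parameters with optim(). It is an illustration under stated assumptions: ses() with initial = "optimal" fits the model through its own estimation routine, which may use a different criterion and different bounds. The function name `ses_sse2`, the bounds, and the data are all hypothetical.

```r
# SSE as a function of both the smoothing parameter and the initial level
ses_sse2 <- function(par, y) {
  alpha <- par[1]
  level <- par[2]
  sse <- 0
  for (obs in y) {
    err   <- obs - level
    sse   <- sse + err^2
    level <- level + alpha * err
  }
  sse
}

y <- c(20, 22, 19, 25, 24, 27, 23, 28, 30, 29)   # made-up data for illustration

# Start from alpha = 0.5 and the "simple" level y[1]; keep both inside sensible bounds
fit <- optim(par = c(0.5, y[1]), fn = ses_sse2, y = y,
             method = "L-BFGS-B",
             lower = c(0.0001, min(y)), upper = c(0.9999, max(y)))
fit$par     # estimated alpha and initial level
fit$value   # SSE at the optimum
```

Because the two parameters are chosen together, a different starting level can change the preferred \(\alpha\), which is the pattern seen between Parts C and D.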

E

Repeat parts B through D for the hardcover series.

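# Same grid search as in Part B, now for the hardcover series
t_alpha <- .01
ses_table <- data.frame(alpha = numeric(), SSE = numeric())

# As before, floating-point accumulation ends the loop after alpha = 0.99
while (t_alpha <= 1) {
  # Fit SES with a fixed alpha and record the within-sample SSE
  t_ses <- ses(hard_ts, initial = "simple", alpha = t_alpha, h = 4)
  t_SSE <- t_ses$model$SSE
  ses_table <- rbind(ses_table, c(t_alpha, t_SSE))

  t_alpha <- t_alpha + .01
}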

colnames(ses_table) <- c("alpha", "SSE")
ggplot(ses_table, aes(alpha, SSE)) + geom_point() + theme_igray()

kable(ses_table)
alpha SSE
0.01 127760.06
0.02 107451.62
0.03 91892.14
0.04 79864.26
0.05 70483.46
0.06 63102.94
0.07 57246.54
0.08 52561.26
0.09 48783.47
0.10 45714.82
0.11 43204.95
0.12 41138.96
0.13 39428.42
0.14 38004.70
0.15 36814.18
0.16 35814.60
0.17 34972.47
0.18 34261.02
0.19 33658.73
0.20 33148.16
0.21 32715.13
0.22 32347.98
0.23 32037.13
0.24 31774.63
0.25 31553.85
0.26 31369.26
0.27 31216.20
0.28 31090.72
0.29 30989.49
0.30 30909.69
0.31 30848.88
0.32 30805.01
0.33 30776.30
0.34 30761.22
0.35 30758.47
0.36 30766.93
0.37 30785.61
0.38 30813.69
0.39 30850.45
0.40 30895.27
0.41 30947.63
0.42 31007.07
0.43 31073.21
0.44 31145.72
0.45 31224.35
0.46 31308.86
0.47 31399.07
0.48 31494.83
0.49 31596.04
0.50 31702.60
0.51 31814.46
0.52 31931.59
0.53 32053.96
0.54 32181.59
0.55 32314.49
0.56 32452.71
0.57 32596.29
0.58 32745.30
0.59 32899.82
0.60 33059.93
0.61 33225.72
0.62 33397.31
0.63 33574.81
0.64 33758.34
0.65 33948.03
0.66 34144.02
0.67 34346.45
0.68 34555.48
0.69 34771.26
0.70 34993.95
0.71 35223.73
0.72 35460.78
0.73 35705.26
0.74 35957.39
0.75 36217.34
0.76 36485.32
0.77 36761.53
0.78 37046.20
0.79 37339.55
0.80 37641.79
0.81 37953.18
0.82 38273.94
0.83 38604.33
0.84 38944.62
0.85 39295.05
0.86 39655.92
0.87 40027.50
0.88 40410.10
0.89 40804.00
0.90 41209.53
0.91 41627.02
0.92 42056.79
0.93 42499.20
0.94 42954.61
0.95 43423.39
0.96 43905.93
0.97 44402.64
0.98 44913.93
0.99 45440.23

The hardcover series produces a much different SSE curve: the SSE falls steeply as \(\alpha\) rises from 0.01, then flattens, reaching its minimum of 30758.47 at about \(\alpha\) = 0.35.

hard_opt_ses1 <- ses(hard_ts, initial = "simple", h = 4)
summary(hard_opt_ses1)
## 
## Forecast method: Simple exponential smoothing
## 
## Model Information:
## Simple exponential smoothing 
## 
## Call:
##  ses(y = hard_ts, h = 4, initial = "simple") 
## 
##   Smoothing parameters:
##     alpha = 0.3473 
## 
##   Initial states:
##     l = 139 
## 
##   sigma:  32.0198
## Error measures:
##                    ME     RMSE      MAE      MPE     MAPE      MASE
## Training set 9.729514 32.01982 26.34467 3.104208 13.05063 0.6885538
##                    ACF1
## Training set -0.1629043
## 
## Forecasts:
##          Point Forecast    Lo 80    Hi 80    Lo 95    Hi 95
## 5.285714       240.3808 199.3457 281.4158 177.6231 303.1385
## 5.428571       240.3808 196.9410 283.8206 173.9453 306.8162
## 5.571429       240.3808 194.6625 286.0990 170.4608 310.3008
## 5.714286       240.3808 192.4924 288.2692 167.1418 313.6197

Letting ses choose \(\alpha\) with the simple initial level validates our reading: it yields an \(\alpha\) of 0.3473, almost identical to the value we identified.

hard_opt_ses2 <- ses(hard_ts, initial = "optimal", h = 4)
summary(hard_opt_ses2)
## 
## Forecast method: Simple exponential smoothing
## 
## Model Information:
## Simple exponential smoothing 
## 
## Call:
##  ses(y = hard_ts, h = 4, initial = "optimal") 
## 
##   Smoothing parameters:
##     alpha = 0.3283 
## 
##   Initial states:
##     l = 149.2836 
## 
##   sigma:  31.931
## 
##      AIC     AICc      BIC 
## 315.8506 316.7737 320.0542 
## 
## Error measures:
##                    ME     RMSE     MAE      MPE     MAPE      MASE
## Training set 9.166918 31.93101 26.7731 2.636328 13.39479 0.6997513
##                    ACF1
## Training set -0.1417817
## 
## Forecasts:
##          Point Forecast    Lo 80    Hi 80    Lo 95    Hi 95
## 5.285714       239.5602 198.6390 280.4815 176.9766 302.1439
## 5.428571       239.5602 196.4905 282.6299 173.6908 305.4297
## 5.571429       239.5602 194.4443 284.6762 170.5613 308.5591
## 5.714286       239.5602 192.4869 286.6336 167.5677 311.5527

Once we introduce an optimized initial level (149.28 rather than 139), the preferred \(\alpha\) again shifts downward, to 0.3283. The change is much smaller than for the paperback series, and the point forecast barely moves (239.56 vs. 240.38).

Question 2

Apply Holt’s linear method to the paperback and hardback series and compute four-day forecasts in each case.

A

Compare the SSE measures of Holt’s method for the two series to those of simple exponential smoothing in the previous question. Discuss the merits of the two forecasting methods for these data sets.

First we create Holt models for the paperback and hardcover series.
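For reference, Holt's linear method in its standard textbook component form adds a trend equation to simple exponential smoothing. The notation below (\(\ell_t\) for the level, \(b_t\) for the slope, \(\beta^{*}\) for the trend smoothing parameter) is the usual convention and is an assumption here; how the `beta` reported by holt() maps onto \(\beta^{*}\) depends on the parameterisation the function uses.

```latex
\begin{align*}
\hat{y}_{t+h|t} &= \ell_t + h\,b_t \\
\ell_t &= \alpha y_t + (1-\alpha)\,(\ell_{t-1} + b_{t-1}) \\
b_t &= \beta^{*}(\ell_t - \ell_{t-1}) + (1-\beta^{*})\,b_{t-1}
\end{align*}
```

The forecast equation implies that consecutive point forecasts differ by the final slope \(b_T\), so Holt forecasts follow a straight line rather than the flat line produced by ses.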

paper_holt <- holt(paper_ts, initial = "simple", h = 4)
summary(paper_holt)
## 
## Forecast method: Holt's method
## 
## Model Information:
## Holt's method 
## 
## Call:
##  holt(y = paper_ts, h = 4, initial = "simple") 
## 
##   Smoothing parameters:
##     alpha = 0.2984 
##     beta  = 0.4984 
## 
##   Initial states:
##     l = 199 
##     b = -27 
## 
##   sigma:  39.5463
## Error measures:
##                    ME     RMSE     MAE      MPE     MAPE      MASE
## Training set 7.769845 39.54634 33.5377 1.633307 18.19621 0.9193886
##                   ACF1
## Training set -0.108868
## 
## Forecasts:
##          Point Forecast    Lo 80    Hi 80     Lo 95    Hi 95
## 5.285714       222.0201 171.3394 272.7007 144.51067 299.5295
## 5.428571       229.6904 164.8872 294.4935 130.58245 328.7983
## 5.571429       237.3606 145.1175 329.6038  96.28695 378.4343
## 5.714286       245.0309 115.5211 374.5407  46.96280 443.0990
hard_holt <- holt(hard_ts, initial = "simple", h = 4)
summary(hard_holt)
## 
## Forecast method: Holt's method
## 
## Model Information:
## Holt's method 
## 
## Call:
##  holt(y = hard_ts, h = 4, initial = "simple") 
## 
##   Smoothing parameters:
##     alpha = 0.439 
##     beta  = 0.1574 
## 
##   Initial states:
##     l = 139 
##     b = -11 
## 
##   sigma:  35.0438
## Error measures:
##                    ME     RMSE      MAE      MPE     MAPE     MASE
## Training set 7.193136 35.04383 27.99173 2.423723 14.18241 0.731602
##                     ACF1
## Training set -0.07743371
## 
## Forecasts:
##          Point Forecast    Lo 80    Hi 80    Lo 95    Hi 95
## 5.285714       250.7888 205.8784 295.6993 182.1042 319.4735
## 5.428571       254.7003 202.4087 306.9918 174.7272 334.6733
## 5.571429       258.6117 196.3179 320.9054 163.3416 353.8817
## 5.714286       262.5231 187.9900 337.0563 148.5345 376.5117
paper_holt$model$SSE
## [1] 46917.39
paper_opt_ses1$model$SSE
## [1] 36313.98
sum(residuals(paper_opt_ses2)^2)
## [1] 33944.82

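Of the three methods applied to the paperback series, the ses model with optimized \(\alpha\) and initial level yields the lowest SSE (33944.82), followed by ses with the simple initial level (36313.98); Holt’s method has the highest (46917.39). Holt’s method is the only one of the three whose forecasts continue the upward trend, but its in-sample one-step errors are larger here.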

hard_holt$model$SSE
## [1] 36842.1
hard_opt_ses1$model$SSE
## [1] 30758.07
sum(residuals(hard_opt_ses2)^2)
## [1] 30587.69

The hardcover series shows the same ordering: the ses model with optimized \(\alpha\) and initial level has the lowest SSE (30587.69), ahead of ses with the simple initial level (30758.07) and Holt’s method (36842.1).
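The SSE values above tie directly to the RMSE values in the model summaries through RMSE = sqrt(SSE / n). The sketch below checks this in base R, assuming each series has n = 30 daily observations (an assumption, chosen because it reproduces the reported RMSEs); the SSE figures are copied from the output above.

```r
n <- 30   # assumed number of daily observations in each series

# SSE values reported above for each series and method
sse <- c(paper_holt = 46917.39, paper_ses_simple = 36313.98, paper_ses_optimal = 33944.82,
         hard_holt  = 36842.10, hard_ses_simple  = 30758.07, hard_ses_optimal  = 30587.69)

rmse <- sqrt(sse / n)   # root mean squared one-step error
round(rmse, 2)
```

Since n is the same for every model, ranking by SSE and ranking by RMSE give the same order.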

B

Compare the forecasts for the two series using both methods. Which do you think is best?

The two measures of accuracy we’ll use to answer this question are the RMSE and MAPE of the different models proposed.

accuracy(paper_holt$fitted, paper_ts)
##                ME     RMSE     MAE      MPE     MAPE      ACF1 Theil's U
## Test set 7.769845 39.54634 33.5377 1.633307 18.19621 -0.108868 0.8763663
accuracy(paper_opt_ses1$fitted, paper_ts)
##                ME     RMSE      MAE       MPE     MAPE       ACF1
## Test set 1.749509 34.79175 28.64424 -2.770157 16.56938 -0.1268119
##          Theil's U
## Test set 0.6807692
accuracy(paper_opt_ses2$fitted, paper_ts)
##                ME     RMSE     MAE       MPE     MAPE       ACF1 Theil's U
## Test set 7.176212 33.63769 27.8431 0.4737524 15.57782 -0.2117579 0.6685721

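Just as in Part A, the model with optimized \(\alpha\) and initial level seems the most desirable for the paperback series. This time it’s the low RMSE and MAPE values that mark it as the best of the three. (The rows are labeled “Test set” only because fitted values were passed to accuracy(); these are in-sample measures.)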

accuracy(hard_holt$fitted, hard_ts)
##                ME     RMSE      MAE      MPE     MAPE        ACF1
## Test set 7.193136 35.04383 27.99173 2.423723 14.18241 -0.07743371
##          Theil's U
## Test set 0.9150579
accuracy(hard_opt_ses1$fitted, hard_ts)
##                ME     RMSE      MAE      MPE     MAPE       ACF1 Theil's U
## Test set 9.729514 32.01982 26.34467 3.104208 13.05063 -0.1629043 0.8142204
accuracy(hard_opt_ses2$fitted, hard_ts)
##                ME     RMSE     MAE      MPE     MAPE       ACF1 Theil's U
## Test set 9.166918 31.93101 26.7731 2.636328 13.39479 -0.1417817 0.8059213

Interestingly, the hardcover series introduces some ambiguity into the answer from Part A. The model with optimized \(\alpha\) and initial level has the lowest RMSE (31.93), but the model with optimized \(\alpha\) and the simple initial level has the lowest MAPE (13.05) of the three. The accuracy() function reports several measures of fit for exactly this kind of situation, and relying on a single measure is risky because no two series behave the same way. Here the fully optimized model also has an ACF1 closer to zero than the simple-level model (-0.1418 vs. -0.1629), so we use that as the tie-breaker and declare the third model the better fit for the data provided.
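The measures being compared can be computed by hand from the actual and fitted values. The function below is a minimal base R sketch using the standard definitions; the name `fit_measures` and the example numbers are hypothetical and only show the arithmetic.

```r
# In-sample accuracy measures from actual values and fitted (one-step) values
fit_measures <- function(actual, fitted) {
  err <- actual - fitted
  c(ME   = mean(err),                        # mean error (bias)
    RMSE = sqrt(mean(err^2)),                # root mean squared error
    MAE  = mean(abs(err)),                   # mean absolute error
    MPE  = mean(100 * err / actual),         # mean percentage error
    MAPE = mean(abs(100 * err / actual)))    # mean absolute percentage error
}

# Made-up numbers, only to show the calculation
actual <- c(100, 110, 120, 130)
fitted <- c( 90, 115, 120, 140)
m <- fit_measures(actual, fitted)
m
```

RMSE penalizes large errors more heavily than MAPE, which weights each error relative to the size of the observation, so the two can disagree about which model is best.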

C

Calculate a 95% prediction interval for the first forecast for each series using both methods, assuming normal errors. Compare your forecasts with those produced by R.

paper_opt_ses3 <- ses(paper_ts, initial = "optimal", h = 1)
paper_opt_ses2$mean + 1.96*sd(paper_opt_ses2$residuals)
## Time Series:
## Start = c(5, 3) 
## End = c(5, 6) 
## Frequency = 7 
## [1] 272.6229 272.6229 272.6229 272.6229
summary(paper_opt_ses3)
## 
## Forecast method: Simple exponential smoothing
## 
## Model Information:
## Simple exponential smoothing 
## 
## Call:
##  ses(y = paper_ts, h = 1, initial = "optimal") 
## 
##   Smoothing parameters:
##     alpha = 0.1685 
## 
##   Initial states:
##     l = 170.8257 
## 
##   sigma:  33.6377
## 
##      AIC     AICc      BIC 
## 318.9747 319.8978 323.1783 
## 
## Error measures:
##                    ME     RMSE     MAE       MPE     MAPE      MASE
## Training set 7.176212 33.63769 27.8431 0.4737524 15.57782 0.7632792
##                    ACF1
## Training set -0.2117579
## 
## Forecasts:
##          Point Forecast    Lo 80    Hi 80    Lo 95    Hi 95
## 5.285714       207.1098 164.0013 250.2182 141.1811 273.0384

Using the prediction interval formula from the text, \(\hat{y}_{T+1|T} \pm 1.96\hat{\sigma}\), with the standard deviation of the residuals as \(\hat{\sigma}\), we get an upper limit for the first forecast of about 272.62, versus 273.04 for the upper 95% limit in R’s summary() output.
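As a cross-check, the interval in the summary can be reproduced from the reported point forecast and sigma alone. This is a minimal sketch under the normal-errors assumption; the two numbers are copied from the output above, and the variable names are hypothetical.

```r
point <- 207.1098   # point forecast reported by summary(paper_opt_ses3)
sigma <- 33.6377    # sigma reported in the same summary

z <- qnorm(0.975)   # normal quantile for a two-sided 95% interval, about 1.96
interval <- c(lower = point - z * sigma, upper = point + z * sigma)
round(interval, 2)
```

The result agrees with the Lo 95 and Hi 95 columns, which suggests R's first-step interval is built from sigma rather than from sd() of the residuals.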

paper_holt2 <- holt(paper_ts, h = 1, initial = "simple")
paper_holt$mean + 1.96*sd(paper_holt$residuals)
## Time Series:
## Start = c(5, 3) 
## End = c(5, 6) 
## Frequency = 7 
## [1] 299.3194 306.9897 314.6599 322.3302
summary(paper_holt2)
## 
## Forecast method: Holt's method
## 
## Model Information:
## Holt's method 
## 
## Call:
##  holt(y = paper_ts, h = 1, initial = "simple") 
## 
##   Smoothing parameters:
##     alpha = 0.2984 
##     beta  = 0.4984 
## 
##   Initial states:
##     l = 199 
##     b = -27 
## 
##   sigma:  39.5463
## Error measures:
##                    ME     RMSE     MAE      MPE     MAPE      MASE
## Training set 7.769845 39.54634 33.5377 1.633307 18.19621 0.9193886
##                   ACF1
## Training set -0.108868
## 
## Forecasts:
##          Point Forecast    Lo 80    Hi 80    Lo 95    Hi 95
## 5.285714       222.0201 171.3394 272.7007 144.5107 299.5295

For the paperback Holt model, the manual upper limit of the 95% prediction interval is even closer: 299.32 versus R’s 299.53.

hard_opt_ses3 <- ses(hard_ts, initial = "optimal", h = 1)
hard_opt_ses2$mean + 1.96*sd(hard_opt_ses2$residuals)
## Time Series:
## Start = c(5, 3) 
## End = c(5, 6) 
## Frequency = 7 
## [1] 300.5354 300.5354 300.5354 300.5354
summary(hard_opt_ses3)
## 
## Forecast method: Simple exponential smoothing
## 
## Model Information:
## Simple exponential smoothing 
## 
## Call:
##  ses(y = hard_ts, h = 1, initial = "optimal") 
## 
##   Smoothing parameters:
##     alpha = 0.3283 
## 
##   Initial states:
##     l = 149.2836 
## 
##   sigma:  31.931
## 
##      AIC     AICc      BIC 
## 315.8506 316.7737 320.0542 
## 
## Error measures:
##                    ME     RMSE     MAE      MPE     MAPE      MASE
## Training set 9.166918 31.93101 26.7731 2.636328 13.39479 0.6997513
##                    ACF1
## Training set -0.1417817
## 
## Forecasts:
##          Point Forecast   Lo 80    Hi 80    Lo 95    Hi 95
## 5.285714       239.5602 198.639 280.4815 176.9766 302.1439

For the hardcover ses model with optimal initialization, the manual upper limit falls a little short: 300.54 versus R’s 302.14.

hard_holt2 <- holt(hard_ts, h = 1, initial = "simple")
hard_holt$mean + 1.96*sd(hard_holt$residuals)
## Time Series:
## Start = c(5, 3) 
## End = c(5, 6) 
## Frequency = 7 
## [1] 319.1614 323.0729 326.9843 330.8957
summary(hard_holt2)
## 
## Forecast method: Holt's method
## 
## Model Information:
## Holt's method 
## 
## Call:
##  holt(y = hard_ts, h = 1, initial = "simple") 
## 
##   Smoothing parameters:
##     alpha = 0.439 
##     beta  = 0.1574 
## 
##   Initial states:
##     l = 139 
##     b = -11 
## 
##   sigma:  35.0438
## Error measures:
##                    ME     RMSE      MAE      MPE     MAPE     MASE
## Training set 7.193136 35.04383 27.99173 2.423723 14.18241 0.731602
##                     ACF1
## Training set -0.07743371
## 
## Forecasts:
##          Point Forecast    Lo 80    Hi 80    Lo 95    Hi 95
## 5.285714       250.7888 205.8784 295.6993 182.1042 319.4735

For the hardcover Holt model, the manual upper 95% limit is almost exactly the same as R’s value (319.16 vs. 319.47).

The closeness of these calculations shows how tools such as R reduce the effort needed to produce valid statistical results for real-world problems. The small remaining gaps arise because sd() centers the residuals on their mean and divides by n - 1, whereas R builds its intervals from the sigma reported in each summary.
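The size of the gap can be worked out from the reported error measures. The derivation below is a sketch that assumes n = 30 observations and uses the paperback ses figures (ME = 7.176212, RMSE = 33.63769); \(s\) denotes the value returned by sd() on the residuals.

```latex
\begin{align*}
s^2 &= \frac{1}{n-1}\sum_{t=1}^{n}(e_t - \bar{e})^2
     = \frac{n}{n-1}\left(\mathrm{RMSE}^2 - \mathrm{ME}^2\right) \\
    &= \frac{30}{29}\left(33.63769^2 - 7.176212^2\right) \approx 1117.24 \\
s &\approx 33.425, \qquad 207.1098 + 1.96 \times 33.425 \approx 272.62
\end{align*}
```

Using sigma = 33.6377 in place of \(s\) gives the 273.04 reported by R, so the difference of about 0.4 comes entirely from the choice of standard deviation estimate.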