Due Monday, March 5th, 2018 at 11:59PM: Problems 1 and 2 from Chapter 7 of Forecasting: principles and practice
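#Assumed setup (not shown in the original): fpp2 provides the books data, ses(), holt(), autoplot() and autolayer(); pander prints the tables
library(fpp2)
library(pander)
#Split 'books' into two separate time series: paperback & hardcover
paper <- ts(books[,1])
hard <- ts(books[,2])
yrange <- range(books)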
plot(c(1, 30), c(0, yrange[2]+10), type="n", main="Daily sales of paperback and hardcover books", xlab="Time", ylab="Sales")
lines(paper, col="red")
lines(hard, col="blue")
legend("bottomright", c("Paperback","Hardcover"), bty="n", lwd=1, col=c("red", "blue"))
#Re-create the series with a weekly frequency so decompose() can estimate a seasonal component
books_2 <- ts(books, frequency=7)
plot(decompose(books_2[,1]))
plot(decompose(books_2[,2]))
#First we create two empty vectors that we will be storing results in
alphas <- numeric()
SSEs <- numeric()
#Following Professor Dean's example, we'll run this through a for() loop given the small size of the time series data - 30 points
for(i in seq(0,1,0.05)) {
alphas <- c(alphas, i)
s <- ses(paper, initial="simple", alpha=i, h=4)
SSEs <- c(SSEs, s$model$SSE)
}
#Combine them into a dataframe
paper_DF <- data.frame(alphas, SSEs)
#And plot them
plot(paper_DF, main="Simple Exponential Smoothing on the paperback series")
pander(paper_DF, caption = "Exploring values of α for the paperback series", split.table = Inf)
| alphas | SSEs |
|---|---|
| 0 | 41270 |
| 0.05 | 39245 |
| 0.1 | 37785 |
| 0.15 | 36738 |
| 0.2 | 36329 |
| 0.25 | 36438 |
| 0.3 | 36931 |
| 0.35 | 37716 |
| 0.4 | 38738 |
| 0.45 | 39967 |
| 0.5 | 41384 |
| 0.55 | 42977 |
| 0.6 | 44743 |
| 0.65 | 46675 |
| 0.7 | 48774 |
| 0.75 | 51035 |
| 0.8 | 53456 |
| 0.85 | 56035 |
| 0.9 | 58769 |
| 0.95 | 61655 |
| 1 | 64690 |
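To make explicit what each pass of the loop measures, here is a minimal base-R sketch of the sum of squared one-step errors for simple exponential smoothing. It assumes the initial level is the first observation, which is meant to mirror initial="simple" but is not taken from the forecast package source. The helper name ses_sse and the toy series y are hypothetical and are not the books data.

```r
# Sum of squared one-step-ahead errors for simple exponential smoothing.
# Assumption: the initial level is the first observation.
ses_sse <- function(y, alpha) {
  level <- y[1]
  sse <- 0
  for (t in seq_along(y)[-1]) {
    err <- y[t] - level            # one-step forecast error at time t
    sse <- sse + err^2
    level <- level + alpha * err   # l_t = l_{t-1} + alpha * e_t
  }
  sse
}

# Toy series for illustration only (not the books data)
y <- c(10, 12, 11, 15, 14, 13)

# Same grid as the loop above: alpha from 0 to 1 in steps of 0.05
grid <- seq(0, 1, by = 0.05)
sse_grid <- vapply(grid, function(a) ses_sse(y, a), numeric(1))
best_alpha <- grid[which.min(sse_grid)]
```

With alpha = 1 the forecast is always the previous observation, and with alpha = 0 it stays at the initial level, which is why the two ends of the grid behave like a naive forecast and a constant forecast respectively.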
alpha_paper_simple <- ses(paper, alpha=NULL, initial="simple", h=4)
summary(alpha_paper_simple)
##
## Forecast method: Simple exponential smoothing
##
## Model Information:
## Simple exponential smoothing
##
## Call:
## ses(y = paper, h = 4, initial = "simple", alpha = NULL)
##
## Smoothing parameters:
## alpha = 0.2125
##
## Initial states:
## l = 199
##
## sigma: 34.7918
## Error measures:
## ME RMSE MAE MPE MAPE MASE
## Training set 1.749509 34.79175 28.64424 -2.770157 16.56938 0.7223331
## ACF1
## Training set -0.1268119
##
## Forecasts:
## Point Forecast Lo 80 Hi 80 Lo 95 Hi 95
## 31 210.1537 165.5663 254.7411 141.9631 278.3443
## 32 210.1537 164.5706 255.7368 140.4404 279.8671
## 33 210.1537 163.5962 256.7112 138.9501 281.3573
## 34 210.1537 162.6418 257.6657 137.4905 282.8170
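When alpha=NULL, ses() estimates α numerically instead of from a fixed grid. The sketch below shows one way such a one-dimensional search could look in base R using optimize(). It is an illustration under the same assumptions as before (initial level equal to the first observation, hypothetical helper ses_sse, toy data), not necessarily the routine the forecast package runs internally.

```r
# Hypothetical helper: SSE of one-step errors, initial level = first observation
ses_sse <- function(y, alpha) {
  level <- y[1]
  sse <- 0
  for (t in seq_along(y)[-1]) {
    err <- y[t] - level
    sse <- sse + err^2
    level <- level + alpha * err
  }
  sse
}

# Toy series for illustration only (not the books data)
y <- c(10, 12, 11, 15, 14, 13)

# Continuous search over [0, 1] rather than a 0.05 grid
opt <- optimize(function(a) ses_sse(y, a), interval = c(0, 1))
opt$minimum    # estimated alpha
opt$objective  # SSE at that alpha
```

A continuous search like this can land between grid points, which is how a value such as 0.2125 can appear when the grid only offers 0.20 and 0.25.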
Table 1.6 shows that the optimal value of α is \(0.2125\), which is close to the \(0.20\) we identified earlier from Table 1.4. Because we searched in increments of \(0.05\), our grid simply wasn’t fine enough to uncover \(0.2125\) beforehand. That said, we weren’t too far off.
alpha_paper_optimal <- ses(paper, alpha=NULL, initial="optimal", h=4)
summary(alpha_paper_optimal)
##
## Forecast method: Simple exponential smoothing
##
## Model Information:
## Simple exponential smoothing
##
## Call:
## ses(y = paper, h = 4, initial = "optimal", alpha = NULL)
##
## Smoothing parameters:
## alpha = 0.1685
##
## Initial states:
## l = 170.8257
##
## sigma: 33.6377
##
## AIC AICc BIC
## 318.9747 319.8978 323.1783
##
## Error measures:
## ME RMSE MAE MPE MAPE MASE
## Training set 7.176212 33.63769 27.8431 0.4737524 15.57782 0.7021303
## ACF1
## Training set -0.2117579
##
## Forecasts:
## Point Forecast Lo 80 Hi 80 Lo 95 Hi 95
## 31 207.1098 164.0013 250.2182 141.1811 273.0384
## 32 207.1098 163.3934 250.8261 140.2513 273.9682
## 33 207.1098 162.7937 251.4258 139.3342 274.8853
## 34 207.1098 162.2021 252.0174 138.4294 275.7901
Setting initial="optimal" lowers the alpha value to \(0.1685\) and the initial level from \(199.00\) (under initial="simple") to \(170.8257\). Because RMSE, MAPE, and MASE all decline, the model with the optimized initial level is the better in-sample fit for the paper time series.
#First we create two empty vectors that we will be storing results in
alphas <- numeric()
SSEs <- numeric()
#Run through a for() loop
for(i in seq(0,1,0.05)) {
alphas <- c(alphas, i)
s <- ses(hard, initial="simple", alpha=i, h=4)
SSEs <- c(SSEs, s$model$SSE)
}
#Combine them into a dataframe
hard_DF <- data.frame(alphas, SSEs)
#And plot them
plot(hard_DF, main="Simple Exponential Smoothing on the hardcover series")
paper series. To confirm the alpha value with the lowest SSE, Table 1.9 below lists the raw values.
pander(hard_DF, caption = "Exploring values of α for the hardcover series", split.table = Inf)
| alphas | SSEs |
|---|---|
| 0 | 154503 |
| 0.05 | 70483 |
| 0.1 | 45715 |
| 0.15 | 36814 |
| 0.2 | 33148 |
| 0.25 | 31554 |
| 0.3 | 30910 |
| 0.35 | 30758 |
| 0.4 | 30895 |
| 0.45 | 31224 |
| 0.5 | 31703 |
| 0.55 | 32314 |
| 0.6 | 33060 |
| 0.65 | 33948 |
| 0.7 | 34994 |
| 0.75 | 36217 |
| 0.8 | 37642 |
| 0.85 | 39295 |
| 0.9 | 41210 |
| 0.95 | 43423 |
| 1 | 45982 |
alpha_hard_simple <- ses(hard, alpha=NULL, initial="simple", h=4)
summary(alpha_hard_simple)
##
## Forecast method: Simple exponential smoothing
##
## Model Information:
## Simple exponential smoothing
##
## Call:
## ses(y = hard, h = 4, initial = "simple", alpha = NULL)
##
## Smoothing parameters:
## alpha = 0.3473
##
## Initial states:
## l = 139
##
## sigma: 32.0198
## Error measures:
## ME RMSE MAE MPE MAPE MASE
## Training set 9.72952 32.01982 26.34467 3.104211 13.05063 0.7860035
## ACF1
## Training set -0.1629042
##
## Forecasts:
## Point Forecast Lo 80 Hi 80 Lo 95 Hi 95
## 31 240.3808 199.3457 281.4158 177.6231 303.1385
## 32 240.3808 196.9410 283.8206 173.9453 306.8162
## 33 240.3808 194.6625 286.0990 170.4608 310.3008
## 34 240.3808 192.4924 288.2691 167.1418 313.6197
From Table 1.10 we can see that the optimal value of α is \(0.3473\), which falls within the range of \(0.30\) to \(0.35\) that we estimated previously and later narrowed to \(0.35\) using Table 1.9. The \(0.0027\) difference between our grid estimate and the SES result is also smaller than the \(0.0125\) difference we saw for the paper series.
alpha_hard_optimal <- ses(hard, alpha=NULL, initial="optimal", h=4)
summary(alpha_hard_optimal)
##
## Forecast method: Simple exponential smoothing
##
## Model Information:
## Simple exponential smoothing
##
## Call:
## ses(y = hard, h = 4, initial = "optimal", alpha = NULL)
##
## Smoothing parameters:
## alpha = 0.3283
##
## Initial states:
## l = 149.2836
##
## sigma: 31.931
##
## AIC AICc BIC
## 315.8506 316.7737 320.0542
##
## Error measures:
## ME RMSE MAE MPE MAPE MASE
## Training set 9.166918 31.93101 26.7731 2.636328 13.39479 0.7987858
## ACF1
## Training set -0.1417817
##
## Forecasts:
## Point Forecast Lo 80 Hi 80 Lo 95 Hi 95
## 31 239.5602 198.6390 280.4815 176.9766 302.1439
## 32 239.5602 196.4905 282.6299 173.6908 305.4297
## 33 239.5602 194.4443 284.6762 170.5613 308.5591
## 34 239.5602 192.4869 286.6336 167.5677 311.5527
Setting initial="optimal", we see that the alpha value has dropped to \(0.3283\) while the initial level has increased from \(139\) to \(149.2836\). Unlike with the paper series, the improvement is not uniform: RMSE fell slightly (from \(32.02\) to \(31.93\)), but MAE, MAPE, and MASE all rose.
paperback and hardback series.
#Applying Holt's linear method to paperback
holt_paper <- holt(paper, initial="simple", h=4)
summary(holt_paper)
##
## Forecast method: Holt's method
##
## Model Information:
## Holt's method
##
## Call:
## holt(y = paper, h = 4, initial = "simple")
##
## Smoothing parameters:
## alpha = 0.2984
## beta = 0.4984
##
## Initial states:
## l = 199
## b = -27
##
## sigma: 39.5463
## Error measures:
## ME RMSE MAE MPE MAPE MASE
## Training set 7.769844 39.54634 33.5377 1.633306 18.19621 0.8457332
## ACF1
## Training set -0.1088681
##
## Forecasts:
## Point Forecast Lo 80 Hi 80 Lo 95 Hi 95
## 31 222.0201 171.3394 272.7007 144.51068 299.5295
## 32 229.6904 164.8872 294.4935 130.58245 328.7983
## 33 237.3606 145.1175 329.6038 96.28696 378.4343
## 34 245.0309 115.5211 374.5407 46.96280 443.0991
#Applying Holt's linear method to hardcover
holt_hard <- holt(hard, initial="simple", h=4)
summary(holt_hard)
##
## Forecast method: Holt's method
##
## Model Information:
## Holt's method
##
## Call:
## holt(y = hard, h = 4, initial = "simple")
##
## Smoothing parameters:
## alpha = 0.439
## beta = 0.1574
##
## Initial states:
## l = 139
## b = -11
##
## sigma: 35.0438
## Error measures:
## ME RMSE MAE MPE MAPE MASE
## Training set 7.193267 35.04383 27.99174 2.423793 14.18241 0.8351445
## ACF1
## Training set -0.07743714
##
## Forecasts:
## Point Forecast Lo 80 Hi 80 Lo 95 Hi 95
## 31 250.7889 205.8784 295.6993 182.1042 319.4735
## 32 254.7003 202.4087 306.9918 174.7273 334.6733
## 33 258.6117 196.3181 320.9052 163.3419 353.8815
## 34 262.5231 187.9903 337.0558 148.5350 376.5111
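For reference, Holt's linear method extends simple exponential smoothing with a trend term. The component form below follows the standard textbook presentation; the symbols \(\ell_t\) (level), \(b_t\) (trend) and \(\beta^*\) (trend smoothing parameter, printed as beta in the output above) are notation assumed here rather than taken from this report.

```latex
\begin{aligned}
\hat{y}_{t+h|t} &= \ell_t + h\,b_t \\
\ell_t &= \alpha\,y_t + (1-\alpha)\,(\ell_{t-1} + b_{t-1}) \\
b_t &= \beta^*\,(\ell_t - \ell_{t-1}) + (1-\beta^*)\,b_{t-1}
\end{aligned}
```

The forecast equation is linear in \(h\), which matches the output above: the paperback point forecasts rise by about \(7.67\) per step (\(222.0201\) to \(229.6904\)) and the hardcover forecasts by about \(3.91\) per step (\(250.7889\) to \(254.7003\)), whereas the SES forecasts are flat.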
#Holt's on Paperback
sum(residuals(holt_paper)^2)
## [1] 46917.39
#SES Simple for the Paperback series
alpha_paper_simple$model$SSE
## [1] 36313.98
#SES Optimal for the Paperback series
sum(residuals(alpha_paper_optimal)^2)
## [1] 33944.82
For the paper time series, the simple exponential smoothing model with initial="optimal" has the lowest SSE, \(33944.82\).
#Holt's on Hardcover
sum(residuals(holt_hard)^2)
## [1] 36842.1
#SES Simple for the Hardcover series
alpha_hard_simple$model$SSE
## [1] 30758.07
#SES Optimal for the Hardcover series
sum(residuals(alpha_hard_optimal)^2)
## [1] 30587.69
And much like with the paper time series, the simple exponential smoothing model with initial="optimal" has the lowest SSE, \(30587.69\), for the hard time series.
autoplot(paper) + xlab("Time") + ylab("Daily sales of paperback books") +
  autolayer(alpha_paper_simple, PI=FALSE, series="SES Simple") +
  autolayer(alpha_paper_optimal, PI=FALSE, series="SES Optimal") +
  autolayer(holt_paper, series="Holt's", PI=FALSE)
Now we must check the accuracy of the three forecast models in the
paper time series.
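As a reminder of what the main error measures mean, here is a small hand computation of RMSE, MAE, and MAPE in base R. The vectors actual and fitted are made-up toy values, not the books data. Note that the accuracy() calls below compare fitted values against the same data used for fitting, so they are in-sample measures even though the rows are labelled "Test set".

```r
# Toy values for illustration only
actual <- c(100, 200, 400)
fitted <- c(110, 180, 400)

err <- actual - fitted                  # errors: -10, 20, 0

rmse <- sqrt(mean(err^2))               # root mean squared error
mae  <- mean(abs(err))                  # mean absolute error
mape <- mean(abs(err / actual)) * 100   # mean absolute percentage error
```

RMSE penalizes large errors more heavily than MAE, so the two can rank models differently when one model makes a few large misses.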
#Accuracy of SES Simple on Paperback series
accuracy(alpha_paper_simple$fitted, paper)
## ME RMSE MAE MPE MAPE ACF1
## Test set 1.749509 34.79175 28.64424 -2.770157 16.56938 -0.1268119
## Theil's U
## Test set 0.6807692
#Accuracy of SES Optimal on Paperback series
accuracy(alpha_paper_optimal$fitted, paper)
## ME RMSE MAE MPE MAPE ACF1 Theil's U
## Test set 7.176212 33.63769 27.8431 0.4737524 15.57782 -0.2117579 0.6685721
#Accuracy of Holt's on Paperback series
accuracy(holt_paper$fitted, paper)
## ME RMSE MAE MPE MAPE ACF1 Theil's U
## Test set 7.769844 39.54634 33.5377 1.633306 18.19621 -0.1088681 0.8763663
Based on the results, the RMSE, MAE, and MAPE values are all lowest for the SES model with initial="optimal".
hard time series and check the accuracy for each forecast model.
autoplot(hard) + xlab("Time") + ylab("Daily sales of hardcover books") +
  autolayer(alpha_hard_simple, PI=FALSE, series="SES Simple") +
  autolayer(alpha_hard_optimal, PI=FALSE, series="SES Optimal") +
  autolayer(holt_hard, series="Holt's", PI=FALSE)
#Accuracy of SES Simple on Hardcover series
accuracy(alpha_hard_simple$fitted, hard)
## ME RMSE MAE MPE MAPE ACF1 Theil's U
## Test set 9.72952 32.01982 26.34467 3.104211 13.05063 -0.1629042 0.8142204
#Accuracy of SES Optimal on Hardcover series
accuracy(alpha_hard_optimal$fitted, hard)
## ME RMSE MAE MPE MAPE ACF1 Theil's U
## Test set 9.166918 31.93101 26.7731 2.636328 13.39479 -0.1417817 0.8059213
#Accuracy of Holt's on Hardcover series
accuracy(holt_hard$fitted, hard)
## ME RMSE MAE MPE MAPE ACF1
## Test set 7.193267 35.04383 27.99174 2.423793 14.18241 -0.07743714
## Theil's U
## Test set 0.9150588
From the accuracy results above, the SES Optimal model had the lowest RMSE, while the SES Simple model had the lowest MAE and MAPE. The two SES models are close to one another on every measure, whereas Holt’s model fits worse in comparison, with higher RMSE, MAE, MAPE, and Theil’s U values than either SES model.
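As a consistency check, the RMSE values from accuracy() can be recovered from the SSE values reported earlier through RMSE = sqrt(SSE / n). The sketch below assumes n = 30 observations per series (the length mentioned in the loop comment) and simply re-enters the printed numbers; it does not refit any model, and the vector names are hypothetical.

```r
n <- 30  # assumed series length, per the "30 points" comment

# SSE values as printed above
sse <- c(paper_ses_simple  = 36313.98,
         paper_ses_optimal = 33944.82,
         paper_holt        = 46917.39,
         hard_ses_simple   = 30758.07,
         hard_ses_optimal  = 30587.69,
         hard_holt         = 36842.1)

# RMSE values as printed by accuracy(), in the same order
rmse_reported <- c(34.79175, 33.63769, 39.54634,
                   32.01982, 31.93101, 35.04383)

rmse_from_sse <- sqrt(sse / n)
round(cbind(rmse_from_sse, rmse_reported), 4)
```

Because RMSE is a monotone transformation of SSE for a fixed n, ranking the models by SSE and by RMSE gives the same order within each series.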