# Split the books data into separate daily series (frequency 7 = weekly seasonality)
paperback.ts <- ts(books[, 1], frequency = 7)
hardcover.ts <- ts(books[, 2], frequency = 7)

a. Plot the series and discuss the main features of the data.

autoplot(books) + xlab("Day")

cor(books[,1] , books[,2])
## [1] 0.2234511
decomp_paper <- decompose(paperback.ts)
decomp_hard <- decompose(hardcover.ts)
autoplot(decomp_paper$seasonal) + autolayer(decomp_hard$seasonal)

cor(decomp_paper$seasonal , decomp_hard$seasonal)
## [1] 0.4291758
plot(decompose(hardcover.ts))

seasonplot(hardcover.ts)

seasonplot(paperback.ts)

I notice an upward trend in both series. Some days of the week may be correlated between the two series, but the data are noisy. The seasonal components of the two series have a linear correlation of 0.43, while the series themselves have a correlation of 0.22.

b. Use simple exponential smoothing with the ses function (setting initial="simple") and explore different values of alpha for the paperback series. Record the within-sample SSE for the one-step forecasts. Plot SSE against alpha and find which value of alpha works best. What is the effect of alpha on the forecasts?

alphas <- numeric()
SSEs <- numeric()

# Why are people discouraged from using for loops in R instead of the apply
# family? Is this just because the apply functions are more efficient?
# (A sapply version is sketched after this part's answer.)

for (i in seq(0, 1, 0.05)) {
  alphas <- c(alphas, i)
  s <- ses(paperback.ts, initial = "simple", alpha = i, h = 1)
  SSEs <- c(SSEs, round(s$model$SSE))
}


paperDF <- data.frame(alphas, SSEs)

paperDF
##    alphas  SSEs
## 1    0.00 41270
## 2    0.05 39245
## 3    0.10 37785
## 4    0.15 36738
## 5    0.20 36329
## 6    0.25 36438
## 7    0.30 36931
## 8    0.35 37716
## 9    0.40 38738
## 10   0.45 39967
## 11   0.50 41384
## 12   0.55 42977
## 13   0.60 44743
## 14   0.65 46675
## 15   0.70 48774
## 16   0.75 51035
## 17   0.80 53456
## 18   0.85 56035
## 19   0.90 58769
## 20   0.95 61655
## 21   1.00 64690
# datatable(paperDF, class = 'cell-border stripe', rownames = FALSE, caption = 'Table 1: Spectrum of alphas and SSEs')

plot(paperDF)

An alpha of about 0.20 minimizes the within-sample SSE for the paperback series. Larger values of alpha put more weight on recent observations, so the one-step forecasts react more quickly to the latest data; smaller values produce smoother, more stable forecasts.
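On the aside in the code comment above: for loops themselves are fine in R; the usual objection is to growing vectors inside the loop. A minimal sapply version of the same sweep, using the same paperback.ts, would produce an identical data frame:

# Same alpha sweep without growing vectors inside a loop
alphas <- seq(0, 1, 0.05)
SSEs <- sapply(alphas, function(a) {
  ses(paperback.ts, initial = "simple", alpha = a, h = 1)$model$SSE
})
paperDF <- data.frame(alphas, SSEs = round(SSEs))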

c. Now let ses select the optimal value of alpha. Use this value to generate forecasts for the next four days. Compare your results with part (b).

optimal_paper <- ses(paperback.ts, initial = "simple", h = 4)
summary(optimal_paper)
## 
## Forecast method: Simple exponential smoothing
## 
## Model Information:
## Simple exponential smoothing 
## 
## Call:
##  ses(y = paperback.ts, h = 4, initial = "simple") 
## 
##   Smoothing parameters:
##     alpha = 0.2125 
## 
##   Initial states:
##     l = 199 
## 
##   sigma:  34.7918
## Error measures:
##                    ME     RMSE      MAE       MPE     MAPE      MASE
## Training set 1.749509 34.79175 28.64424 -2.770157 16.56938 0.7852414
##                    ACF1
## Training set -0.1268119
## 
## Forecasts:
##          Point Forecast    Lo 80    Hi 80    Lo 95    Hi 95
## 5.285714       210.1537 165.5663 254.7411 141.9631 278.3443
## 5.428571       210.1537 164.5706 255.7368 140.4404 279.8671
## 5.571429       210.1537 163.5962 256.7112 138.9501 281.3573
## 5.714286       210.1537 162.6418 257.6657 137.4905 282.8170
accuracy(optimal_paper)
##                    ME     RMSE      MAE       MPE     MAPE      MASE
## Training set 1.749509 34.79175 28.64424 -2.770157 16.56938 0.7852414
##                    ACF1
## Training set -0.1268119

ses selects an optimal alpha close to what I guessed from the chart: I estimated an alpha of 0.2 for the paperback series, and ses selected an optimal alpha of 0.2125.
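All four point forecasts are identical because simple exponential smoothing has a flat forecast function. As a reference note (these are the standard SES equations, not output from the model above):

$$\ell_t = \alpha y_t + (1-\alpha)\,\ell_{t-1}, \qquad \hat{y}_{T+h|T} = \ell_T$$

Every horizon is forecast with the final smoothed level, here 210.1537.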

d. Repeat but with initial="optimal". How much difference does an optimal initial level make?

optimal_start_paper <- ses(paperback.ts, initial = "optimal", h = 4)
summary(optimal_start_paper)
## 
## Forecast method: Simple exponential smoothing
## 
## Model Information:
## Simple exponential smoothing 
## 
## Call:
##  ses(y = paperback.ts, h = 4, initial = "optimal") 
## 
##   Smoothing parameters:
##     alpha = 0.1685 
## 
##   Initial states:
##     l = 170.8257 
## 
##   sigma:  33.6377
## 
##      AIC     AICc      BIC 
## 318.9747 319.8978 323.1783 
## 
## Error measures:
##                    ME     RMSE     MAE       MPE     MAPE      MASE
## Training set 7.176212 33.63769 27.8431 0.4737524 15.57782 0.7632792
##                    ACF1
## Training set -0.2117579
## 
## Forecasts:
##          Point Forecast    Lo 80    Hi 80    Lo 95    Hi 95
## 5.285714       207.1098 164.0013 250.2182 141.1811 273.0384
## 5.428571       207.1098 163.3934 250.8261 140.2513 273.9682
## 5.571429       207.1098 162.7937 251.4258 139.3342 274.8853
## 5.714286       207.1098 162.2021 252.0174 138.4294 275.7901
accuracy(optimal_start_paper)
##                    ME     RMSE     MAE       MPE     MAPE      MASE
## Training set 7.176212 33.63769 27.8431 0.4737524 15.57782 0.7632792
##                    ACF1
## Training set -0.2117579

Estimating the initial level optimally (rather than pinning it to the first observation) improves the fitted model, lowering the MAPE from 16.57% to 15.58%. The RMSE and MAE are also superior for the paperback series. Interestingly, the fitted alpha ends up at 0.169 with the optimal initialization, compared to 0.213 with the simple start.
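To make the role of the initial level concrete, here is a minimal hand-rolled sketch of the SES one-step SSE (ses_sse is my own illustrative helper, not part of the forecast package):

# One-step SSE for SES: the fitted value at time t is the level carried
# in from t-1; then l_t = alpha * y_t + (1 - alpha) * l_{t-1}
ses_sse <- function(y, alpha, l0) {
  l <- l0
  sse <- 0
  for (t in seq_along(y)) {
    sse <- sse + (y[t] - l)^2
    l <- alpha * y[t] + (1 - alpha) * l
  }
  sse
}
# With the "simple" start (l0 = first observation) and alpha = 0.2125 from
# part (c), this should land near the SSE values tabulated in part (b)
ses_sse(as.numeric(paperback.ts), 0.2125, paperback.ts[1])

With initial = "optimal", ses instead estimates l0 (jointly with alpha) by optimization rather than fixing it this way.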

e. Repeat steps (b)-(d) with the hardcover series.

alphas <- numeric()
SSEs <- numeric()

# Same alpha sweep as in part (b), now for the hardcover series
# (see the sapply alternative sketched there).

for (i in seq(0, 1, 0.05)) {
  alphas <- c(alphas, i)
  s <- ses(hardcover.ts, initial = "simple", alpha = i, h = 4)
  SSEs <- c(SSEs, round(s$model$SSE))
}


hardDF <- data.frame(alphas, SSEs)

hardDF
##    alphas   SSEs
## 1    0.00 154503
## 2    0.05  70483
## 3    0.10  45715
## 4    0.15  36814
## 5    0.20  33148
## 6    0.25  31554
## 7    0.30  30910
## 8    0.35  30758
## 9    0.40  30895
## 10   0.45  31224
## 11   0.50  31703
## 12   0.55  32314
## 13   0.60  33060
## 14   0.65  33948
## 15   0.70  34994
## 16   0.75  36217
## 17   0.80  37642
## 18   0.85  39295
## 19   0.90  41210
## 20   0.95  43423
## 21   1.00  45982
# datatable(hardDF, class = 'cell-border stripe', rownames = FALSE, caption = 'Table 2: Spectrum of alphas and SSEs')

plot(hardDF)

The lowest SSE for the hardcover series occurs around alpha = 0.35.

optimal_hard <- ses(hardcover.ts, initial = "simple", h = 4)
summary(optimal_hard)
## 
## Forecast method: Simple exponential smoothing
## 
## Model Information:
## Simple exponential smoothing 
## 
## Call:
##  ses(y = hardcover.ts, h = 4, initial = "simple") 
## 
##   Smoothing parameters:
##     alpha = 0.3473 
## 
##   Initial states:
##     l = 139 
## 
##   sigma:  32.0198
## Error measures:
##                    ME     RMSE      MAE      MPE     MAPE      MASE
## Training set 9.729514 32.01982 26.34467 3.104208 13.05063 0.6885538
##                    ACF1
## Training set -0.1629043
## 
## Forecasts:
##          Point Forecast    Lo 80    Hi 80    Lo 95    Hi 95
## 5.285714       240.3808 199.3457 281.4158 177.6231 303.1385
## 5.428571       240.3808 196.9410 283.8206 173.9453 306.8162
## 5.571429       240.3808 194.6625 286.0990 170.4608 310.3008
## 5.714286       240.3808 192.4924 288.2692 167.1418 313.6197

For the hardcover series, I estimated an alpha of 0.35 from the chart, and ses selected an optimal alpha of 0.3473.

optimal_start_hard <- ses(hardcover.ts, initial = "optimal", h = 4)
summary(optimal_start_hard)
## 
## Forecast method: Simple exponential smoothing
## 
## Model Information:
## Simple exponential smoothing 
## 
## Call:
##  ses(y = hardcover.ts, h = 4, initial = "optimal") 
## 
##   Smoothing parameters:
##     alpha = 0.3283 
## 
##   Initial states:
##     l = 149.2836 
## 
##   sigma:  31.931
## 
##      AIC     AICc      BIC 
## 315.8506 316.7737 320.0542 
## 
## Error measures:
##                    ME     RMSE     MAE      MPE     MAPE      MASE
## Training set 9.166918 31.93101 26.7731 2.636328 13.39479 0.6997513
##                    ACF1
## Training set -0.1417817
## 
## Forecasts:
##          Point Forecast    Lo 80    Hi 80    Lo 95    Hi 95
## 5.285714       239.5602 198.6390 280.4815 176.9766 302.1439
## 5.428571       239.5602 196.4905 282.6299 173.6908 305.4297
## 5.571429       239.5602 194.4443 284.6762 170.5613 308.5591
## 5.714286       239.5602 192.4869 286.6336 167.5677 311.5527

In the case of the hardcover series, there does not appear to be a clear advantage to estimating the initial level optimally. While the RMSE is slightly lower, the MAPE, MAE, and MASE are slightly higher. The fitted alpha went from 0.347 with the "simple" initialization to 0.328 with the "optimal" one.

2.

a. Apply Holt’s linear method to the paperback and hardback series and compute four-day forecasts in each case.

paper_H <- holt(paperback.ts, h = 4)
summary(paper_H)
## 
## Forecast method: Holt's method
## 
## Model Information:
## Holt's method 
## 
## Call:
##  holt(y = paperback.ts, h = 4) 
## 
##   Smoothing parameters:
##     alpha = 1e-04 
##     beta  = 1e-04 
## 
##   Initial states:
##     l = 174.6003 
##     b = 1.0606 
## 
##   sigma:  31.6618
## 
##      AIC     AICc      BIC 
## 319.3427 321.8427 326.3486 
## 
## Error measures:
##                     ME     RMSE     MAE       MPE     MAPE      MASE
## Training set -4.458116 31.66184 26.4503 -6.029264 15.84609 0.7250976
##                    ACF1
## Training set -0.1444935
## 
## Forecasts:
##          Point Forecast    Lo 80    Hi 80    Lo 95    Hi 95
## 5.285714       207.0380 166.4617 247.6143 144.9820 269.0941
## 5.428571       208.0853 167.5090 248.6615 146.0292 270.1413
## 5.571429       209.1325 168.5562 249.7088 147.0764 271.1886
## 5.714286       210.1797 169.6034 250.7560 148.1236 272.2358
hard_H <- holt(hardcover.ts, h = 4)
summary(hard_H)
## 
## Forecast method: Holt's method
## 
## Model Information:
## Holt's method 
## 
## Call:
##  holt(y = hardcover.ts, h = 4) 
## 
##   Smoothing parameters:
##     alpha = 1e-04 
##     beta  = 1e-04 
## 
##   Initial states:
##     l = 154.6543 
##     b = 2.9961 
## 
##   sigma:  27.4359
## 
##     AIC    AICc     BIC 
## 310.747 313.247 317.753 
## 
## Error measures:
##                     ME     RMSE      MAE       MPE     MAPE      MASE
## Training set -2.169026 27.43588 23.34589 -3.402396 12.47686 0.6101765
##                     ACF1
## Training set -0.02685637
## 
## Forecasts:
##          Point Forecast    Lo 80    Hi 80    Lo 95    Hi 95
## 5.285714       247.3504 212.1899 282.5109 193.5770 301.1237
## 5.428571       250.3400 215.1795 285.5005 196.5667 304.1133
## 5.571429       253.3296 218.1691 288.4901 199.5563 307.1030
## 5.714286       256.3192 221.1587 291.4797 202.5459 310.0926
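A quick sanity check that the Holt forecasts grow linearly: with alpha and beta both essentially zero, each model is close to a deterministic linear trend, so the day-over-day forecast increments should equal the final trend state (near the initial slopes b reported above):

# Successive differences of the point forecasts recover the trend step
diff(paper_H$mean)
diff(hard_H$mean)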

Compare the SSE measures of Holt’s method for the two series to those of simple exponential smoothing in the previous question. Discuss the merits of the two forecasting methods for these data sets.

# SSE of the SES fits (sum of squared residuals, or the model's stored SSE)
sum(residuals(optimal_start_paper)^2)
## [1] 33944.82
optimal_hard$model$SSE
## [1] 30758.07
# SSE of the Holt fits
sum(residuals(paper_H)^2)
## [1] 30074.17
sum(residuals(hard_H)^2)
## [1] 22581.83

Holt's linear method has the lowest SSE for both the paperback and the hardcover series. This is as expected, because the Holt model can account for the trend present in both series.
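For reference, the standard textbook form of Holt's method (not output from these fits) adds a trend component to the SES level update:

$$\ell_t = \alpha y_t + (1-\alpha)(\ell_{t-1} + b_{t-1}), \qquad b_t = \beta^*(\ell_t - \ell_{t-1}) + (1-\beta^*)\,b_{t-1}, \qquad \hat{y}_{t+h|t} = \ell_t + h\,b_t$$

The h * b_t term is what lets the forecasts climb with the trend, which the flat SES forecast function cannot do.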

b. Compare the forecasts for the two series using both methods. Which do you think is best?

autoplot(paperback.ts) + xlab("Day") +
  autolayer(optimal_paper, PI = FALSE, series = "SES simple") +
  autolayer(optimal_start_paper, PI = FALSE, series = "SES optimal") +
  autolayer(paper_H, PI = FALSE, series = "Holt") +
  ggtitle("Paperback series with Forecasts")

accuracy(optimal_paper)
##                    ME     RMSE      MAE       MPE     MAPE      MASE
## Training set 1.749509 34.79175 28.64424 -2.770157 16.56938 0.7852414
##                    ACF1
## Training set -0.1268119
accuracy(optimal_start_paper)
##                    ME     RMSE     MAE       MPE     MAPE      MASE
## Training set 7.176212 33.63769 27.8431 0.4737524 15.57782 0.7632792
##                    ACF1
## Training set -0.2117579
accuracy(paper_H)
##                     ME     RMSE     MAE       MPE     MAPE      MASE
## Training set -4.458116 31.66184 26.4503 -6.029264 15.84609 0.7250976
##                    ACF1
## Training set -0.1444935
autoplot(hardcover.ts) + xlab("Day") +
  autolayer(optimal_hard, PI = FALSE, series = "SES simple") +
  autolayer(optimal_start_hard, PI = FALSE, series = "SES optimal") +
  autolayer(hard_H, PI = FALSE, series = "Holt") +
  ggtitle("Hardcover series with Forecasts")

accuracy(optimal_hard)
##                    ME     RMSE      MAE      MPE     MAPE      MASE
## Training set 9.729514 32.01982 26.34467 3.104208 13.05063 0.6885538
##                    ACF1
## Training set -0.1629043
accuracy(optimal_start_hard)
##                    ME     RMSE     MAE      MPE     MAPE      MASE
## Training set 9.166918 31.93101 26.7731 2.636328 13.39479 0.6997513
##                    ACF1
## Training set -0.1417817
accuracy(hard_H)
##                     ME     RMSE      MAE       MPE     MAPE      MASE
## Training set -2.169026 27.43588 23.34589 -3.402396 12.47686 0.6101765
##                     ACF1
## Training set -0.02685637

I think the Holt linear forecasts are best for both series, because they account for the upward trend and have superior fit metrics.

c. Calculate a 95% prediction interval for the first forecast for each series using both methods, assuming normal errors. Compare your forecasts with those produced by R.

autoplot(paperback.ts) + xlab("Day") +
  autolayer(optimal_paper, PI = TRUE, series = "SES simple") +
  autolayer(optimal_start_paper, PI = TRUE, series = "SES optimal") +
  autolayer(paper_H, PI = TRUE, series = "Holt") +
  ggtitle("Paperback series with 95% Prediction Interval Forecasts")

autoplot(hardcover.ts) + xlab("Day") +
  autolayer(optimal_hard, PI = TRUE, series = "SES simple") +
  autolayer(optimal_start_hard, PI = TRUE, series = "SES optimal") +
  autolayer(hard_H, PI = TRUE, series = "Holt") +
  ggtitle("Hardcover series with 95% Prediction Interval Forecasts")

These prediction intervals are justifiably wide: both series are volatile, with short histories of only 30 observations each.
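For the hand calculation the prompt asks for, a minimal sketch using the Holt objects (the ses objects work the same way). Here 1.96 is the normal 97.5% quantile, and sigma is estimated by the residual standard deviation, so the results should be close to, but not exactly equal to, R's own intervals, which use the model's fitted sigma:

# 95% interval by hand for the first forecast: point +/- 1.96 * residual sd
paper_H$mean[1] + c(-1.96, 1.96) * sd(residuals(paper_H))
hard_H$mean[1] + c(-1.96, 1.96) * sd(residuals(hard_H))
# R's 95% bounds for the same forecasts, for comparison
c(paper_H$lower[1, "95%"], paper_H$upper[1, "95%"])
c(hard_H$lower[1, "95%"], hard_H$upper[1, "95%"])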