library(ggplot2)
library(forecast)
library(fma)
library(fpp)

Chapter 7 Problem 1

Data set books contains the daily sales of paperback and hardcover books at the same store. The task is to forecast the next four days’ sales for each series.

a. Plot the series and discuss the main features of the data.

First we look at the first few observations of the data.

head(books)
## Time Series:
## Start = 1 
## End = 6 
## Frequency = 1 
##   Paperback Hardcover
## 1       199       139
## 2       172       128
## 3       111       172
## 4       209       139
## 5       161       191
## 6       119       168

Next we extract the paperback and hardcover columns as separate time series and plot the data.

pb <- ts(books[, 1])
hc <- ts(books[, 2])

autoplot(books) + xlab("Day")

Both series show an upward trend, but any seasonality is unclear. Because the series have a frequency of 1, the decompose function won’t work directly. If we assume the seasonality is weekly, we can set the frequency to 7 and look for a pattern.

books2 <- ts(books, frequency=7)
autoplot(decompose(books2[, 1]))

autoplot(decompose(books2[, 2]))

Assuming weekly seasonality, the 30 days of data give us a little over four “weeks”, and in the decompositions both paperback and hardcover show an apparent weekly pattern.
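
As an additional check, a subseries plot groups the observations by position within the assumed weekly period (a sketch using forecast’s ggsubseriesplot; the grouping is only meaningful if the weekly assumption holds):

ggsubseriesplot(books2[, 1]) + ggtitle("Paperback: subseries over the assumed weekly period")
ggsubseriesplot(books2[, 2]) + ggtitle("Hardcover: subseries over the assumed weekly period")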

b. Use simple exponential smoothing with the ses function (setting initial=“simple”) and explore different values of α for the paperback series. Record the within-sample SSE for the one-step forecasts. Plot SSE against α and find which value of α works best. What is the effect of α on the forecasts?

# Try alpha values from 0 to 1 in steps of 0.05 and record the
# within-sample SSE of the one-step forecasts for each value
alphas <- seq(0, 1, by = 0.05)
SSEs <- sapply(alphas, function(a) {
  ses(pb, initial = "simple", alpha = a, h = 4)$model$SSE
})

# Combine them into a dataframe and plot
pbDF <- data.frame(alphas, SSEs)
pbDF
##    alphas     SSEs
## 1    0.00 41270.00
## 2    0.05 39244.90
## 3    0.10 37785.20
## 4    0.15 36738.44
## 5    0.20 36329.34
## 6    0.25 36438.42
## 7    0.30 36930.75
## 8    0.35 37715.63
## 9    0.40 38738.40
## 10   0.45 39967.19
## 11   0.50 41383.70
## 12   0.55 42977.49
## 13   0.60 44742.62
## 14   0.65 46675.48
## 15   0.70 48773.56
## 16   0.75 51034.58
## 17   0.80 53456.14
## 18   0.85 56035.45
## 19   0.90 58769.45
## 20   0.95 61655.09
## 21   1.00 64690.00
plot(pbDF)

Looking at the plot and the table, the SSE reaches its minimum at alpha = 0.20 and climbs steadily for larger values. Larger alpha weights recent observations more heavily, so the forecasts react more strongly to the most recent data.
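
We can also pull the grid minimum out programmatically rather than eyeballing the plot:

# Alpha with the smallest within-sample SSE on the 0.05 grid
pbDF$alphas[which.min(pbDF$SSEs)]  # 0.20, per the table above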

c. Now let ses select the optimal value of α. Use this value to generate forecasts for the next four days. Compare your results with those from (b).

optAlphaSimple <- ses(pb, initial="simple", h=4)
summary(optAlphaSimple)
## 
## Forecast method: Simple exponential smoothing
## 
## Model Information:
## Simple exponential smoothing 
## 
## Call:
##  ses(y = pb, h = 4, initial = "simple") 
## 
##   Smoothing parameters:
##     alpha = 0.2125 
## 
##   Initial states:
##     l = 199 
## 
##   sigma:  34.7918
## Error measures:
##                    ME     RMSE      MAE       MPE     MAPE      MASE
## Training set 1.749509 34.79175 28.64424 -2.770157 16.56938 0.7223331
##                    ACF1
## Training set -0.1268119
## 
## Forecasts:
##    Point Forecast    Lo 80    Hi 80    Lo 95    Hi 95
## 31       210.1537 165.5663 254.7411 141.9631 278.3443
## 32       210.1537 164.5706 255.7368 140.4404 279.8671
## 33       210.1537 163.5962 256.7112 138.9501 281.3573
## 34       210.1537 162.6418 257.6657 137.4905 282.8170

We can see that ses chooses an alpha of 0.2125, which agrees with the 0.2 we picked from the plot.
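
If we wanted the grid search itself to get closer to 0.2125, a finer grid around 0.2 would do it (a sketch, not run here):

alphasFine <- seq(0.15, 0.25, by = 0.005)
SSEsFine <- sapply(alphasFine, function(a) {
  ses(pb, initial = "simple", alpha = a, h = 4)$model$SSE
})
alphasFine[which.min(SSEsFine)]  # should land near the 0.2125 found by ses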

d. Repeat but with initial=“optimal”. How much difference does an optimal initial level make?

optAlphaOpt <- ses(pb, initial="optimal", h=4)
summary(optAlphaOpt)
## 
## Forecast method: Simple exponential smoothing
## 
## Model Information:
## Simple exponential smoothing 
## 
## Call:
##  ses(y = pb, h = 4, initial = "optimal") 
## 
##   Smoothing parameters:
##     alpha = 0.1685 
## 
##   Initial states:
##     l = 170.8257 
## 
##   sigma:  33.6377
## 
##      AIC     AICc      BIC 
## 318.9747 319.8978 323.1783 
## 
## Error measures:
##                    ME     RMSE     MAE       MPE     MAPE      MASE
## Training set 7.176212 33.63769 27.8431 0.4737524 15.57782 0.7021303
##                    ACF1
## Training set -0.2117579
## 
## Forecasts:
##    Point Forecast    Lo 80    Hi 80    Lo 95    Hi 95
## 31       207.1098 164.0013 250.2182 141.1811 273.0384
## 32       207.1098 163.3934 250.8261 140.2513 273.9682
## 33       207.1098 162.7937 251.4258 139.3342 274.8853
## 34       207.1098 162.2021 252.0174 138.4294 275.7901

Here the alpha is 0.1685, a bit lower than the initial choice of 0.2. Optimizing the initial level (170.83 rather than the first observation, 199) shifts the point forecasts by only about 3 units (207.11 vs. 210.15) and trims the training RMSE from 34.79 to 33.64, so the difference is modest.
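
We can also put the two sets of point forecasts side by side (values as reported in the summaries above):

rbind(simple = optAlphaSimple$mean, optimal = optAlphaOpt$mean)
# simple: 210.15 for all four days; optimal: 207.11 -- about 3 units apart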

e. Repeat steps (b)-(d) with the hardcover series.

# Try alpha values from 0 to 1 in steps of 0.05 and record the
# within-sample SSE of the one-step forecasts for each value
alphasH <- seq(0, 1, by = 0.05)
SSEsH <- sapply(alphasH, function(a) {
  ses(hc, initial = "simple", alpha = a, h = 4)$model$SSE
})

# Combine them into a dataframe and plot
hcDF <- data.frame(alphasH, SSEsH)
hcDF
##    alphasH     SSEsH
## 1     0.00 154503.00
## 2     0.05  70483.46
## 3     0.10  45714.82
## 4     0.15  36814.18
## 5     0.20  33148.16
## 6     0.25  31553.85
## 7     0.30  30909.69
## 8     0.35  30758.47
## 9     0.40  30895.27
## 10    0.45  31224.35
## 11    0.50  31702.60
## 12    0.55  32314.49
## 13    0.60  33059.93
## 14    0.65  33948.03
## 15    0.70  34993.95
## 16    0.75  36217.34
## 17    0.80  37641.79
## 18    0.85  39295.05
## 19    0.90  41209.53
## 20    0.95  43423.39
## 21    1.00  45982.00
plot(hcDF)

Looking at the plot and the table, the minimum is at alpha = 0.35.
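
As before, we can confirm the grid minimum programmatically:

# Alpha with the smallest within-sample SSE on the 0.05 grid
hcDF$alphasH[which.min(hcDF$SSEsH)]  # 0.35, per the table above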

optAlphaSimpleH<- ses(hc, initial="simple", h=4)
summary(optAlphaSimpleH)
## 
## Forecast method: Simple exponential smoothing
## 
## Model Information:
## Simple exponential smoothing 
## 
## Call:
##  ses(y = hc, h = 4, initial = "simple") 
## 
##   Smoothing parameters:
##     alpha = 0.3473 
## 
##   Initial states:
##     l = 139 
## 
##   sigma:  32.0198
## Error measures:
##                   ME     RMSE      MAE      MPE     MAPE      MASE
## Training set 9.72952 32.01982 26.34467 3.104211 13.05063 0.7860035
##                    ACF1
## Training set -0.1629042
## 
## Forecasts:
##    Point Forecast    Lo 80    Hi 80    Lo 95    Hi 95
## 31       240.3808 199.3457 281.4158 177.6231 303.1385
## 32       240.3808 196.9410 283.8206 173.9453 306.8162
## 33       240.3808 194.6625 286.0990 170.4608 310.3008
## 34       240.3808 192.4924 288.2691 167.1418 313.6197

This alpha of 0.3473 is close to the grid estimate of 0.35.

optAlphaOptH <- ses(hc, initial="optimal", h=4)
summary(optAlphaOptH)
## 
## Forecast method: Simple exponential smoothing
## 
## Model Information:
## Simple exponential smoothing 
## 
## Call:
##  ses(y = hc, h = 4, initial = "optimal") 
## 
##   Smoothing parameters:
##     alpha = 0.3283 
## 
##   Initial states:
##     l = 149.2836 
## 
##   sigma:  31.931
## 
##      AIC     AICc      BIC 
## 315.8506 316.7737 320.0542 
## 
## Error measures:
##                    ME     RMSE     MAE      MPE     MAPE      MASE
## Training set 9.166918 31.93101 26.7731 2.636328 13.39479 0.7987858
##                    ACF1
## Training set -0.1417817
## 
## Forecasts:
##    Point Forecast    Lo 80    Hi 80    Lo 95    Hi 95
## 31       239.5602 198.6390 280.4815 176.9766 302.1439
## 32       239.5602 196.4905 282.6299 173.6908 305.4297
## 33       239.5602 194.4443 284.6762 170.5613 308.5591
## 34       239.5602 192.4869 286.6336 167.5677 311.5527

With an optimal initial level, the alpha of 0.3283 is a little lower than both the grid estimate of 0.35 and the 0.3473 from the simple-initialization model.

Chapter 7 Problem 2

Apply Holt’s linear method to the paperback and hardback series and compute four-day forecasts in each case.

(a) Compare the SSE measures of Holt’s method for the two series to those of simple exponential smoothing in the previous question. Discuss the merits of the two forecasting methods for these data sets.

pbH <- holt(pb, initial="simple", h=4)
summary(pbH)
## 
## Forecast method: Holt's method
## 
## Model Information:
## Holt's method 
## 
## Call:
##  holt(y = pb, h = 4, initial = "simple") 
## 
##   Smoothing parameters:
##     alpha = 0.2984 
##     beta  = 0.4984 
## 
##   Initial states:
##     l = 199 
##     b = -27 
## 
##   sigma:  39.5463
## Error measures:
##                    ME     RMSE     MAE      MPE     MAPE      MASE
## Training set 7.769844 39.54634 33.5377 1.633306 18.19621 0.8457332
##                    ACF1
## Training set -0.1088681
## 
## Forecasts:
##    Point Forecast    Lo 80    Hi 80     Lo 95    Hi 95
## 31       222.0201 171.3394 272.7007 144.51068 299.5295
## 32       229.6904 164.8872 294.4935 130.58245 328.7983
## 33       237.3606 145.1175 329.6038  96.28696 378.4343
## 34       245.0309 115.5211 374.5407  46.96280 443.0991
hcH <- holt(hc, initial="simple", h=4)
summary(hcH)
## 
## Forecast method: Holt's method
## 
## Model Information:
## Holt's method 
## 
## Call:
##  holt(y = hc, h = 4, initial = "simple") 
## 
##   Smoothing parameters:
##     alpha = 0.439 
##     beta  = 0.1574 
## 
##   Initial states:
##     l = 139 
##     b = -11 
## 
##   sigma:  35.0438
## Error measures:
##                    ME     RMSE      MAE      MPE     MAPE      MASE
## Training set 7.193267 35.04383 27.99174 2.423793 14.18241 0.8351445
##                     ACF1
## Training set -0.07743714
## 
## Forecasts:
##    Point Forecast    Lo 80    Hi 80    Lo 95    Hi 95
## 31       250.7889 205.8784 295.6993 182.1042 319.4735
## 32       254.7003 202.4087 306.9918 174.7273 334.6733
## 33       258.6117 196.3181 320.9052 163.3419 353.8815
## 34       262.5231 187.9903 337.0558 148.5350 376.5111

Next we compare the within-sample SSE of the three paperback models. (Where the fitted model object does not store an SSE component, we sum the squared residuals instead.)

# Paperback Holt
sum(residuals(pbH)^2)
## [1] 46917.39
# Paperback ses simple
optAlphaSimple$model$SSE
## [1] 36313.98
# Paperback ses optimal
sum(residuals(optAlphaOpt)^2)
## [1] 33944.82

The simple exponential smoothing model with the parameter initial=“optimal” has the lowest SSE for the paperback time series.

Now we do the same for the hardcover series.

# Hardcover Holt
sum(residuals(hcH)^2)
## [1] 36842.1
# Hardcover ses simple
optAlphaSimpleH$model$SSE
## [1] 30758.07
# Hardcover ses optimal
sum(residuals(optAlphaOptH)^2)
## [1] 30587.69

Here we see that for the hardcover series as well, the SES model with an optimal initial level has the lowest SSE.

(b) Compare the forecasts for the two series using both methods. Which do you think is best?

We will plot the three forecasts for paperback.

autoplot(pb) + xlab("Day") +
  autolayer(optAlphaSimple, PI=FALSE, series="SES simple") +
  autolayer(optAlphaOpt, PI=FALSE, series="SES optimal") +
  autolayer(pbH, series="Holt", PI=FALSE)

We will check the accuracy of the three forecasts for paperback. (Passing the fitted values and the data separately makes accuracy() label the rows “Test set”, but these are in-sample measures.)

# SES Simple
accuracy(optAlphaSimple$fitted, pb)
##                ME     RMSE      MAE       MPE     MAPE       ACF1
## Test set 1.749509 34.79175 28.64424 -2.770157 16.56938 -0.1268119
##          Theil's U
## Test set 0.6807692
# SES Optimal
accuracy(optAlphaOpt$fitted, pb)
##                ME     RMSE     MAE       MPE     MAPE       ACF1 Theil's U
## Test set 7.176212 33.63769 27.8431 0.4737524 15.57782 -0.2117579 0.6685721
# Holt Linear Trend
accuracy(pbH$fitted, pb)
##                ME     RMSE     MAE      MPE     MAPE       ACF1 Theil's U
## Test set 7.769844 39.54634 33.5377 1.633306 18.19621 -0.1088681 0.8763663

Here we see the SES optimal forecast has the lowest RMSE and MAE.

Next we will look at the hardcover forecasts.

autoplot(hc) + xlab("Day") +
  autolayer(optAlphaSimpleH, PI=FALSE, series="SES simple") +
  autolayer(optAlphaOptH, PI=FALSE, series="SES optimal") +
  autolayer(hcH, series="Holt", PI=FALSE)

We then check the accuracy of the three forecasts.

# SES Simple
accuracy(optAlphaSimpleH$fitted, hc)
##               ME     RMSE      MAE      MPE     MAPE       ACF1 Theil's U
## Test set 9.72952 32.01982 26.34467 3.104211 13.05063 -0.1629042 0.8142204
# SES Optimal
accuracy(optAlphaOptH$fitted, hc)
##                ME     RMSE     MAE      MPE     MAPE       ACF1 Theil's U
## Test set 9.166918 31.93101 26.7731 2.636328 13.39479 -0.1417817 0.8059213
# Holt Linear Trend
accuracy(hcH$fitted, hc)
##                ME     RMSE      MAE      MPE     MAPE        ACF1
## Test set 7.193267 35.04383 27.99174 2.423793 14.18241 -0.07743714
##          Theil's U
## Test set 0.9150588

We see that the simple SES model has the lowest MAE while the optimal model has the lowest RMSE. The two SES forecasts are nearly identical, and either one outperforms Holt’s method on this series.

(c) Calculate a 95% prediction interval for the first forecast for each series using both methods, assuming normal errors. Compare your forecasts with those produced by R.

We will generate only the first forecast (h=1) and look at the prediction intervals, using the optimal SES method and Holt’s method.

pbSESOpt <- ses(pb, initial="optimal", h=1)
summary(pbSESOpt)
## 
## Forecast method: Simple exponential smoothing
## 
## Model Information:
## Simple exponential smoothing 
## 
## Call:
##  ses(y = pb, h = 1, initial = "optimal") 
## 
##   Smoothing parameters:
##     alpha = 0.1685 
## 
##   Initial states:
##     l = 170.8257 
## 
##   sigma:  33.6377
## 
##      AIC     AICc      BIC 
## 318.9747 319.8978 323.1783 
## 
## Error measures:
##                    ME     RMSE     MAE       MPE     MAPE      MASE
## Training set 7.176212 33.63769 27.8431 0.4737524 15.57782 0.7021303
##                    ACF1
## Training set -0.2117579
## 
## Forecasts:
##    Point Forecast    Lo 80    Hi 80    Lo 95    Hi 95
## 31       207.1098 164.0013 250.2182 141.1811 273.0384
pbHolt <- holt(pb, initial="simple", h=1)
summary(pbHolt)
## 
## Forecast method: Holt's method
## 
## Model Information:
## Holt's method 
## 
## Call:
##  holt(y = pb, h = 1, initial = "simple") 
## 
##   Smoothing parameters:
##     alpha = 0.2984 
##     beta  = 0.4984 
## 
##   Initial states:
##     l = 199 
##     b = -27 
## 
##   sigma:  39.5463
## Error measures:
##                    ME     RMSE     MAE      MPE     MAPE      MASE
## Training set 7.769844 39.54634 33.5377 1.633306 18.19621 0.8457332
##                    ACF1
## Training set -0.1088681
## 
## Forecasts:
##    Point Forecast    Lo 80    Hi 80    Lo 95    Hi 95
## 31       222.0201 171.3394 272.7007 144.5107 299.5295
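
The exercise asks us to compute the interval by hand assuming normal errors, i.e. point forecast ± 1.96σ. A small helper does this; as a sketch, we take σ to be the training RMSE, which is what these ses/holt fits report as sigma:

pi95 <- function(fc) {
  s <- sqrt(mean(residuals(fc)^2))   # training RMSE, used as sigma
  c(lower = fc$mean[1] - 1.96 * s,
    point = fc$mean[1],
    upper = fc$mean[1] + 1.96 * s)
}
rbind(SES = pi95(pbSESOpt), Holt = pi95(pbHolt))
# SES:  207.11 +/- 65.93 -> (141.18, 273.04)
# Holt: 222.02 +/- 77.51 -> (144.51, 299.53)
# Both match R's Lo 95 / Hi 95 columns above to within rounding.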

For SES the 95% interval is roughly 141 to 273, compared with 144 to 299 for Holt’s method; the hand calculation matches R’s output.

hcSESOpt <- ses(hc, initial="optimal", h=1)
summary(hcSESOpt)
## 
## Forecast method: Simple exponential smoothing
## 
## Model Information:
## Simple exponential smoothing 
## 
## Call:
##  ses(y = hc, h = 1, initial = "optimal") 
## 
##   Smoothing parameters:
##     alpha = 0.3283 
## 
##   Initial states:
##     l = 149.2836 
## 
##   sigma:  31.931
## 
##      AIC     AICc      BIC 
## 315.8506 316.7737 320.0542 
## 
## Error measures:
##                    ME     RMSE     MAE      MPE     MAPE      MASE
## Training set 9.166918 31.93101 26.7731 2.636328 13.39479 0.7987858
##                    ACF1
## Training set -0.1417817
## 
## Forecasts:
##    Point Forecast   Lo 80    Hi 80    Lo 95    Hi 95
## 31       239.5602 198.639 280.4815 176.9766 302.1439
hcHolt <- holt(hc, initial="simple", h=1)
summary(hcHolt)
## 
## Forecast method: Holt's method
## 
## Model Information:
## Holt's method 
## 
## Call:
##  holt(y = hc, h = 1, initial = "simple") 
## 
##   Smoothing parameters:
##     alpha = 0.439 
##     beta  = 0.1574 
## 
##   Initial states:
##     l = 139 
##     b = -11 
## 
##   sigma:  35.0438
## Error measures:
##                    ME     RMSE      MAE      MPE     MAPE      MASE
## Training set 7.193267 35.04383 27.99174 2.423793 14.18241 0.8351445
##                     ACF1
## Training set -0.07743714
## 
## Forecasts:
##    Point Forecast    Lo 80    Hi 80    Lo 95    Hi 95
## 31       250.7889 205.8784 295.6993 182.1042 319.4735
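
Applying the same hand calculation (the pi95 helper defined above) to the hardcover fits:

rbind(SES = pi95(hcSESOpt), Holt = pi95(hcHolt))
# SES:  239.56 +/- 62.58 -> (176.98, 302.14)
# Holt: 250.79 +/- 68.69 -> (182.10, 319.47)
# Again these agree with R's Lo 95 / Hi 95 above.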

Here the SES method gives a 95% interval of roughly 177 to 302 for hardcovers, and Holt’s method 182 to 319. For both series the hand-calculated intervals reproduce R’s, confirming that the intervals assume normally distributed errors.