Question 2

Relationship between Moving Average and Exponential Smoothing: Assume that we apply a moving average to a series, using a very short window span. If we wanted to achieve an equivalent result using simple exponential smoothing, what value should the smoothing constant take?

In simple exponential smoothing, the smoothing constant is between 0 and 1. The closer the constant is to 1, the less impact the smoothing has so it’s closer to the original sample set. In MA, the process is just taking an average of window span of n. For very small windows, 1/n comes close to an equivalent result between MA and SES. As the window gets wider, in SES the weight gets smaller as it gets further from the point being predicted so the near equivalent between MA and SES may not be as close if the window span is wide.

Question 5: Forecasting Department Store Sales

a. Which of the following methods would not be suitable for forecasting this series. Explain why or why not for each one.

b. A forecaster was tasked to generate forecasts for 4 quarters ahead. She therefore partitioned the data so that the last 4 quarters were designated as the validation period. The forecaster approached the forecasting task by using multiplicative Holt-Winter’s exponential smoothing. Specifically, you should call the ets function with the parameters restrict=FALSE, model = “ZZZ” and use the smoothing constants of α=0.2, β=0.15, and γ=0.05.

i. Run this method on the data. Request the forecasts on the validation period.(Note that the forecasted values for the validation set will be different than what the book shows.)
StoreSales <- read.csv("/Users/wendyhayes/Desktop/MBA 678-Predictive Analytics/DeptStoreSales.csv", stringsAsFactors=FALSE)
str(StoreSales)
## 'data.frame':    24 obs. of  2 variables:
##  $ Quarter: int  1 2 3 4 5 6 7 8 9 10 ...
##  $ Sales  : int  50147 49325 57048 76781 48617 50898 58517 77691 50862 53028 ...
StoreSales
##    Quarter  Sales
## 1        1  50147
## 2        2  49325
## 3        3  57048
## 4        4  76781
## 5        5  48617
## 6        6  50898
## 7        7  58517
## 8        8  77691
## 9        9  50862
## 10      10  53028
## 11      11  58849
## 12      12  79660
## 13      13  51640
## 14      14  54119
## 15      15  65681
## 16      16  85175
## 17      17  56405
## 18      18  60031
## 19      19  71486
## 20      20  92183
## 21      21  60800
## 22      22  64900
## 23      23  76997
## 24      24 103337
StoreSalesTS <- ts(StoreSales$Sales, start=c(1,1), frequency=4)
yrange <- c(45000,110000)

plot(c(1, 8), yrange, type="n", xlab="Quarter",  ylab="Store Sales (in thousands)", bty="l", xaxt="n", yaxt="n")
title("Store Sales by Quarter")
lines(StoreSalesTS, bty="l")
axis(1, at=seq(1,8,1), labels=format(seq(1,8,1)))
axis(2, at=seq(45000,110000,5000), labels=format(seq(45,110,5)), las=2)

validLength <- 4
trainLength <- length(StoreSalesTS)-validLength
trainLength
## [1] 20
SalesTrain <- window(StoreSalesTS, end=c(1,trainLength))
SalesTrain
##    Qtr1  Qtr2  Qtr3  Qtr4
## 1 50147 49325 57048 76781
## 2 48617 50898 58517 77691
## 3 50862 53028 58849 79660
## 4 51640 54119 65681 85175
## 5 56405 60031 71486 92183
SalesValid <- window(StoreSalesTS, start=c(1,trainLength +1))


library(forecast)

sesSales <- ets(SalesTrain, model ="ZZZ", alpha = 0.2, beta = 0.15, gamma = 0.05, restrict = FALSE)
sesSales
## ETS(M,A,M) 
## 
## Call:
##  ets(y = SalesTrain, model = "ZZZ", alpha = 0.2, beta = 0.15,  
## 
##  Call:
##      gamma = 0.05, restrict = FALSE) 
## 
##   Smoothing parameters:
##     alpha = 0.2 
##     beta  = 0.15 
##     gamma = 0.05 
## 
##   Initial states:
##     l = 58050.3372 
##     b = -2.1628 
##     s=1.2982 0.9948 0.8645 0.8425
## 
##   sigma:  0.0232
## 
##      AIC     AICc      BIC 
## 361.8442 368.3058 367.8186
sesSalesPredictions <- forecast(sesSales, h=validLength, level=0)

sesSalesPredictions
##      Point Forecast      Lo 0      Hi 0
## 6 Q1       62115.16  62115.16  62115.16
## 6 Q2       65371.60  65371.60  65371.60
## 6 Q3       77076.87  77076.87  77076.87
## 6 Q4      102937.73 102937.73 102937.73
SalesValid
##     Qtr1   Qtr2   Qtr3   Qtr4
## 6  60800  64900  76997 103337
ii. Using the forecasts for the validation set that you came up with in i. above, compute the MAPE values for the forecasts of quarters 21-22.
accuracy(sesSalesPredictions,SalesValid)
##                     ME      RMSE       MAE        MPE      MAPE      MASE
## Training set  609.5633 1447.1798 1240.9947  0.9315256 1.9850921 0.3982014
## Test set     -366.8394  727.6403  566.4741 -0.6517749 0.8449629 0.1817661
##                     ACF1  Theil's U
## Training set 0.008503999         NA
## Test set     0.183050115 0.02380343

The MAPE for the test set is 0.844962, less than 1%, so it is good. This indicates that the model predicted the training set very well.

c. The fit and the residuals were displayed in the book. Please reproduce them with R code. Using all the information from (b) and your generated figures, is this model suitable for forecasting quarters 21 and 22?

plot(c(1, 8), yrange, type="n", xlab="Quarter",  ylab="Store Sales (in thousands)", bty="l", xaxt="n", yaxt="n")
title("Exponential Smoothing with Holt-Winters Model")
legend(7.00,110000,bty = "n", legend=c("Data","Training","Validation"), col=c(1,2,3),pch=16)
lines(StoreSalesTS, bty="l")
axis(1, at=seq(1,7,1), labels=format(seq(1,7,1)))
axis(2, at=seq(45000,110000,5000), labels=format(seq(45,110,5)), las=2)
lines(sesSales$fitted,col=2)
lines(sesSalesPredictions$mean,col=3)
str(sesSalesPredictions)
## List of 9
##  $ model    :List of 18
##   ..$ loglik    : num -175
##   ..$ aic       : num 362
##   ..$ bic       : num 368
##   ..$ aicc      : num 368
##   ..$ mse       : num 2094329
##   ..$ amse      : num 2710560
##   ..$ fit       :List of 4
##   .. ..$ value  : num 350
##   .. ..$ par    : num [1:5] 58050.337 -2.163 1.298 0.995 0.865
##   .. ..$ fail   : int 0
##   .. ..$ fncount: int 557
##   ..$ residuals : Time-Series [1:20] from 1 to 5.75: 0.0254 -0.0258 -0.0156 0.021 -0.0099 ...
##   ..$ fitted    : Time-Series [1:20] from 1 to 5.75: 48904 50629 57953 75201 49103 ...
##   ..$ states    : Time-Series [1:21, 1:6] from 0.75 to 5.75: 58050 58343 58261 58072 58172 ...
##   .. ..- attr(*, "dimnames")=List of 2
##   .. .. ..$ : NULL
##   .. .. ..$ : chr [1:6] "l" "b" "s1" "s2" ...
##   ..$ par       : Named num [1:8] 58050.337 -2.163 1.298 0.995 0.865 ...
##   .. ..- attr(*, "names")= chr [1:8] "l" "b" "s0" "s1" ...
##   ..$ m         : num 4
##   ..$ method    : chr "ETS(M,A,M)"
##   ..$ components: chr [1:4] "M" "A" "M" "FALSE"
##   ..$ call      : language ets(y = SalesTrain, model = "ZZZ", alpha = 0.2, beta = 0.15, gamma = 0.05,      restrict = FALSE)
##   ..$ initstate : Named num [1:6] 58050.337 -2.163 1.298 0.995 0.865 ...
##   .. ..- attr(*, "names")= chr [1:6] "l" "b" "s1" "s2" ...
##   ..$ sigma2    : num 0.000538
##   ..$ x         : Time-Series [1:20] from 1 to 5.75: 50147 49325 57048 76781 48617 50898 58517 77691 50862 53028 ...
##   ..- attr(*, "class")= chr "ets"
##  $ mean     : Time-Series [1:4] from 6 to 6.75: 62115 65372 77077 102938
##  $ level    : num 0
##  $ x        : Time-Series [1:20] from 1 to 5.75: 50147 49325 57048 76781 48617 50898 58517 77691 50862 53028 ...
##  $ upper    : Time-Series [1:4, 1] from 6 to 6.75: 62115 65372 77077 102938
##   ..- attr(*, "dimnames")=List of 2
##   .. ..$ : NULL
##   .. ..$ : chr "Series 1"
##  $ lower    : Time-Series [1:4, 1] from 6 to 6.75: 62115 65372 77077 102938
##   ..- attr(*, "dimnames")=List of 2
##   .. ..$ : NULL
##   .. ..$ : chr "Series 1"
##  $ fitted   : Time-Series [1:20] from 1 to 5.75: 48904 50629 57953 75201 49103 ...
##  $ method   : chr "ETS(M,A,M)"
##  $ residuals: Time-Series [1:20] from 1 to 5.75: 0.0254 -0.0258 -0.0156 0.021 -0.0099 ...
##  - attr(*, "class")= chr "forecast"
abline(v=6)
arrows(1, 110000, 6, 110000, code=3, length=0.1)
text(3.5, 105000, "Training")
abline(v=7)
arrows(6, 110000, 7, 110000, code=3, length=0.1)
text(6.5, 105000, "Validation")

modelvalues <- c(sesSales$fitted,sesSalesPredictions$mean)
residualvalues <- StoreSalesTS - modelvalues
residualvalues
##         Qtr1       Qtr2       Qtr3       Qtr4
## 1  1243.4716 -1303.9405  -905.2102  1579.9114
## 2  -486.0069   778.2687   546.8896  1542.2967
## 3   964.0418  1242.3939 -1771.9152   221.7495
## 4  -341.4924   527.7055  3521.0695  1441.8448
## 5   889.5862  1697.2893  2309.0611 -1505.7482
## 6 -1315.1565  -471.6041   -79.8663   399.2694
plot(residualvalues, ylab="residual values", bty="l")
title("Residuals")

The residuals are low, as shown by the plot and the MAPE, so it should be a good predctor.

d. Another analyst decided to take a much simpler approach, and instead of using exponential smoothing he used differencing. Use differencing to remove the trend and seasonal pattern. Which order works better: first removing trend and then seasonality or the the opposite order? Show a the progression of time plots as you difference the data and each final series to provide evidence in support of your answer.

# Set up the plot to have two rows and two columns
par(mfrow = c(2,2))
plot(StoreSalesTS, ylab="Sales", xlab="Quarter", bty="l", main="Store Sales per Quarter")

# Plot the lag-4 to deseasonalize the quarters and the lag-1 to de-trend
plot(diff(StoreSalesTS, lag=4), ylab="Lag-4", xlab="Quarter", bty="l", main="Lag-4 Difference")
plot(diff(StoreSalesTS, lag=1), ylab="Lag-1", xlab="Quarter", bty="l", main="Lag-1 Difference")
lag4ThenLag1 <- diff(diff(StoreSalesTS, lag=4), lag=1)
plot(lag4ThenLag1, ylab="Lag-4, then Lag-1", xlab="Quarter", bty="l", main="Twice-Differenced (Lag-4, Lag-1)")

# Plot the lag-4 to deseasonalize the quarters and the lag-1 to de-trend
par(mfrow = c(2,2))
plot(StoreSalesTS, ylab="Sales", xlab="Quarter", bty="l", main="Store Sales per Quarter")
plot(diff(StoreSalesTS, lag=1), ylab="Lag-1", xlab="Quarter", bty="l", main="Lag-1 Difference")
plot(diff(StoreSalesTS, lag=4), ylab="Lag-4", xlab="Quarter", bty="l", main="Lag-4 Difference")
lag1ThenLag4 <- diff(diff(StoreSalesTS, lag=1), lag=4)
plot(lag1ThenLag4, ylab="Lag-1, then Lag-4", xlab="Quarter", bty="l", main="Twice-Differenced (Lag-1, Lag-4)")

It doesn’t matter which order you do the differencing here.

e. Forecast quarters 21-22 using the average of the double-differenced series from (d). Remember to use only the training period (until quarter 20), and to adjust back for the trend and seasonal pattern.

# Plot the lag-4 to deseasonalize the quarters and the lag-1 to de-trend the training period 
par(mfrow = c(2,2))
plot(SalesTrain, ylab="Sales (Training Period)", xlab="Quarter", bty="l", main="Store Sales per Quarter (Training)")
plot(diff(SalesTrain, lag=1), ylab="Lag-1", xlab="Quarter", bty="l", main="Lag-1 Difference")
plot(diff(SalesTrain, lag=4), ylab="Lag-4", xlab="Quarter", bty="l", main="Lag-4 Difference")
lag1ThenLag4 <- diff(diff(SalesTrain, lag=1), lag=4)
plot(lag1ThenLag4, ylab="Lag-1, then Lag-4", xlab="Quarter", bty="l", main="Twice-Differenced (Lag-1, Lag-4)")

SalesAverage <- meanf(diff(diff(SalesTrain, lag=4), lag=1), h=4)
SalesAverage
##      Point Forecast     Lo 80    Hi 80     Lo 95    Hi 95
## 6 Q1          569.2 -2116.935 3255.335 -3714.114 4852.514
## 6 Q2          569.2 -2116.935 3255.335 -3714.114 4852.514
## 6 Q3          569.2 -2116.935 3255.335 -3714.114 4852.514
## 6 Q4          569.2 -2116.935 3255.335 -3714.114 4852.514
SalesForecasts <- vector()

for (i in 1:2) {
  if(i == 1) {
    SalesForecasts[i] <- SalesAverage$mean[i] + SalesTrain[(trainLength+i)-validLength] + (SalesTrain[trainLength] - SalesTrain[trainLength - validLength])
  } else {
    SalesForecasts[i] <- SalesAverage$mean[i] + SalesTrain[(trainLength+i)-validLength] + (SalesForecasts[i-1] - SalesTrain[trainLength+i-1-validLength])
  }
}

SalesForecasts
## [1] 63982.2 68177.4
par(mfrow = c(1,1))
plot(SalesForecasts, type="l", bty="l",col=2)

f. Compare the forecasts from (e) to the exponential smoothing forecasts found in (b). Which of the two forecasting methods would you choose? Explain.

accuracy(SalesForecasts,SalesValid[1:2])
##               ME     RMSE    MAE       MPE     MAPE
## Test set -3229.8 3230.151 3229.8 -5.141902 5.141902

The double difference model has a MAPE of a little over 5% whereas the Holt-Winters model had a MAPE of under 1%, indicating that the Holt-Winters was a better predictor. Therefore I would the Holt-Winters

g. What is an even simpler approach that should be compared as a baseline? Complete that comparison.

naiveSales <- snaive(SalesTrain, h=validLength)
naiveSales
##      Point Forecast    Lo 80    Hi 80    Lo 95    Hi 95
## 6 Q1          56405 51434.14 61375.86 48802.72 64007.28
## 6 Q2          60031 55060.14 65001.86 52428.72 67633.28
## 6 Q3          71486 66515.14 76456.86 63883.72 79088.28
## 6 Q4          92183 87212.14 97153.86 84580.72 99785.28
accuracy(naiveSales, SalesValid)
##                   ME     RMSE     MAE      MPE     MAPE     MASE
## Training set 2925.25 3878.784 3116.50 4.344358 4.737739 1.000000
## Test set     6482.25 7032.176 6482.25 8.170540 8.170540 2.079978
##                    ACF1 Theil's U
## Training set 0.54855812        NA
## Test set     0.01334402 0.4705349

The naive forecast has a MAPE of a bit over 8 percent, making it the worst predictor of the three. The data had a trend so this shouldn’t be a surprise that the naive forecast was not as good.

Question 8. Forecasting Australian Wine Sales: Figure 5.14 shows time plots of monthly sales of six types of Australian wines (red, rose, sweet white, dry white, sparkling, and fortified) for 1980-1994. Data available in AustralianWines.xls. The units are thousands of liters. You are hired to obtain short-term forecasts (2-3 months ahead) for each of the six series, and this task will be repeated every month.

(a) Which smoothing method would you choose if you had to choose the same method for forecasting all series? Why?

I would choose the Holt-Winters because it would find the trend and seasonality. It works when there’s no trend or seasonality as well. Looking at the plots, it’s not clear that all six types of wine have trends.

(b) Fortified wine has the largest market share of the six types of wine. You are asked to focus on fortified wine sales alone and produce as accurate a forecast as possible for the next two months.

• Start by partitioning the data using the period until Dec- 1993 as the training period.

WineSales <- read.csv("/Users/wendyhayes/Desktop/MBA 678-Predictive Analytics/AustralianWines.csv", stringsAsFactors=FALSE)
str(WineSales)
## 'data.frame':    188 obs. of  7 variables:
##  $ Month      : chr  "Jan-80" "Feb-80" "Mar-80" "Apr-80" ...
##  $ Fortified  : int  2585 3368 3210 3111 3756 4216 5225 4426 3932 3816 ...
##  $ Red        : int  464 675 703 887 1139 1077 1318 1260 1120 963 ...
##  $ Rose       : chr  "112" "118" "129" "99" ...
##  $ sparkling  : int  1686 1591 2304 1712 1471 1377 1966 2453 1984 2596 ...
##  $ Sweet.white: int  85 89 109 95 91 95 96 128 124 111 ...
##  $ Dry.white  : int  1954 2302 3054 2414 2226 2725 2589 3470 2400 3180 ...
FortifiedSales <- WineSales$Fortified
str(FortifiedSales)
##  int [1:188] 2585 3368 3210 3111 3756 4216 5225 4426 3932 3816 ...
FortifiedSalesTS <- ts(FortifiedSales, start=c(1980,1), end=c(1994,12), frequency=12)
plot(FortifiedSalesTS, ylab="Sales of Fortified Wines")
title("Sales of Fortified Australian Wines")

validLength <- 12
trainLength <- length(FortifiedSalesTS)-validLength
WineTrain <- window(FortifiedSalesTS, start=c(1980,1),end=c(1980,trainLength))
str(WineTrain)
##  Time-Series [1:168] from 1980 to 1994: 2585 3368 3210 3111 3756 4216 5225 4426 3932 3816 ...
WineValid <- window(FortifiedSalesTS, start=c(1980,trainLength +1))
str(WineValid)
##  Time-Series [1:12] from 1994 to 1995: 1154 1568 1965 2659 2354 2592 2714 2294 2416 2016 ...

• Apply Holt-Winter’s exponential smoothing (with multiplicative seasonality) to sales.

sesFortifiedSales <- ets(WineTrain, model ="ZZM", restrict = FALSE)
str(sesFortifiedSales)
## List of 18
##  $ loglik    : num -1361
##  $ aic       : num 2755
##  $ bic       : num 2808
##  $ aicc      : num 2759
##  $ mse       : num 82868
##  $ amse      : num 82723
##  $ fit       :List of 4
##   ..$ value  : num 2721
##   ..$ par    : num [1:16] 5.55e-02 9.13e-04 1.03e-04 4.04e+03 -6.80 ...
##   ..$ fail   : int 1
##   ..$ fncount: int 2001
##  $ residuals : Time-Series [1:168] from 1980 to 1994: 0.0857 0.1898 -0.0765 -0.1799 -0.1564 ...
##  $ fitted    : Time-Series [1:168] from 1980 to 1994: 2381 2831 3476 3793 4452 ...
##  $ states    : Time-Series [1:169, 1:14] from 1980 to 1994: 4040 4052 4089 4065 4019 ...
##   ..- attr(*, "dimnames")=List of 2
##   .. ..$ : NULL
##   .. ..$ : chr [1:14] "l" "b" "s1" "s2" ...
##  $ par       : Named num [1:16] 5.55e-02 9.13e-04 1.03e-04 4.04e+03 -6.80 ...
##   ..- attr(*, "names")= chr [1:16] "alpha" "beta" "gamma" "l" ...
##  $ m         : num 12
##  $ method    : chr "ETS(M,A,M)"
##  $ components: chr [1:4] "M" "A" "M" "FALSE"
##  $ call      : language ets(y = WineTrain, model = "ZZM", restrict = FALSE)
##  $ initstate : Named num [1:14] 4040.081 -6.798 1.132 1.04 0.888 ...
##   ..- attr(*, "names")= chr [1:14] "l" "b" "s1" "s2" ...
##  $ sigma2    : num 0.00737
##  $ x         : Time-Series [1:168] from 1980 to 1994: 2585 3368 3210 3111 3756 4216 5225 4426 3932 3816 ...
##  - attr(*, "class")= chr "ets"

(c) Create a plot for the residuals from the Holt-Winter’s exponential smoothing.

plot(sesFortifiedSales$residuals,type="l",ylab="residuals")
points(sesFortifiedSales$residuals,pch=16,col=c(1:12))
title("Holt-Winters Residuals (Fortified Australian Wine Sales)")

plot(sesFortifiedSales$residuals[seq(12,length(sesFortifiedSales$residuals),12)],type="l",ylab="residuals")
points(sesFortifiedSales$residuals[seq(12,length(sesFortifiedSales$residuals),12)],pch=16,col=c(1:12))
title("Holt-Winters Residuals (Fortified Australian Wine Sales)")

colors <- rainbow(12)
linetype <- c(1:12)
plotchar <- c(1:12)
yrange <- range(FortifiedSalesTS)
plot(FortifiedSalesTS,col=0)
for (i in 1:12) {
    currentDate <- subset(FortifiedSalesTS, cycle(FortifiedSalesTS)==i)
    lines(seq(1980, 1980+length(currentDate)-1,1), currentDate,  lwd=1,
          lty=linetype[i], col=colors[i], pch=plotchar[i],type="b")
}

accuracy(WineTrain,sesFortifiedSales$fitted)
##                ME     RMSE      MAE       MPE     MAPE       ACF1
## Test set 25.32466 287.8687 224.6507 0.5674702 7.096701 0.05168201
##          Theil's U
## Test set 0.4472481

i. Based on this plot, which of the following statements are reasonable?

• Decembers (month 12) are not captured well by the model.

This is true as the errors stray relatively far from zero during Decembers.

• There is a strong correlation between sales on the same calendar month.

Most of the months seem to have a good correlation if the trend were removed for each month.

• The model does not capture the seasonality well.

This is true, the model does not capture the seasonality well.The errors are varying by season.

• We should first deseasonalize the data and then apply Holt-Winter’s exponential smoothing. ####This is not reasonable. Holt-Winter captures seasonality and trend.