2. Relationship between Moving Average and Exponential Smoothing: Assume that we apply a moving average to a series, using a very short window span. If we wanted to achieve an equivalent result using simple exponential smoothing, what value should the smoothing constant take?

If a moving average is applied to a series with a short window span, the forecast is only going to show local, and not global, trends. In order to get the two smoothers to be equal, you can use the equation w = 2/?? - 1. If alpha is set to .3, then the two smoothers would be equal if the moving average window was equal to 5.6.

5. Forecasting Department Store Sales

  1. Which of the following methods would not be suitable for forecasting this series. Explain why or why not for each one.
    -Moving average of raw series: Not suitable, because like simple exponential smoothing, it shouldn’t be used on series that exhibit seasonality or trend.
    -Moving average of deseasonalized series: Suitable because the moving averages “are powerful for visualizing trends because the averaging operation can suppress seasonality and noise, thereby making the trend more visible” (Shmueli, 80).
    -Simple exponential smoothing of the raw series: Not suitable, due to the fact that the plot clearly shows trend and seasonality, and therefore shouldn’t use simple exponential smoothing.
    -Double exponential smoothing of the raw series: Suitable because it works similar to simple exponential smoothing, but is more appropriate for series like this one that show seasonality.

b. A forecaster was tasked to generate forecasts for 4 quarters ahead. She therefore partitioned the data so that the last 4 quarters were designated as the validation period. The forecaster approached the forecasting task by using multiplicative Holt-Winter’s exponential smoothing. Specifically, you should call the hw function with the parameter seasonal=“multiplicative”. Let the method pick the appropriate parameters for ??, ??, and ??

#Creating time series:
deptsalesTS <- ts(deptsales$Sales, start=c(1,1), frequency =4)

#y-range
yrange = range(deptsalesTS)


#Set up plot
plot(c(1,7), yrange, type="n", xlab="Quarter",  ylab="Sales", bty="l", xaxt="n", yaxt="n")

#Add dept sales time series
lines(deptsalesTS, bty="l")

#Add x axis
axis(1, at=seq(1,7,1), labels=format(seq(1,7,1)))

#Add y axis
axis(2, at=seq(20000,120000,20000), labels=format(seq(20000,120000,20000), las= 2))

Make last 4 quarters the validation period

#Partition data so last 4 quarters are validation period
validlength = 4

#set training length 
trainlength= length(deptsalesTS) - validlength 

#set windows for valid and training
deptsalestrain= window(deptsalesTS, start=c(1,1), end=c(1, trainlength))
                       
deptsalesvalid = window(deptsalesTS, start=c(1,trainlength+1))
#re-plot with training and validation lines
plot(c(1,7), yrange, type="n", xlab="Quarter",  ylab="Sales", bty="l", xaxt="n", yaxt="n")

#Add x axis
axis(1, at=seq(1,7,1), labels=format(seq(1,7,1)))

#Add y axis
axis(2, at=seq(0,120000,20000), labels=format(seq(0,120000,20000), las= 2))

lines(deptsalesTS, lwd=2)

#add validation start line
abline(v=6)
text(6, 100000, "Validation")

i. Run Holt-Winter method on the data

#Holt-Winter method
hw_mult= hw(deptsalestrain, seasonal= "multiplicative", h= 4)
autoplot(deptsalestrain) + ylab("Sales") + xlab("Quarter") + autolayer(deptsalesvalid) + autolayer(hw_mult, PI=FALSE, series= "multiplicative")

summary(hw_mult)
## 
## Forecast method: Holt-Winters' multiplicative method
## 
## Model Information:
## Holt-Winters' multiplicative method 
## 
## Call:
##  hw(y = deptsalestrain, h = 4, seasonal = "multiplicative") 
## 
##   Smoothing parameters:
##     alpha = 0.4032 
##     beta  = 0.1429 
##     gamma = 0.4549 
## 
##   Initial states:
##     l = 57401.8119 
##     b = 605.4045 
##     s=1.3012 0.9795 0.8614 0.8579
## 
##   sigma:  0.0258
## 
##      AIC     AICc      BIC 
## 372.3936 390.3936 381.3552 
## 
## Error measures:
##                   ME     RMSE      MAE       MPE     MAPE      MASE
## Training set 246.631 1499.632 977.6487 0.3530739 1.665686 0.3137009
##                     ACF1
## Training set -0.07882461
## 
## Forecasts:
##      Point Forecast    Lo 80     Hi 80    Lo 95     Hi 95
## 6 Q1       61334.90 59303.21  63366.58 58227.70  64442.09
## 6 Q2       64971.30 62529.36  67413.25 61236.67  68705.94
## 6 Q3       76718.11 73376.84  80059.37 71608.08  81828.13
## 6 Q4       99420.55 94372.29 104468.81 91699.90 107141.20

ii. MAPE for Q21 and Q22

accuracy(hw_mult, deptsalesvalid, test = 1)
##                 ME     RMSE      MAE        MPE      MAPE      MASE ACF1
## Test set -534.8988 534.8988 534.8988 -0.8797678 0.8797678 0.1716345   NA
##          Theil's U
## Test set       NaN
accuracy(hw_mult, deptsalesvalid, test = 2)
##               ME   RMSE    MAE        MPE      MAPE       MASE ACF1
## Test set -71.303 71.303 71.303 -0.1098659 0.1098659 0.02287919   NA
##           Theil's U
## Test set 0.01739097

c. The fit and the residuals were displayed in the book. Please reproduce them with R code. Using all the information from (b) and your generated figures, is this model suitable for forecasting quarters 21 and 22?

autoplot(deptsalestrain, main = "Exp Smoothing: Actual Vs Forecast", series="Actual", ylim = c(0,100000)) + xlab("Year") + ylab("Sales") + autolayer(hw_mult$fitted, series="Forecast")

autoplot(hw_mult$residuals, main = "Exp Smoothing Forecast Errors", xlab = "Year", ylab = "Forecast Error")

Based on the small MAPE, I’d say that the model is suitable.

d. Another analyst decided to take a much simpler approach, and instead of using exponential smoothing he used differencing. Use differencing to remove the trend and seasonal pattern. Which order works better: first removing trend and then seasonality or the opposite order? Show the progression of time plots as you difference the data and each final series to provide evidence in support of your answer.

There is no difference whether you remove trend or seasonality first.

#set up plot
par(mfrow = c(2,2))

#plot time series
plot(deptsalesTS, ylab="Sales", xlab="Year", bty="l", main="Dept Sales")

#plot the lag-12 difference 
plot(diff(deptsalesTS, lag=12), ylab="Lag-12", xlab="Year", bty="l", main="Lag-12 Difference")

#plot the lag-1 difference 
plot(diff(deptsalesTS, lag=1), ylab="Lag-1", xlab="Year", bty="l", main="Lag-1 Difference")

#plot the double-differenced
lag12ThenLag1 <- diff(diff(deptsalesTS, lag=12), lag=1)
plot(lag12ThenLag1, ylab="Lag-12, then Lag-1", xlab="Year", bty="l", main="Twice-Differenced (Lag-12, Lag-1)")

e. Forecast quarters 21-22 using the average of the double-differenced series from (d). Remember to use only the training period (until quarter 20), and to adjust back for the trend and seasonal pattern.

pointforecasts <- meanf(diff(diff(deptsalestrain, lag=4), lag=1), h=2)

pointforecasts
##      Point Forecast     Lo 80    Hi 80     Lo 95    Hi 95
## 6 Q1          569.2 -2116.935 3255.335 -3714.114 4852.514
## 6 Q2          569.2 -2116.935 3255.335 -3714.114 4852.514
# Set up an empty vector
realForecasts <- vector()

for (i in 1:validlength) {
  if(i == 1) {
    realForecasts[i] <- pointforecasts$mean[i] + deptsalestrain[(trainlength+i)-validlength] + (deptsalestrain[trainlength] - deptsalestrain[trainlength - validlength])
  } else {
    realForecasts[i] <- pointforecasts$mean[i] + deptsalestrain[(trainlength+i)-validlength] + (realForecasts[i-1] - deptsalestrain[trainlength+i-1-validlength])
  }
}

# See what they look like
realForecasts
## [1] 63982.2 68177.4      NA      NA

f. Compare the forecasts from (e) to the exponential smoothing forecasts found in (b). Which of the two forecasting methods would you choose? Explain.

accuracy(realForecasts, deptsalesvalid)
##               ME     RMSE    MAE       MPE     MAPE ACF1 Theil's U
## Test set -3229.8 3230.151 3229.8 -5.141902 5.141902 -0.5   0.13634

I would choose the Holt-Winter’s method due to the lower MAPE score.

g. What is an even simpler approach that should be compared as a baseline?

An even simpler approach would be a naive approach, or in this case because there’s clealry seasonality, a seasonal naive approach would work.

snaivevalid <- snaive(deptsalestrain, h = validlength)
plot(c(1,7), yrange, type = "n", xlab = "Year", ylab = "Sales", bty = "l", xaxt = "n", yaxt = "n", lwd = 2, main = "Sales - Seasonal Naive Forecast")

axis(1,at=seq(1,7,1), labels = format(seq(1,7,1)))
axis(2, at=seq(0,120000,20000), labels=format(seq(0,120000,20000), las= 2))
lines(deptsalesTS, bty = "l")
lines(snaivevalid$mean, lwd = 2, lty = 2, col = "blue")