2. Assume that we apply a moving average to a series, using a very short window span. If we wanted to achieve an equivalent result using simple exponential smoothing, what value should the smoothing constant take?

The smoothing constant in simple exponential smoothing must fall between 0 and 1. Values on the lower end indicate “slow learning,” where past values have a large bearing on forecasts. Values on the higher end indicate “fast learning,” where recent values are weighted heavier. Moving averages consist of past values averaged over consecutive periods. Therefore, it’s slow-learning and the constant would take a value of 0.

5a. Which of the following methods would not be suitable for forecasting this series? Explain why or why not for each one.

Moving average of raw series. Not suitable: Moving averages can generally only be used for series without trend and seasonality. This series has both.

Moving average of deseasonalized series. Not suitable for the same reason, although it could be used if the series was both deseasonalized and de-trended.

Simple exponential smoothing of the raw series. Not suitable: Like moving averages, this should only be used on series without trend and seasonality.

Double exponential smoothing of the raw series. Not suitable: Double exponential smoothing works on series with trend, but not seasonality.

Holt-Winter’s exponential smoothing of the raw series. Suitable, because it works on series with trend and seasonality.

5b. A forecaster was tasked to generate forecasts for 4 quarters ahead. She therefore partitioned the data so that the last 4 quarters were designated as the validation period. The forecaster approached the forecasting task by using multiplicative Holt-Winter’s exponential smoothing.

i. Run this method on the data. Request the forecasts on the validation period. (Note that the forecasted values for the validation set will be different than what the book shows.)
storeSales <- read.csv("DeptStoreSales.csv")
storeSalesTS <- ts(storeSales$Sales, freq = 4)
validLength1 <- 4
trainLength1 <- length(storeSalesTS) - validLength1
salesTrain <- window(storeSalesTS, end = c(1, trainLength1))
salesValid <- window(storeSalesTS, start = c(1, trainLength1+1))

#Run Holt-Winter's with this instruction: You should call the ets function with the parameters restrict=FALSE, model = "ZZZ" and use the smoothing constants of α=0.2, β=0.15, and γ=0.05.)
hwSales <- ets(storeSalesTS, restrict = FALSE, model = "ZZZ", alpha = 0.2, beta = 0.15, gamma = 0.05)
hwSales.pred <- forecast(hwSales, h = validLength1, level = 0)
hwSales.pred
##      Point Forecast      Lo 0      Hi 0
## 7 Q1       67462.98  67452.56  67452.56
## 7 Q2       71124.81  71126.31  71126.31
## 7 Q3       83982.60  83966.24  83966.24
## 7 Q4      112160.26 112141.17 112141.17
ii. Using the forecasts for the validation set that you came up with in i. above, compute the MAPE values for the forecasts of quarters 21-22.
accuracy(hwSales.pred, test = 1:2)
##                    ME     RMSE      MAE       MPE     MAPE MASE ACF1
## Training set 111.8239 1463.055 1458.775 0.2006111 2.931379  NaN -0.5

5c. The fit and the residuals were displayed in the book. Please reproduce them with R code. Using all the information from (b) and your generated figures, is this model suitable for forecasting quarters 21 and 22?

#For clearer presentation, I plotted the fit by year and on a scale starting at $50,000
yrange = range(storeSalesTS - validLength1)
plot(c(1, 5.55), yrange, type = "n", xlab = "Year", ylab = "Sales (tens of 000s, $)", bty = "l", xaxt = "n", yaxt = "n")
#Added lines for original series and axes
lines(salesTrain, bty = "l")
axis(1, at = seq(1, 6, 1), labels = format(seq(1, 6, 1)))
axis(2, at = seq(50000, 100000, by = 10000), labels=format(seq(5, 10, 1)), las = 2)
#Added forecast
lines(hwSales$fitted, col = "orange", lwd = 2)
#Added title, legend
legend(1, 95000, c("Forecast", "Actual"), lty = c(1, 1), col = c("orange", " black"), bty = "y")
title("Exp Smoothing: Actual Vs Forecast (Training Data)")

#Made residual range
yrange = range(hwSales.pred$residuals)
plot(c(1, 5.55), yrange, type = "n", xlab = "Years", ylab = "Forecast error", bty = "l", xaxt = "n", yaxt = "n")
#Added axes, lines
axis(1, at = seq(1, 6, 1), labels=format(seq(1, 6, 1)))
axis(2, at = seq(-2000, 3000, by = 1000), labels = format(seq(-2000, 3000, by = 1000)), las = 1)
#Added residual lines and title
lines(hwSales.pred$residuals, lwd = 2)
title("Exp Smoothing: Forecast Errors (Training Data)")

While it can be be further refined, this model is suitable for forecasting the next 2 quarters. The measured MAPE of 2.93 percent on the entire series is strong. The plot illustrates a tight fit along most of the curve.

5d. Another analyst decided to take a much simpler approach, and instead of using exponential smoothing he used differencing. Use differencing to remove the trend and seasonal pattern. Which order works better: first removing trend and then seasonality or the the opposite order? Show a the progression of time plots as you difference the data and each final series to provide evidence in support of your answer.

#Made 2x2 plot 
par(mfrow = c(2, 2))

#Plotted original time series
plot(storeSalesTS, ylab = "ARPM", xlab = "Year", bty = "l", main = "Sales")

#Plotted lag-1 difference (de-trended)
plot(diff(storeSalesTS, lag = 1), ylab = "Lag-1", xlab = "Year", bty = "l", main = "Lag-1 difference")

#Plotted lag-4 difference (deseasonalized), since data is quarterly.
plot(diff(storeSalesTS, lag = 4), ylab = "Lag-4", xlab = "Year", bty = "l", main = "Lag-4 difference")

#Plotted double-differenced set
lag1ThenLag4 <- diff(diff(storeSalesTS, lag = 1), lag = 4)
plot(lag1ThenLag4, ylab = "Lag-1, then Lag-4", xlab = "Year", bty = "l", main = "Twice-differenced (lag-1, lag-4)")

#Do everything the same, except flip Lag-1 and Lag-4 
par(mfrow = c(2, 2))

#Plotted original time series
plot(storeSalesTS, ylab = "ARPM", xlab = "Year", bty = "l", main = "Sales")

#Plotted lag-4 difference (deseasonalized), since data is quarterly
plot(diff(storeSalesTS, lag = 4), ylab = "Lag-4", xlab = "Year", bty = "l", main = "Lag-4 difference")

#Plotted lag-1 difference (de-trended)
plot(diff(storeSalesTS, lag = 1), ylab = "Lag-1", xlab = "Year", bty = "l", main = "Lag-1 difference")

#Plotted double-differenced set
lag1ThenLag4 <- diff(diff(storeSalesTS, lag = 1), lag = 4)
plot(lag1ThenLag4, ylab = "Lag-1, then Lag-4", xlab = "Year", bty = "l", main = "Twice-differenced (lag-1, lag-4)")

Nothing changes when Lag-4 is done ahead of Lag-1. This is a far simpler approach than Holt-Winter’s smoothing, but it doesn’t matter in which order the series is de-trended or deseasonalized.

5e. Forecast quarters 21-22 using the average of the double-differenced series from (d). Remember to use only the training period (until quarter 20), and to adjust back for the trend and seasonal pattern.

#Set up double-differenced forecasts 
ddForecasts <- meanf(diff(diff(salesTrain, lag = 4), lag = 1), h = 2)
realForecasts <- vector()

#Used if/then statement from lesson
for (i in 1:validLength1) {
  if(i == 1) {realForecasts[i] <- ddForecasts$mean[i] + salesTrain[(trainLength1+i)-validLength1] + (salesTrain[trainLength1] - salesTrain[trainLength1 - validLength1])
  } else {realForecasts[i] <- ddForecasts$mean[i] + salesTrain[(trainLength1+i)-validLength1] + (realForecasts[i-1] - salesTrain[trainLength1+i-1-validLength1])}}

realForecasts
## [1] 63982.2 68177.4      NA      NA

5f. Compare the forecasts from (e) to the exponential smoothing forecasts found in (b). Which of the two forecasting methods would you choose? Explain.

The double-differenced method is faster, easier and more accurate. It over-predicted the actual figures by around 3,200, while the Holt-Winter’s forecasts were over by more than 6,000. I would double-difference it.

5g. What is an even simpler approach that should be compared as a baseline? Complete that comparison.

Since we’re looking for just a baseline to the Holt-Winter’s forecast, a trailing moving average would work well.

#Set up trailing average
trailingMA <- rollmean(salesTrain, k = 4)

#Make range and plot
yrange = range(storeSalesTS)
plot(c(1, 5.55), yrange, type= "n", xlab = "Year", ylab ="Sales (tens of 000s, $)", bty = "l", xaxt = "n", yaxt = "n")

#Added original series
lines(storeSalesTS, bty = "l")

#Added axes
axis(1, at = seq(1, 6, 1), labels = format(seq(1, 6, 1)))
axis(2, at = seq(50000, 100000, by = 10000), labels = format(seq(5, 10, 1)), las = 2)

#Added Holt Winter's forecast
lines(hwSales$fitted, col = "orange", lwd = 2)

#Added trailing moving averages
lines(trailingMA, col = "blue", lwd = 2)

#Added title, legend
legend(1,95000, c("Actual sales", "Holt-Winter's forecast", "Trailing moving average"), lty = c(1, 1, 2), col = c("black", "orange", "blue"), bty = "y")
title("Department store sales: Holt-Winter's and trailing moving average")

You wouldn’t use the trailing moving average to forecast in this situation because it can’t capture trend or seasonality. However, it’s a good sanity check on the data because it reflects the general direction of the dataset — which is steadily positive in this case.

8a. Which smoothing method would you choose if you had to choose the same method for forecasting all series? Why?

A quick look at the data shows both seasonality and trend, meaning Holt-Winter’s exponential smoothing would be best. It doesn’t require further smoothing.

8b(i). You are asked to focus on fortified wine sales alone and produce as accurate a forecast as possible for the next 2 months. Start by partitioning the data using the period until Dec-1993 as the training period.

#Import, partition
wine <- read.csv("AustralianWines.csv")
fWineTS1 <- ts(wine$Fortified, start = c(1980,1), freq=12)
fwineTS <- na.omit(fWineTS1)
validLength2 <- 12
trainLength2 <- length(fwineTS) - validLength2
wineTrain <- window(fwineTS, start = c(1980,1), end= c(1980, trainLength2))
wineValid <- window(fwineTS, start = c(1980, trainLength2+1), end = c(1980, trainLength2+validLength2))
ii. Apply Holt-Winter’s exponential smoothing (with multiplicative seasonality) to sales.
#Get forecasts for 1994 using Holt-Winter's model "ZNM," indicating multiplicative seasonality with default smoothing constant
hwWine <- ets(wineTrain, restrict = FALSE, model = "ZNM", alpha = 0.2)
hWine.pred <- forecast(hwWine, h = validLength2, level = 0)
hWine.pred
##          Point Forecast     Lo 0     Hi 0
## Jan 1994       1309.524 1309.524 1309.524
## Feb 1994       1548.440 1548.440 1548.440
## Mar 1994       1918.531 1918.531 1918.531
## Apr 1994       2118.575 2118.575 2118.575
## May 1994       2544.822 2544.822 2544.822
## Jun 1994       2583.974 2583.974 2583.974
## Jul 1994       3108.452 3108.452 3108.452
## Aug 1994       2829.059 2829.059 2829.059
## Sep 1994       2142.791 2142.791 2142.791
## Oct 1994       1984.830 1984.830 1984.830
## Nov 1994       2355.221 2355.221 2355.221
## Dec 1994       2587.823 2587.823 2587.823

8c. Create a plot for the residuals from the Holt-Winter’s exponential smoothing.

residuals <- hWine.pred$residuals
plot(residuals, type = "n", xlab = "Year", ylab = "Forecast error (%)", bty = "l", xaxt = "n", yaxt = "n")
axis(1, at = seq(1980, 1994, 1), labels = format(seq(1980, 1994, 1)))
axis(2, at = seq(-0.2, 0.2, 0.1), labels = format(seq(-0.2, 0.2, 0.1)), las = 1)
lines(residuals)
title("Forecast errors on training set")

i. Based on this plot, which of the following statements are reasonable?

Decembers are not captured well by the model. Reasonable: The model often underpredicts December numbers by some of the biggest margins.

There is a strong correlation between sales on the same calendar month. Reasonable: The peaks and valleys of this plot for certain months roughly correspond to each other’s numbers for past years.

The model does not capture the seasonality well. Unreasonable: While it’s not perfect, seasonality is captured here by the Holt-Winter’s model.

We should first deseasonalize the data and then apply Holt-Winter’s exponential smoothing. Unreasonable: That’s not necessary because the model is equipped to handle seasonality.

ii. How can you handle the above effect with exponential smoothing?

This version of Holt-Winter’s exponential smoothing gets us in a reasonable range for forecasting this series, but it could be improved. The problem with the December numbers illuminated in question 8c(i) indicates that this series could have more than one seasonal pattern, which can be handled by using the double-seasonal Holt-Winter’s exponential smoothing method.