In a Moving Average method, a narrow window length (w) will emphasize local patterns, but may be subject to under-smoothing. If the goal were to create an equivalent result in a Simple Exponential Smoothing method, the forecast analyst would select a smoothing constant approaching 1, which would give a stronger weighting to periods in the more recent past and also emphasize local patterns.
Moving average of raw series: Based on the plot shown in the book, this approach would not be appropriate because the series demonstrates both trend and seasonality. This forecasting level might yield an appropriate level for the prediction period, but it would not model the seasonality or trend.
Moving average of deseasonalized series: One could use differencing to deseasonalize this series, but it would still exhibit a positive trend. For this reason, moving average is still not appropriate, as it will not include a trend component in the forecast created (a double-differenced approach could be successful, but not the proposed single-differenced method).
Simple exponential smoothing of the raw series: Simple exponential smoothing would not be suitable for forecasting this series because this method cannot handle seasonality or trend, similar to the moving average over the raw series.
Double exponential smoothing of the raw series: Double exponential smoothing would not be suitable over the raw series, because this technique adds a trend component, but not a seasonality component, to the simple exponential smoothing method. This technique (Holt’s Linear Trend Model) could work if the series were differenced for seasonality, but would not be appropriate over the raw series as indicated.
Holt-Winters exponential smoothing over the raw series: This approach would be most suitable for the series presented, since it contains both a trend and seasonality component.
#Read the csv
dept <- read.csv("DptStrSls.csv", stringsAsFactors = FALSE)
#Create the ts object
deptTS <- ts(dept$Sales, start=c(1), frequency=4)
#Raw plot to validate initial steps
plot(deptTS)
#Refining plot
yrange <- range(deptTS)
plot(c(1, 7), c(48000,114500), type="n", xlab="Year", ylab="Quarterly Department Store Sales", bty="l", xaxt="n", yaxt="n")
#Add the time series deptTS
lines(deptTS, bty="l")
# Add the x-axis
axis(1, at=seq(1,7,1), labels=format(seq(1,7,1)))
# Add the y-axis
axis(2, at=seq(48000, 114500, 9500), labels=format(seq(48000, 114500, 9500)), las=0)
abline(v=6)
arrows(1, 105000, 6, 105000, code=3, length=0.1)
text(3, 108000, "Training")
abline(v=7)
arrows(6, 105000, 7, 105000, code=3, length=0.1)
text(6.5, 108000, "Validation")
# Set the length of the validation period to 4 quarters
validLength <- 4
# Set the length of the training period to everything else
trainLength <- length(deptTS) - validLength
# Partition the data into training and validation periods
deptTrain <- window(deptTS, end=c(1, trainLength))
deptValid <- window(deptTS, start=c(1,trainLength+1), end=c(1,trainLength+validLength))
# Raw plots to check my work
plot(deptTrain)
plot(deptValid)
In addition to answering the question directly, I’ll create a plot to check for reasonability of the Holt-Winters forecast:
#Attempt the Holt-Winters Forecast
dept.hw <- ets(deptTrain, model = "ZZZ", alpha = 0.2, beta = 0.15, gamma = 0.05, restrict = FALSE)
dept.hwfcst <- forecast(dept.hw, h=validLength, level=0)
#cleaned up plot
plot(c(1, 7), c(48000,114500), type="n", xlab="Year", ylab="Quarterly Department Store Sales", bty="l", xaxt="n", yaxt="n")
# Add the x-axis
axis(1, at=seq(1,7,1), labels=format(seq(1,7,1)))
# Add the y-axis
axis(2, at=seq(48000, 114500, 9500), labels=format(seq(48000, 114500, 9500)), las=0)
abline(v=6)
arrows(1, 105000, 6, 105000, code=3, length=0.1)
text(3, 108000, "Training")
abline(v=7)
arrows(6, 105000, 7, 105000, code=3, length=0.1)
text(6.5, 108000, "Validation")
#Add the time series deptTS
lines(deptTS, bty="l")
#Add the fitted and forecasted
lines(dept.hwfcst$fitted, lwd = 2, col = "blue")
lines(dept.hwfcst$mean, lwd = 2, col = "blue", lty = 2)
dept.hwfcst
## Point Forecast Lo 0 Hi 0
## 6 Q1 62115.16 62115.16 62115.16
## 6 Q2 65371.60 65371.60 65371.60
## 6 Q3 77076.87 77076.87 77076.87
## 6 Q4 102937.73 102937.73 102937.73
Please see the point forecasts for Q1-4 as printed above.
Let’s call the accuracy function over the prediction period to see the MAPE:
#Ask prof about accuracy function
#accuracy(dept.hwfcst, deptValid[1:2])
#accuracy(dept.hwfcst, test = 1:2)
accuracy(dept.hwfcst$mean[1:2], deptValid[1:2])
## ME RMSE MAE MPE MAPE
## Test set -893.3803 987.939 893.3803 -1.444875 1.444875
Based on the results of the accuracy function, we see the MAPE for the test set (validation period) for quarters 21-22 was 1.44%.
To see the fit over the training period, we can repeat the plot created in 5b. To see the residuals, we can plot those too.
#Repeat Forecast Plot
#Cleaned up plot
plot(c(1, 7), c(48000,114500), type="n", xlab="Year", ylab="Quarterly Department Store Sales", bty="l", xaxt="n", yaxt="n")
# Add the x-axis
axis(1, at=seq(1,7,1), labels=format(seq(1,7,1)))
# Add the y-axis
axis(2, at=seq(48000, 114500, 9500), labels=format(seq(48000, 114500, 9500)), las=0)
abline(v=6)
arrows(1, 105000, 6, 105000, code=3, length=0.1)
text(3, 108000, "Training")
abline(v=7)
arrows(6, 105000, 7, 105000, code=3, length=0.1)
text(6.5, 108000, "Validation")
#Add the time series deptTS
lines(deptTS, bty="l")
#Add the fitted and forecasted
lines(dept.hwfcst$fitted, lwd = 2, col = "blue")
lines(dept.hwfcst$mean, lwd = 2, col = "blue", lty = 2)
#Too fancy residuals plot
#plot(c(1, 7), c(-4000,6000), type="n", xlab="Year", ylab="Holt-Winters Residuals", bty="l", xaxt="n", yaxt="n")
# Add the x-axis
#axis(1, at=seq(1,7,1), labels=format(seq(1,7,1)))
# Add the y-axis
#axis(2, at=seq(-4000, 6000, 1000), labels=format(seq(-4000, 6000, 1000)), las=0)
#abline(v=6)
#arrows(1, 5500, 6, 5500, code=3, length=0.1)
#text(3, 6000, "Training")
#abline(v=7)
#arrows(6, 5500, 7, 5500, code=3, length=0.1)
#text(6.5, 6000, "Validation")
#lines(dept.hwfcst$residuals, lwd = 2, col = "violet")
#Plot the residual
plot(dept.hwfcst$residuals)
The Holt-Winters approach makes sense for this time series. We’re seeing small residuals, and the plots look like a high-quality forecast fit. Importantly, the method fits the base series. Where we visually identified both trend and seasonality in the base series, we know Holt-Winters is an appropriate method to attempt for quarters 21 and 22.
Differencing: Where we will need to double-difference to remove both seasonality and trend, the order of operations won’t matter - we’d arrive at the same double-differenced series either way. Let’s demonstrate the method:
#create lag-1 series
difflag1.ts <- diff(deptTS, lag = 1)
difflag4.ts <- diff(deptTS, lag = 4)
difflag1_4.ts <-diff(diff(deptTS, lag = 4), lag = 1)
#Now plot the four time series'
# Set up the plot to have two rows and two columns
par(mfrow = c(3,2))
plot(deptTS, ylab="Sales", xlab="Year", main="Original Series")
plot (difflag1.ts, ylab="Lag 1", xlab="Year", main="Lag 1 to remove trend")
plot (difflag4.ts, ylab="Lag 4", xlab="Year", main="Lag 4 to remove quarterly seasonality")
plot (difflag1_4.ts, ylab="Lag 1 and 4", xlab="Year", main="Double Differenced Series")
#Plot the double difference in the opposite order, just to prove there's no difference:
plot((diff(diff(deptTS, lag = 1), lag = 4)), ylab="Lag 1 and 4", xlab="Year", main="Double Differenced Series opposite order")
#Set up training and validation periods on double diff
DDIFTrain <- window(difflag1_4.ts, end=c(1, trainLength))
DDIFValid <- window(difflag1_4.ts, start=c(1,trainLength+1), end=c(1,trainLength+validLength))
#plot to check my work
par(mfrow = c(1,2))
plot(DDIFTrain)
plot(DDIFValid)
par(mfrow = c(1,1))
#Create Double Differenced point forecasts based on the training period, for two periods.
pointForecasts <- meanf(DDIFTrain, h=4)
pointForecasts
## Point Forecast Lo 80 Hi 80 Lo 95 Hi 95
## 6 Q1 569.2 -2116.935 3255.335 -3714.114 4852.514
## 6 Q2 569.2 -2116.935 3255.335 -3714.114 4852.514
## 6 Q3 569.2 -2116.935 3255.335 -3714.114 4852.514
## 6 Q4 569.2 -2116.935 3255.335 -3714.114 4852.514
#Convert this back: de-difference it
realForecasts <- vector()
for (i in 1:validLength) {
if(i == 1) {
realForecasts[i] <- pointForecasts$mean[i] + deptTrain[(trainLength+i)-validLength] + (deptTrain[trainLength] - deptTrain[trainLength - validLength])
} else {
realForecasts[i] <- pointForecasts$mean[i] + deptTrain[(trainLength+i)-validLength] + (realForecasts[i-1] - deptTrain[trainLength+i-1-validLength])
}
}
#print real forecast
realForecasts
## [1] 63982.2 68177.4 80201.6 101467.8
plot(realForecasts, type = 'l', bty = "l")
#Let's plot our double-differenced forecast against the actual time series:
plot(c(1, 7), c(48000,114500), type="n", xlab="Year", ylab="Quarterly Department Store Sales", bty="l", xaxt="n", yaxt="n")
# Add the x-axis
axis(1, at=seq(1,7,1), labels=format(seq(1,7,1)))
# Add the y-axis
axis(2, at=seq(48000, 114500, 9500), labels=format(seq(48000, 114500, 9500)), las=0)
abline(v=6)
arrows(1, 105000, 6, 105000, code=3, length=0.1)
text(3, 108000, "Training")
abline(v=7)
arrows(6, 105000, 7, 105000, code=3, length=0.1)
text(6.5, 108000, "Validation")
#Add the time series deptTS
lines(deptTS, bty="l")
lines(seq(6, 6.75, .25), realForecasts, col="blue", lwd=2, lty=2)
realForecasts[1:2]
## [1] 63982.2 68177.4
The plot above includes the forecasts we produced by double-differencing for all four quarters, but we did print the values for just quarters 21 and 22.
In comparing our results from the Holt Winters technique and the double-differencing technique, I would select Holt Winters for two reasons. First, Holt-Winters was simpler to execute, whereas the double-differencing was essentially a conversion process to enable us to apply a simpler static-series technique to a seasonal-trending series. Second, Holt-Winters seemed to produce forecasts with very strong predictive power in this case. In other words, it was both simpler and better.
As best practice, we should always compare against naive or seasonal naive. Let’s compute the seasonal naive values by taking the last four values of the training period.
deptSnaive <- deptTrain[17:20]
#Let's plot our seasonal naive forecast against the actual time series:
plot(c(1, 7), c(48000,114500), type="n", xlab="Year", ylab="Quarterly Department Store Sales", bty="l", xaxt="n", yaxt="n")
# Add the x-axis
axis(1, at=seq(1,7,1), labels=format(seq(1,7,1)))
# Add the y-axis
axis(2, at=seq(48000, 114500, 9500), labels=format(seq(48000, 114500, 9500)), las=0)
abline(v=6)
arrows(1, 105000, 6, 105000, code=3, length=0.1)
text(3, 108000, "Training")
abline(v=7)
arrows(6, 105000, 7, 105000, code=3, length=0.1)
text(6.5, 108000, "Validation")
#Add the time series deptTS
lines(deptTS, bty="l")
#Add the seasonal naive forecast
lines(seq(6, 6.75, .25), deptSnaive, col="red", lwd=2, lty=2)
As expected, the seasonal naive forecast offers a decent baseline. It’s not as strongly predictive as the Holt-Winters approach, but much simpler to compute. Depending on what we’re doing (aka, domain knowledge), seasonal naive may be “good enough” for the task at hand.
If I were choosing a single smoothing method for all the series, I would select Holt-Winters, because at least on of the series exhibits both seasonality and trend (see Fortified Wine Sales).
To begin, we’ll need to read-in the data, establish training and validation periods, and apply our selected forecast method.
wine <- read.csv("AussieWine.csv", header = TRUE, stringsAsFactors = FALSE)
wineTS <-ts(wine$Fortified, start=c(1), frequency=12)
#raw plot to check
#plot(wineTS)
validLength <- 12
# Set the length of the training period to everything else
trainLength <- length(wineTS) - validLength
# Partition the data into training and validation periods
wineTrain <- window(wineTS, end=c(1, trainLength))
wineValid <- window(wineTS, start=c(1,trainLength+1), end=c(1,trainLength+validLength))
#raw plots to check window
#plot(wineTrain)
#plot(wineValid)
#Create model and forecast
wine.hw <- ets(wineTrain, model = "ZZM", alpha = 0.2, beta = 0.15, gamma = 0.15, restrict = FALSE, allow.multiplicative.trend = TRUE)
wine.hwfcst <- forecast(wine.hw, h=validLength, level=0)
plot(wine.hwfcst)
accuracy(wine.hwfcst$mean[1:2], wineValid[1:2])
## ME RMSE MAE MPE MAPE
## Test set -50.03725 101.2594 88.03257 -4.770643 7.193813
wine.hwfcst
## Point Forecast Lo 0 Hi 0
## Jan 15 1292.070 1292.070 1292.070
## Feb 15 1530.005 1530.005 1530.005
## Mar 15 1948.904 1948.904 1948.904
## Apr 15 2117.792 2117.792 2117.792
## May 15 2504.127 2504.127 2504.127
## Jun 15 2574.052 2574.052 2574.052
## Jul 15 3086.361 3086.361 3086.361
## Aug 15 2737.357 2737.357 2737.357
## Sep 15 2136.095 2136.095 2136.095
## Oct 15 1984.499 1984.499 1984.499
## Nov 15 2404.469 2404.469 2404.469
## Dec 15 2617.747 2617.747 2617.747
Printed above, we can see the Holt-Winters point forecasts generated by this method, along with its accuracy for the first two periods.
#wine.hwfcst$residuals
plot(wine.hwfcst$residuals)
The plot above shows the residuals over the training period for our Holt-Winters model. Regarding the four statements:
December (month 12) is not captured well by the model. This may be accurate, since we notice the largest residuals appear to have seasonality, occurring near month 12 throughout the training period.
There is a strong correlation between sales on the same calendar month. This statement is another way of saying, “there’s a seasonal pattern in the time series”. This statement would be better evaluated by looking at the original time series, rather than the Holt-Winters residuals.
We should first deaseasonalize the data… This assertion may be dismissed, as the selected method (Holt-Winters) is appropriate for a time series exhibiting seasonality.
If we believe there’s some truth to statements 1 and 3 from above (and our MAPE of 7.2% for the next two months suggests some room for improvement), the best idea would be to focus on seasonality adjustments within the model. One could experiment with different beta smoothing constants, or with additive rather than multiplicative seasonality. If we believe there may be multiple seasonalities within the series, we could explore even more advanced techniques, like double-seasonal Holt Winters Method (the dshw function).