Due Monday, February 26th, 2018 at 11:59PM: Problems 2, 5 and 8 from Chapter 5 of Shmueli
Assume that we apply a moving average to a series, using a very short window span.
To mimic an equivalent result from the moving average but using simple exponential smoothing, the smoothing constant would require a higher value — closer to 0.80. The reason is that in applying a moving average to a series with an incredibly short window span, the resulting forecast will show local trends. The greater alpha value means that more weight is placed on the most recent data, rather than a lower alpha value which would weight all data points more equally.
The file DeptStoreSales.csv contains data on the quarterly sales for a department store over a 6-year period.
- Moving average of a raw series
The moving average would not be suitable for the raw series as it is clear from the time plot provided in the textbook that there is seasonality, as well as what appears to be a slight trend with the data. While it may be implemented, the results from the moving average would prove to be unreliable as it is best suited for data with no trend or seasonality.
- Moving average of deseasonalized series
Despite deseasonalizing the series, the moving average would still return an unreliable forecast as it must contend with the trend that it is evident. Therefore, unless the data series was also detrended, the moving average of a deseasonalized series would not be suitable for forecasting this series.
- Simple exponential smoothing of the raw series
This method is not suited for forecasting this series. To use simple exponential smoothing to the raw series, one would have to deseasonalize and detrend the data first as this method is better suited for data with no seasonality and no trend.
- Double exponential smoothing of the raw series
The Holt, or double exponential smoothing, method would not be suited for this data. While it may accommodate the trend that exists within the data, it can not account for the seasonality. So, in order to utilize the Holt method, the data would need to be deseasonalized beforehand.
- Holt-Winters’ exponential smoothing of the raw series
Because the Holt-Winters’, or triple exponential smoothing, method is able to handle both trend and seasaonlity it is therefore best suited for forecasting the department store sales data.
i. Run this method on the data. Request the forecasts on the validation period. (Note that the forecasted values for the validation set will be different than what the book shows.)
First lets take a look at the plotted data.
#Upload of the DeptStoreSales.csv file that was then partitioned for Training and Validation Periods
dept_sales <- read.csv("DeptStoreSales.csv")
#Divide Sales by 1,000 to increase autoplot legibility
dept_sales <- mutate(dept_sales, sales = (Sales/1000))
dept_sales <- select(dept_sales, "sales")
#Create time series
dept_ts <- ts(dept_sales[,1], start = c(1,1), frequency = 4)
#Plot Department Store Sales
yrange = range(dept_ts)
plot(c(1, 7), c(0, yrange[2]+10), type="n", main="Department Store Sales by Quarter", xlab="Year", ylab="$ Sales (000s)", xaxt="n", yaxt="n")
lines(dept_ts, lwd=2)
axis(1, at=seq(1, 7, 1), labels=format(seq(1,7,1)))
axis(2, at=seq(0, 110, 10),las=2)
#Establish the Training and Validation periods for dept_ts
nValid <- 4
nTrain <- length(dept_ts) - nValid
deptTRAIN <- window(dept_ts, start = c(1,1), end = c(1, nTrain))
deptVALID <- window(dept_ts, start = c(1, nTrain + 1), end = c(1, nTrain + nValid))
#Replot Department Store Sales with line distinctions for Training and Validation periods
plot(c(1, 7), c(0, yrange[2]+10), type="n", main="Department Store Sales by Quarter", xlab="Year", ylab="$ Sales (000s)", xaxt="n", yaxt="n")
lines(dept_ts, lwd=2)
axis(1, at=seq(1, 7, 1), labels=format(seq(1,7,1)))
axis(2, at=seq(0, 110, 10),las=2)
abline(v=6)
arrows(1, 105, 6, 105, code=3, length=0.1)
text(3, 108, "Training")
abline(v=7)
arrows(6, 105, 7, 105, code=3, length=0.1)
text(6.5, 108, "Validation")
#Display Training and Validation Period data
pander(deptTRAIN, caption = "Training Period", split.table = Inf, row.names = paste('Year', 1:5))
| Q1 | Q2 | Q3 | Q4 | |
|---|---|---|---|---|
| Year 1 | 50.15 | 49.33 | 57.05 | 76.78 |
| Year 2 | 48.62 | 50.9 | 58.52 | 77.69 |
| Year 3 | 50.86 | 53.03 | 58.85 | 79.66 |
| Year 4 | 51.64 | 54.12 | 65.68 | 85.17 |
| Year 5 | 56.41 | 60.03 | 71.49 | 92.18 |
pander(deptVALID, caption = "Validation Period", split.table = Inf, row.names = "Year 6")
| Q1 | Q2 | Q3 | Q4 | |
|---|---|---|---|---|
| Year 6 | 60.8 | 64.9 | 77 | 103.3 |
#Call the hw() method
deptHWmult <- hw(deptTRAIN, seasonal="multiplicative", h=4)
autoplot(deptTRAIN, main = "Holt-Winters Forecast on Department Store Sales by Quarter", ylim = c(0,110)) + xlab("Year") + ylab("$ Sales (000s)") + autolayer(deptHWmult, PI=FALSE, series="Holt-Winters Multiplicative") + autolayer(deptVALID, series="Validation Data")
summary(deptHWmult)
##
## Forecast method: Holt-Winters' multiplicative method
##
## Model Information:
## Holt-Winters' multiplicative method
##
## Call:
## hw(y = deptTRAIN, h = 4, seasonal = "multiplicative")
##
## Smoothing parameters:
## alpha = 0.3845
## beta = 0.155
## gamma = 0.4869
##
## Initial states:
## l = 58.0694
## b = 0.0504
## s=1.3101 0.9798 0.857 0.8531
##
## sigma: 0.0252
##
## AIC AICc BIC
## 95.02871 113.02871 103.99030
##
## Error measures:
## ME RMSE MAE MPE MAPE MASE
## Training set 0.3902652 1.470429 0.9580152 0.6432333 1.611164 0.307401
## ACF1
## Training set -0.1212042
##
## Forecasts:
## Point Forecast Lo 80 Hi 80 Lo 95 Hi 95
## 6 Q1 61.34083 59.35612 63.32554 58.30547 64.37619
## 6 Q2 65.01693 62.63655 67.39732 61.37645 68.65742
## 6 Q3 76.88570 73.61530 80.15610 71.88405 81.88735
## 6 Q4 99.53246 94.56957 104.49534 91.94238 107.12253
ii. Using the forecasts for the validation set that you came up with in i. above, compute the MAPE values for the forecasts of quarters 21-22.
#####Table 1.1 - Table w/ MAPE value for Q21#####
#Computing the MAPE values for Q21 from the Validation set
accuracy(deptHWmult, deptVALID, test = 1)
## ME RMSE MAE MPE MAPE MASE
## Test set -0.5408292 0.5408292 0.5408292 -0.8895218 0.8895218 0.1735374
## ACF1 Theil's U
## Test set NA NaN
#Computing the MAPE value for Q22 from the Validation set
accuracy(deptHWmult, deptVALID, test = 2)
## ME RMSE MAE MPE MAPE MASE ACF1
## Test set -0.1169336 0.1169336 0.1169336 -0.180175 0.180175 0.03752081 NA
## Theil's U
## Test set 0.02852039
#Reproduce the fit, as displayed in the book
autoplot(deptTRAIN, main = "Exp Smoothing: Actual Vs Forecast", series="Actual", ylim = c(0,110)) + xlab("Year") + ylab("$ Sales (000s)") + autolayer(deptHWmult$fitted, series="Forecast")
#Reproduce the residuals, as displayed in the book
autoplot(resid(deptHWmult), main = "Exp Smoothing Forecast Errors", xlab = "Year", ylab = "Forecast Error")
This model proves to be suitable for forecasting Q21 and Q22 based on several factors, including the low MAPE values and the model’s strong ability to fit to the data set.
Part 1. Remove the trend and seasonal pattern.
dept_dtrend1 <- diff(dept_ts, lag=1)
dept_dseas1 <- diff(dept_dtrend1, lag=4)
par(mfrow = c(2,1))
plot(dept_dtrend1, main = "Department Store Sales by Quarter, Detrended", xlab="Year", ylab=NULL, yaxt='n')
plot(dept_dseas1, main = "Department Store Sales by Quarter, Detrended THEN Deseasonalized", xlab="Year", ylab=NULL, yaxt='n')
Part 2. Remove seasonal pattern and then the trend.
dept_dseas2 <- diff(dept_ts, lag=4)
dept_dtrend2 <- diff(dept_dseas2, lag=1)
par(mfrow = c(2,1))
plot(dept_dseas2, main = "Department Store Sales by Quarter, Deseasonalized", xlab="Year", ylab=NULL, yaxt='n')
plot(dept_dtrend2, main = "Department Store Sales by Quarter, Deseasonalized THEN Detrended", xlab="Year", ylab=NULL, yaxt='n')
Part 3. Comparison of final products, as evidence to support claim.
Whether you deseasonalize first and then detrend, or vice versa, the end product returns similar results, as shown below following the processes from both Parts 1 & 2 above.
par(mfrow = c(2,1))
plot(dept_dseas1, main = "Department Store Sales by Quarter, Detrended THEN Deseasonalized", xlab="Year", ylab=NULL, yaxt='n')
plot(dept_dtrend2, main = "Department Store Sales by Quarter, Deseasonalized THEN Detrended", xlab="Year", ylab=NULL, yaxt='n')
# The meanf() function will return a point forecast (and prediction interval) for the stationary process. H=2 represents the parameter to forecast out 2 periods, Q21 and Q22.
pointForecasts <- meanf(diff(diff(deptTRAIN, lag=4), lag=1), h=2)
# Set up an empty vector
realForecasts <- vector()
for (i in 1:nValid) {
if(i == 1) {
realForecasts[i] <- pointForecasts$mean[i] + deptTRAIN[(nTrain+i)-nValid] + (deptTRAIN[nTrain] - deptTRAIN[nTrain - nValid])
} else {
realForecasts[i] <- pointForecasts$mean[i] + deptTRAIN[(nTrain+i)-nValid] + (realForecasts[i-1] - deptTRAIN[nTrain+i-1-nValid])
}
}
# See what they look like
realForecasts
## [1] 63.9822 68.1774 NA NA
From Holt-Winters’:
accuracy(deptHWmult, deptVALID, test=1:2)
## ME RMSE MAE MPE MAPE MASE
## Test set -0.3288814 0.3912606 0.3288814 -0.5348484 0.5348484 0.1055291
## ACF1 Theil's U
## Test set -0.5 0.02852039
From Double-Differencing:
accuracy(realForecasts, deptVALID)
## ME RMSE MAE MPE MAPE ACF1 Theil's U
## Test set -3.2298 3.230151 3.2298 -5.141902 5.141902 -0.5 0.13634
Based on the lower MAPE scores for the Holt-Winters’ model, when compared to those of Double-Differencing, (along with the fit results of plotting against the Validation set) I would choose the Holt-Winters’ forecasting. Not only was it easier to compute, it was also easier to immediately interpret and understand, and easy to visualize.
dept_snaive <- snaive(deptTRAIN,h=nValid)
plot(c(1, 7), c(0, yrange[2]+10), type="n", main="Department Store Sales by Quarter", xlab="Year", ylab="$ Sales (000s)", xaxt="n", yaxt="n")
lines(dept_ts, lwd=2)
axis(1, at=seq(1, 7, 1), labels=format(seq(1,7,1)))
axis(2, at=seq(0, 110, 10),las=2)
lines(deptHWmult$mean,col="red", lwd=2)
lines(dept_snaive$mean, col="orange", lwd=2)
legend("bottomright", c("Actual","Holt-Winters'", "Seasonal Naive"),bty="n", lwd=2, col=c("black", "red","orange"))
An even simpler approach would be looking at the seasonal naive forecast method. It is clear that there is seasonality to the data, which would lend itself well to the ease of running the seasonal naive. However, while it may be simpler, it is important to note from the plotted chart above that the Seasonal Naive is consistently underforecasting all of Year 6. In this particular instance, the Holt-Winters’ model fits the Validation period much better. However, one should also look at the MAPE values to confirm/deny the assumption. (side note: based on the MAPE values below, the Holt-Winter can be confirmed as a better model)
From Seasonal Naive:
accuracy(dept_snaive, deptVALID)
## ME RMSE MAE MPE MAPE MASE
## Training set 2.92525 3.878784 3.11650 4.344358 4.737739 1.000000
## Test set 6.48225 7.032176 6.48225 8.170540 8.170540 2.079978
## ACF1 Theil's U
## Training set 0.54855812 NA
## Test set 0.01334402 0.4705349
From Holt-Winters’:
accuracy(deptHWmult, deptVALID)
## ME RMSE MAE MPE MAPE MASE
## Training set 0.3902652 1.470429 0.9580152 0.6432333 1.611164 0.3074010
## Test set 0.8145199 1.923091 1.1434013 0.6891346 1.223983 0.3668864
## ACF1 Theil's U
## Training set -0.12120425 NA
## Test set -0.01525507 0.1251457
You are hired to obtain short-term forecasts (2-3 months ahead) for each of the six series (red, rose, sweet white, dry white, sparkling, and fortified), and this task will be repeated every month.
On first glance it is apparent that all six series have some seasonality to them. Some of the series have distinct trends to them (including rose, red, sparkling, dry white, and fortified), while the sweet wine has less distinct seasonality … particularly up until (what appears to be a massive surge in consumption in) 1986. Given that we must select a forecasting method for all six series and it must be repeated on a monthly basis, then I would be forced to select a method that can accommodate seasonality and trend. This would rule out Moving Averages and Simple Exponential Smoothing, as well as the Holt method — which does not lend itself well to account for seasonality.
#Plot the residuals from HW
autoplot(resid(wineHWmult), main = "Residuals from Holt-Winters' multiplicative method", xlab = "Year", ylab = "Forecast Error")
checkresiduals(wineHWmult)
##
## Ljung-Box test
##
## data: Residuals from Holt-Winters' multiplicative method
## Q* = 39.543, df = 8, p-value = 3.897e-06
##
## Model df: 16. Total lags used: 24
i. Based on this plot, which of the following statements are reasonable?
- Decembers (month 12) are not captured well by the model.
This statement is reasonable based on the drastic fluctuation on December residuals. - There is a strong correlation between sales on the same calendar month.
While there is seasonality to sales of Fortified Wine, I’m hesitant to declare that this is entirely reasonable. Considering the flucuations in residuals, it’s a little unclear. Especially given that there are some months that have a strong correlation between them. - The model does not capture the seasonality well
This statement doesn’t feel too reasonable. While perhaps not the most effective at capturing seasonality based on what we’ve seen in examples past, the Holt-Winters’ method is intended to be able to capture both trend and seasonality. Both of which are present in the Fortified Wine data set. - We should first deseasonalize the data and then apply Holt-Winters’ exponential smoothing.
Because Holt-Winters’ is equipped to capture both trend and seasonality then this statement feels unreasonable.
ii. How can you handle the above effect with exponential smoothing?
We are looking at month to month seasonality, but I wonder whether perhaps we extend that seasonality to explore annual seasonality. Given the impact that external factors may have on wine (i.e., temperature, vineyard conditions, etc.) it feels that an annual batch may skew the data set and potentially provide misleading results. I wonder how many times these participating Australian vineyards heard that their 1991 was a bottle of crap?