knitr::opts_chunk$set(echo = TRUE)
Assume that we apply a moving average to a series, using a very short window span. If we wanted to achieve an equivalent result using simple exponential smoothing, what value would the smoothing constant take?
In section 5.4, the formula for determining the moving average window (w) from a simple exponential smoothing alpha is:
w = 2/alpha - 1
Solving for alpha gives the formula:
alpha = 2/(w+1)
It should also be noted that as the moving average window, w, gets smaller, the exponential smoothing value alpha gets closer to 1. Conversely, as the exponential smoothing value of alpha gets closer to 0, the moving average window, w, gets larger.
The file DepartmentStoreSales.xls contains data on the quarterly sales for a department store over a 6-year period.
# Load libraries and set environment options
library(dplyr)
library(tidyr)
library(knitr)
library(readxl)
library(forecast)
library(ggplot2)
library(zoo)
# Use this option to supress scientific notation in printing values
options(scipen = 10, digits = 2)
a. Which of the following methods would not be suitable for forecasting this series. Explain why or why not for each one.
We start by decomposing the time series to get a better understand the data.
First, plot the time series, then run decompose.
# Decompose and plot the time series for initial analysis
dec_DSSales <- decompose(DSSales.ts)
yrange=range(DSSales.ts)
plot(c(1,6), yrange, type="n", xlab="YEAR", ylab="Sales in $10K", bty="l", yaxt="n", main = "Department Store Sales")
lines(DSSales.ts, bty="l")
axis(1, at=seq(1,6,1), labels=format(seq(1,6,1)))
axis(2, at=seq(50000, 110000, 20000), labels=format(seq(50, 110,20)), las=2)
#autoplot(DSSales.ts, bty="l")
autoplot(dec_DSSales)
Decomposition shows that the time series is additive, seasonal, and has an upward trend.
Now, consider each of the suggested forecasting methods:
Moving average of raw series
Moving average would not be a good choice for forecasting since it does not forecast seasonality. It would show the trend, but not seasonality.
Moving average of de-seasonalized series
As stated for the moving average of the raw series, de-seasonalizing the series would be better for showing the trend, but seasonality would be lost.
Simple exponential smoothing of the raw series
Simple exponential smoothing does not work with trends or seasonality, so it would not be a good choice.
Double exponential smoothing of the raw series
Double exponential smoothing, also known as Holt’s linear trend model, would better handle the upward trend, but not the seasonality.
Holt-Winter’s exponential smoothing of the raw series
Holt-Winter’s exponential smoothing would be the best choice of all those listed here. It can handle both trend and seasonality.
b. A forecaster was tasked to generate forecasts for 4 quarters ahead. She therefore partitioned the data so that the last 4 quarters were designated as the validation period. The forecaster approached the forecasting task by using multiplicative Holt-Winter’s exponential smoothing. Specifically, you should call the hw function with the parameter seasonal=“multiplicative”. Let the method pick the appropriate parameters for alpha, beta, and gamma.
i. Run this method on the data. Request the forecasts on the validation period. (Note that the forecasted values for the validation set will be different than what the book shows.)
ii. Using the forecasts for the validation set that you came up with in i. above, compute the MAPE values for the forecasts of quarters 21-22.
Quarter Actual Forecast Error MAPE
21 60800 61335 -535 .88
22 64900 64971 -71 .11
# Create Training files, then run naive, seasonal naive, and mean forecasts
valid_length<-4
train_length<-length(DSSales.ts)-valid_length
DSSales_train<-window(DSSales.ts, start=c(1,1), end=c(5,4))
DSSales_valid <- window(DSSales.ts, start=c(6,1), end=c(6,4))
DSSales_HWfc <- hw(DSSales_train, seasonal = "multiplicative", h=4)
summary(DSSales_HWfc)
##
## Forecast method: Holt-Winters' multiplicative method
##
## Model Information:
## Holt-Winters' multiplicative method
##
## Call:
## hw(y = DSSales_train, h = 4, seasonal = "multiplicative")
##
## Smoothing parameters:
## alpha = 0.4032
## beta = 0.1429
## gamma = 0.4549
##
## Initial states:
## l = 57401.8119
## b = 605.4045
## s=1.3 0.98 0.86 0.86
##
## sigma: 0.026
##
## AIC AICc BIC
## 372 390 381
##
## Error measures:
## ME RMSE MAE MPE MAPE MASE ACF1
## Training set 247 1500 978 0.35 1.7 0.31 -0.079
##
## Forecasts:
## Point Forecast Lo 80 Hi 80 Lo 95 Hi 95
## 6 Q1 61335 59303 63367 58228 64442
## 6 Q2 64971 62529 67413 61237 68706
## 6 Q3 76718 73377 80059 71608 81828
## 6 Q4 99421 94372 104469 91700 107141
#head(DSSales_valid)
print("Accuracy from Holt-Winter's Forecast")
## [1] "Accuracy from Holt-Winter's Forecast"
accuracy(DSSales_HWfc, DSSales_valid)
## ME RMSE MAE MPE MAPE MASE ACF1 Theil's U
## Training set 247 1500 978 0.35 1.7 0.31 -0.0788 NA
## Test set 897 1982 1200 0.79 1.3 0.39 0.0095 0.13
c. The fit and the residuals were displayed in the book. Please reproduce them with R code. Using all the information from (b) and your generated figures, is this model suitable for forecasting quarters 21 and 22?
# Plot the forecasts
autoplot(DSSales_train, xlab="Year - Quarter", ylab="Sales in $10K") + autolayer(DSSales_HWfc$fitted, series = "Fitted") + autolayer(DSSales_valid, series = "Validation") + autolayer(DSSales_HWfc, series = "Forecast")
### Holt-Winter’s forecast residuals
# Plot the residuals
plot(DSSales_HWfc$residuals, col=2, lwd=2, bty="l",xant="n", xlab="Year - Quarter", ylab="Forecast Error", yaxt="n")
axis(2, las=2)
d. Another analyst decided to take a much simpler approach, and instead of using exponential smoothing he used differencing. Use differencing to remove the trend and seasonal pattern. Which order works better: first removing trend and then seasonality or the opposite order? Show the progression of time plots as you difference the data and each final series to provide evidence in support of your answer.
# Use Differencing method
# Set up the plot window (2 x 2)
par(mfrow = c(2,3))
# Plot the original time series
plot(DSSales.ts, xlab="Year - Quarter", xant="n", ylab="Sales in $10K", yaxt="n", bty="l", main="Department Store Sales")
axis(1, at=seq(1,6,1), labels=format(seq(1,6,1)))
axis(2, at=seq(50000, 110000, 20000), labels=format(seq(50, 110,20)), las=2)
# Plot the lag-4 time series
plot(diff(DSSales.ts, lag=4), xlab="Year - Quarter", xant="n", ylab="LAG - 4", bty="l", main="Lag - 4 Difference")
axis(1, at=seq(1,6,1), labels=format(seq(1,6,1)))
# Plot the lag-1 difference
plot(diff(DSSales.ts, lag=1), xlab="Year - Quarter", xant="n", ylab="LAG - 4", bty="l", main="Lag - 1 Difference")
axis(1, at=seq(1,6,1), labels=format(seq(1,6,1)))
# Plot the lag-4, then lag-1 difference
DSSales_lag41 <- diff(diff(DSSales.ts, lag=4), lag=1)
plot(DSSales_lag41, xlab="Year - Quarter", xant="n", ylab="lag 4, then 1", bty="l", main="Lag 4, then Lag 1")
axis(1, at=seq(1,6,1), labels=format(seq(1,6,1)))
# Plot the lag 1, then lag 4 difference
DSSales_lag14 <- diff(diff(DSSales.ts, lag=1), lag=4)
plot(DSSales_lag14, xlab="Year - Quarter", xant="n", ylab="lag 1, then 4", bty="l", main="Lag 1, then Lag 4")
axis(1, at=seq(1,6,1), labels=format(seq(1,6,1)))
Examining the charts for lag 4, then lag 1 and lag 1, then lag 4, they are identical. It apears there is no difference in sequence.
e. Forecast quarters 21-22 using the average of the double-differenced series from (d). Remember to use only the training period (until quarter 20), and to adjust back for the trend and seasonal pattern.
# Create a forecast using the double differenced data
valid_length<-2
old_valid_length<-4
train_length<-length(DSSales.ts)-old_valid_length
DSSales_train<-window(DSSales.ts, start=c(1,1), end=c(5,4))
DSSales_valid2 <- window(DSSales.ts, start=c(6,1), end=c(6,2))
DSSales_dfc <- meanf(diff(diff(DSSales_train, lag=4), lag=1), h=2)
print("Differenced forecast for quartes 21 & 22")
## [1] "Differenced forecast for quartes 21 & 22"
DSSales_dfc
## Point Forecast Lo 80 Hi 80 Lo 95 Hi 95
## 6 Q1 569 -2117 3255 -3714 4853
## 6 Q2 569 -2117 3255 -3714 4853
f. Compare the forecasts from (e) to the exponential smoothing forecasts found in (b). Which of the two forecasting methods would you choose? Explain.
# De-difference the forecast
ddif_DSSales_dfc <-vector()
for (i in 1:valid_length) {
if(i == 1) {
ddif_DSSales_dfc[i] <- DSSales_dfc$mean[i] + DSSales_train[(train_length+i)-old_valid_length] + (DSSales_train[train_length] - DSSales_train[train_length - old_valid_length])
}
else {
ddif_DSSales_dfc[i] <- DSSales_dfc$mean[i] + DSSales_train[(train_length+i)-old_valid_length] + (ddif_DSSales_dfc[i-1] - DSSales_train[train_length+i-1-old_valid_length])
}
}
#print("De-differenced forecast for quartes 21 & 22")
ddif_DSSales_dfc
## [1] 63982 68177
print("Validation set for quartes 21 & 22")
## [1] "Validation set for quartes 21 & 22"
DSSales_valid2
## Qtr1 Qtr2
## 6 60800 64900
print("Check accuracy of De-differenced forecast")
## [1] "Check accuracy of De-differenced forecast"
accuracy(ddif_DSSales_dfc, DSSales_valid2)
## ME RMSE MAE MPE MAPE ACF1 Theil's U
## Test set -3230 3230 3230 -5.1 5.1 -0.5 0.8
The De-differenced forecast is NOT a good model for this time series.
g. What is an even simpler approach that should be compared as a baseline? Complete that comparison.
An even simler approach would be to use the ets function Ets will compute the correct model based on its determination of trend and seasonality.
# Create a histogram of the forecast errors
DSSales_ETS <- ets(DSSales_train)
summary(DSSales_ETS)
## ETS(M,N,M)
##
## Call:
## ets(y = DSSales_train)
##
## Smoothing parameters:
## alpha = 0.9907
## gamma = 0.0001
##
## Initial states:
## l = 59697.8836
## s=1.3 1 0.86 0.84
##
## sigma: 0.025
##
## AIC AICc BIC
## 368 377 375
##
## Training set error measures:
## ME RMSE MAE MPE MAPE MASE ACF1
## Training set 546 1597 1338 0.82 2.1 0.43 -0.23
DSSales_ETSFC <- forecast(DSSales_ETS, h=4)
DSSales_ETSFC
## Point Forecast Lo 80 Hi 80 Lo 95 Hi 95
## 6 Q1 59250 57310 61189 56284 62216
## 6 Q2 60933 58125 63741 56638 65227
## 6 Q3 70335 66370 74299 64272 76398
## 6 Q4 92197 86200 98194 83026 101368
accuracy(DSSales_ETSFC, DSSales_valid)
## ME RMSE MAE MPE MAPE MASE ACF1 Theil's U
## Training set 546 1597 1338 0.82 2.1 0.43 -0.23 NA
## Test set 5830 6831 5830 7.02 7.0 1.87 0.21 0.48
Not as good a forecast as the double exponential smoothing.
Figure 5.14 shows time plots of monthly sales of six types of Australian wines (red, rose, sweet white, dry white, sparkling, and fortified) for 1980-1994. Data available in AustralianWines.xls. The units are thousands of liters. You are hired to obtain short-term forecasts (2-3 months ahead) for each of the six series, and this task will be repeated every month.
a) Which smoothing method would you choose if you had to choose the same method for forecasting all series? Why?
Given that forecasting needs to be done on products that although similar (all wines), each could have very different trends and seasonality. If the forecasting has to be repeated every month, and a single method must be used for all, I would use the ETS method. The ETS function selects the “best fit” forecast model for the given data. Then the forecast can be run using that model.
b) Fortified wine has the largest market share of the six types of wine. You are asked to focus on fortified wine sales alone and produce as accurate a forecast as possible for the next two months.
i. Start by partitioning the data using the period until Dec - 1993 as the training period.
ii. Apply Hold-Winter’s exponential smoothing (with multiplicative seasonality) to sales.
First, create the fortified wine sales time series, and decompose the time series to get an initial look at the data.
# Create a Time Series and Line Model of the data
WineSales.ts<-ts(WineSales$Fortified, start=c(1980,1),end=c(1984,4), frequency = 12)
#WineSales_lm<-tslm(WineSales.ts ~ trend+I(trend^2))
# Decompose and plot the time series for initial analysis
dec_WineSales <- decompose(WineSales.ts)
yrange=range(WineSales.ts)
plot(c(1980, 1984), yrange, type="n", xlab="YEAR", xaxt="n", ylab="Sales in Thousans of Liters", bty="l", yaxt="n")
lines(WineSales.ts, bty="l", main="Fortified Wine Sales")
axis(1, at=seq(1980,1984, 1), labels=format(seq(1980,1984, 1)))
axis(2, at=seq(1500, 6000, 500), las=2)
autoplot(dec_WineSales)
# Create Training files and validation files
valid_length<-4
train_length<-length(WineSales.ts)-valid_length
WineSales_train<-window(WineSales.ts, start=c(1080,1), end=c(1983,12))
WineSales_valid <- window(WineSales.ts, start=c(1984,1), end=c(1984,4))
#Apply Hold-Winter's exponential smoothing (with multiplicative seasonality) to sales
WineSales_HWfc <- hw(WineSales_train, seasonal = "multiplicative", h=4)
summary(WineSales_HWfc)
##
## Forecast method: Holt-Winters' multiplicative method
##
## Model Information:
## Holt-Winters' multiplicative method
##
## Call:
## hw(y = WineSales_train, h = 4, seasonal = "multiplicative")
##
## Smoothing parameters:
## alpha = 0.0374
## beta = 0.0001
## gamma = 0.0001
##
## Initial states:
## l = 3905.7063
## b = -7.0187
## s=0.99 1 0.9 1 1.4 1.4
## 1.2 1.1 0.93 0.82 0.75 0.62
##
## sigma: 0.072
##
## AIC AICc BIC
## 754 774 786
##
## Error measures:
## ME RMSE MAE MPE MAPE MASE ACF1
## Training set -8.4 279 218 -0.74 5.9 0.61 0.2
##
## Forecasts:
## Point Forecast Lo 80 Hi 80 Lo 95 Hi 95
## Jan 1984 2210 2007 2413 1899 2520
## Feb 1984 2668 2422 2913 2292 3043
## Mar 1984 2896 2629 3162 2488 3303
## Apr 1984 3261 2960 3561 2801 3720
head(WineSales_valid)
## Jan Feb Mar Apr
## 1984 1954 2604 3626 2836
accuracy(WineSales_HWfc, WineSales_valid)
## ME RMSE MAE MPE MAPE MASE ACF1 Theil's U
## Training set -8.4 279 218 -0.74 5.9 0.61 0.20 NA
## Test set -3.4 443 369 -2.59 12.7 1.03 -0.43 0.55
c) Create a plot for the residuals from the Hold-Winter’s exponential smoothing.
# Plot the forecasts
autoplot(WineSales_train, xlab="Year", ylab="Sales in Thousans of Liters") + autolayer(WineSales_HWfc$fitted, series = "Fitted") + autolayer(WineSales_valid, series = "Validation") + autolayer(WineSales_HWfc, series = "Forecast")
# Plot the residuals
plot(WineSales_HWfc$residuals, col=2, lwd=2, bty="l",xant="n", xlab="Year", ylab="Forecast Error", yaxt="n")
axis(2, las=2)
i. Based on this plot, which of the following statements are reasonable? . Decembers (month 12) are not captured well by the model. . There is a strong correlation between sales on the same calendar month. . The model does not capture the seasonality well. . We should first deseasonalize the data and then apply Hold-Winter’s exponential smoothing.
I would agree with the statements: - There is a strong correlation between sales in the same calendar month
- The model does not handle seasonality well
ii. How can you handle the above effect with exponential smoothing?
A couple of options, use diff and de-diff to create a forecast without seasonality and then add seasonality back in. Or, try using ETS.