Problem 2: Relationship between Moving Average and Exponential Smoothing To achieve an equivalent result using simple exponential smoothing the smoothing constant should be equal to 2 over the window plus one:
\[\alpha = \frac{2}{w+1}\] The MA with window width w is approximately equal when this is true. With a very short window span we would want the smoothing constant to be close to 1. The closer alpha is to 1, the less impact past values have on the forecast.
Problem 5: Forecasting Department Store Sales
# Read in Department Store Sales file
StoreSales <- read.csv("DeptStoreSales.csv", stringsAsFactors = FALSE)
# Shows structure of StoreSales data frame
str(StoreSales)
## 'data.frame': 24 obs. of 2 variables:
## $ Quarter: int 1 2 3 4 5 6 7 8 9 10 ...
## $ Sales : int 50147 49325 57048 76781 48617 50898 58517 77691 50862 53028 ...
#Create time series object
SalesTS <- ts(StoreSales$Sales, start = c(1), frequency = 1)
salests <- ts(StoreSales$Sales, start = c(1), frequency = 4)
# Regenerate plot of time series object SalesTS shown in Figure 5.10
par(oma = c(0,1,0,0))
at_tick <- seq_len(length(SalesTS + 1))
plot((1:24), ylim = c(0,120000), xlim = c(1,24), type = "n", xlab = "Quarter", ylab = "Sales ($)", bty = "l", xaxt = "n", yaxt = "n", font.lab = 2)
points(SalesTS, pch = 19)
axis(2, at = seq(0, 120000, 20000), labels = format(seq(0, 120000, 20000)), las = 2, cex.axis = 0.5)
lines(SalesTS, bty = "l")
axis(1, at = at_tick, labels = FALSE)
axis(1, at = seq_along(SalesTS) - 0.5, tick = FALSE, labels = (1:24), cex.axis = 0.5)
# Create plot of time series object SalesTS
plot(c(1,24), ylim = c(40000,110000), xlim = c(1,24), type = "n", xlab = "Quarter", ylab = "Sales (in thousands)", bty = "l", xaxt = "n", yaxt = "n")
axis(1, at = seq(1,24,1), labels = format(seq(1,24,1)), cex.axis = 0.75)
axis(2, at = seq(40000, 110000, 10000), labels = format(seq(40, 110, 10)), las = 2, cex.axis = 0.75)
lines(SalesTS, bty = "l")
title("Department Store Quarterly Sales")
# Create year time series plot
plot(c(1,7), ylim = c(40000,120000), xlim = c(1,7), type = "n", xlab = "Year", ylab = "Sales (in thousands)", bty = "l", xaxt = "n", yaxt = "n")
box()
box(which = "plot")
minor.tick(nx = 3, ny = 2, tick.ratio = 0.5)
lines(salests, bty = "l")
axis(1, at = seq(1,7,1), labels = format(seq(1,7,1)))
axis(2, at = seq(40000, 120000, 20000), labels = format(seq(40, 120, 20)), las = 2)
Moving average of raw series MA of the raw series is not suitable for forecasting because it does not take seasonality or trend into consideration.
Moving average of deseasonalized series The MA is used for forecasting series without seasonality and trend. The deseasonalized series would still have trend and the MA would “lag behind”, which would result in under-forecasting.
Simple exponential smoothing of the raw series
SES of the raw series is not suitable for forecasting this series. LIke MA, SES is also used for series without seasonality and trend.
Double exponential smoothing of the raw series
Holt’s linear trend model is not suitable for forecsating this series because it only takes trend into account.
Holt-Winter’s exponential smoothing of the raw series
Holt-Winter’s exponential smoothing of the raw series is suitable for forecasting because it accounts for seasonality and trend.
Although only the last method would be suitable, differencing at lag-M, where M = 4 followed by lag-1, would remove the trend and/or seasonality, making the other methods suitable for forecasting the series.
# Set up the training and validation period lengths, where the validation length is equal to 4 quarters
valid.Length <- 4
train.Length <- length(salests) - valid.Length
sales.Train <- window(salests, end = c(1,train.Length))
sales.Valid <- window(salests, start = c(1, train.Length + 1))
ses.Sales <- ets(sales.Train, model = "ZZZ", alpha = 0.2, beta = 0.15, gamma = 0.05)
plot(c(1,7), ylim = c(40000,120000), xlim = c(1,7), type = "n", xlab = "Year", ylab = "Sales (in thousands)", bty = "l", xaxt = "n", yaxt = "n")
axis(1, at = seq(1,7,1), labels = format(seq(1,7,1)), cex.axis = 0.75)
axis(2, at = seq(40000, 120000, 20000), labels = format(seq(40, 120, 20)), las = 2, cex.axis = 0.75)
lines(salests, bty = "l")
ses.SalesPredictions <- forecast(ses.Sales, h = valid.Length, level = 0)
ses.SalesPredictions
## Point Forecast Lo 0 Hi 0
## 6 Q1 62115.16 62115.16 62115.16
## 6 Q2 65371.60 65371.60 65371.60
## 6 Q3 77076.87 77076.87 77076.87
## 6 Q4 102937.73 102937.73 102937.73
lines(ses.SalesPredictions$mean, col = "red", lwd = 2, lty = 2)
abline(v=6)
arrows(1, 100000, 6, 100000, code = 3, length = 0.1)
text(3.5, 110000, "Training", cex = 0.75)
abline(v=7)
arrows(6, 100000, 7, 100000, code = 3, length = 0.1)
text(6.5, 110000, "Validation", cex = 0.75)
ses.SalesPredictions
## Point Forecast Lo 0 Hi 0
## 6 Q1 62115.16 62115.16 62115.16
## 6 Q2 65371.60 65371.60 65371.60
## 6 Q3 77076.87 77076.87 77076.87
## 6 Q4 102937.73 102937.73 102937.73
Q21_accuracy <- accuracy(ses.SalesPredictions$mean[1], sales.Valid[1])
Q22_accuracy <- accuracy(ses.SalesPredictions$mean[2], sales.Valid[2])
combined_accuracy <- accuracy(ses.SalesPredictions$mean[1:2], sales.Valid[1:2])
Q21_accuracy
## ME RMSE MAE MPE MAPE
## Test set -1315.156 1315.156 1315.156 -2.163086 2.163086
Q22_accuracy
## ME RMSE MAE MPE MAPE
## Test set -471.6041 471.6041 471.6041 -0.7266627 0.7266627
combined_accuracy
## ME RMSE MAE MPE MAPE
## Test set -893.3803 987.939 893.3803 -1.444875 1.444875
pointForecasts <- meanf(diff(diff(sales.Train, lag = 4), lag = 1), h=4)
realForecasts <- vector()
for (i in 1:valid.Length) {
if(i == 1) {
realForecasts[i] <- pointForecasts$mean[i] + sales.Train[(train.Length+i)-valid.Length] + (sales.Train[train.Length] - sales.Train[train.Length - valid.Length])
} else {
realForecasts[i] <- pointForecasts$mean[i] + sales.Train[(train.Length+i)-valid.Length] + (realForecasts[i-1] - sales.Train[train.Length+i-1-valid.Length])
}
}
yrange <- range(SalesTS)
par(oma = c(0,2,0,0))
plot(c(1,20), yrange, type = "n", xlab = "Quarter", ylab = "Sales ($)", bty = "l", xaxt = "n", yaxt = "n")
lines(sales.Train, bty = "l")
axis(1, at = seq(1,20,1), labels = format(seq(1,20,1)), cex.axis = 0.5)
axis(2, at = seq(0, 100000, 10000), labels = format(seq(0, 100000, 10000)), las = 2, cex.axis = 0.5)
lines(ses.Sales$fitted, col = "red", lty = 2, lwd = 2)
legend(1,100000, c("Actual", "Forecast"), lwd = 2, lty = c(1,2), col = c("black", "red"), bty = "n")
title("Exp Smoothing: Actual Vs Forecast (Training Data)")
plot(c(1,6),yrange,type="n",xlab="Years",ylab="Sales (thousands)",bty="l",xaxt="n",yaxt="n")
lines(sales.Train,bty="l")
axis(1, at=seq(1,6,1), labels=format(seq(1,6,1)))
axis(2, at=seq(50000,110000,10000), labels=format(seq(50,110,10)), las=2)
lines(ses.Sales$fitted,col="red",lty=2,lwd=2)
legend(1,100000, c("Actual", "Forecast"),lwd=2, lty=c(1,2), col=c("black", "red"), bty="n")
title("Exp Smoothing: Actual Vs Forecast (Training Data)")
plot(ses.SalesPredictions$residuals,xlab="Years",ylab="Forecast Error",bty="n",xaxt="n",yaxt="n")
axis(1,pos=c(0,0), at=seq(1,6,1), labels=format(seq(1,6,1)))
axis(2, at=seq(-0.04,0.06,0.02), labels=format(seq(-0.04,0.06,0.02)), las=2)
abline(h = 0)
title("Exp Smoothing Forecasting Errors (Training Data)")
The model produced by the Holt-Winter’s method is suitable for forecasting quarters 21-22 as evidenced by the low MAPE for Q21 & Q22.
Double difference by removing trend and then seasonal pattern
# Set up the plot to have two rows and two columns
par(mfrow=c(2,2))
# Plot original series
plot(salests,ylab="Sales", xlab="Years",bty="l",main="Department Store Sales")
# Double difference (trend then seasonality)
# Plot lag-1 difference (de-trended)
plot(diff(salests, lag = 1), ylab="Lag-1", xlab="Years", bty="l", main="Lag-1 Difference")
# Plot lag-4 difference (deseasonalized)
plot(diff(salests, lag = 4), ylab="Lag-4", xlab="Years", bty="l", main="Lag-4 Difference")
# Double difference the lag-1 and lag-4 (de-trended then deseasonalized)
lag1ThenLag4 <- diff(diff(salests, lag=1),lag=4)
# Plot the double difference
plot(lag1ThenLag4, ylab="Lag-1, then Lag-4", xlab="Years", bty="l", main="Twice-Differenced (Lag-1, Lag-4)")
Double difference by removing seasonal and then trend pattern
# Set up the plot to have two rows and two columns
par(mfrow=c(2,2))
# Plot original series
plot(salests,ylab="Sales", xlab="Years",bty="l",main="Department Store Sales")
# Double difference (trend then seasonality)
# Plot lag-4 difference (deseasonalized)
plot(diff(salests, lag = 4), ylab="Lag-4", xlab="Years", bty="l", main="Lag-4 Difference")
# Plot lag-1 difference (de-trended)
plot(diff(salests, lag = 1), ylab="Lag-1", xlab="Years", bty="l", main="Lag-1 Difference")
# Double difference the lag-4 and then lag-1 (deseasonalized and detrended)
lag4ThenLag1 <- diff(diff(salests, lag=4),lag=1)
# Plot the double difference
plot(lag4ThenLag1, ylab="Lag-4, then Lag-1", xlab="Years", bty="l", main="Twice-Differenced (Lag-4, Lag-1)")
Using the meanf() function will show the point forecast and prediction interval for the stationary process produced from double differencing.
pointForecasts <- meanf(diff(diff(sales.Train, lag=4), lag=1), h=4)
pointForecasts
## Point Forecast Lo 80 Hi 80 Lo 95 Hi 95
## 6 Q1 569.2 -2116.935 3255.335 -3714.114 4852.514
## 6 Q2 569.2 -2116.935 3255.335 -3714.114 4852.514
## 6 Q3 569.2 -2116.935 3255.335 -3714.114 4852.514
## 6 Q4 569.2 -2116.935 3255.335 -3714.114 4852.514
Adjust back for the trend and seasonal pattern.
realForecasts <- vector()
for (i in 1:valid.Length) {
if(i == 1) {
realForecasts[i] <- pointForecasts$mean[i] + sales.Train[(train.Length+i)-valid.Length] + (sales.Train[train.Length] - sales.Train[train.Length - valid.Length])
} else {
realForecasts[i] <- pointForecasts$mean[i] + sales.Train[(train.Length+i)-valid.Length] + (realForecasts[i-1] - sales.Train[train.Length+i-1-valid.Length])
}
}
# plot
par(mfrow = c(1,1))
plot(realForecasts, type="l", bty="l")
realForecasts
## [1] 63982.2 68177.4 80201.6 101467.8
Plot the double-differenced forecasts against the actual
# Set the y range to help set up the plot
yrange = range(salests)
# Set up the plot
plot(c(1,7),yrange,type="n",xlab="Years",ylab="Sales (thousands)",bty="l",xaxt="n",yaxt="n")
# Add the x and y axes
axis(1, at=seq(1,7,1), labels=format(seq(1,7,1)))
axis(2, at=seq(40000,120000,20000), labels=format(seq(40,120,20)), las=2)
# Add the "visual breakpoints"
abline(v=6)
arrows(1, 100000, 6, 100000, code=3, length=0.1)
text(3.5, 102000, "Training")
abline(v=7)
arrows(6, 100000, 7, 100000, code=3, length=0.1)
text(6.5, 102000, "Validation")
# Add the time series for training set
lines(salests,bty="l")
# Add forecasts for validation period
lines(seq(6, 7-1/valid.Length, 1/valid.Length), realForecasts, col="red", lwd=2, lty=2)
# Add a legend to the plot
legend(1,100000, c("Actual", "Double Differenced"),lwd=2, lty=c(1,2), col=c("black", "red"), bty="n")
# Take a look at how good the forecasts are by calling the accuracy () function
accuracy(realForecasts, sales.Valid)
## ME RMSE MAE MPE MAPE ACF1 Theil's U
## Test set -1948.75 2942.41 2883.35 -3.159236 4.063656 -0.07650809 0.1948234
The Holt-Winter’s method gives a lower MAPE, so I would use that instead of the double differenced method.
The trailing moving average is a simple approach that can be used as a baseline. The last value of the time series is the forecasted value for the entire validation period. The seasonal naive forecast is another simple approach that can be used as a baseline. Naive methods are often used as baselines because of their simplicity.
# Find the trailing moving average based on the training set
trailingMA <- rollmean(sales.Train, k=4, align = "right")
trailingMA.Forecast <- tail(trailingMA, 1)
trailingForecasts <- ts(rep(trailingMA.Forecast, valid.Length), start = c(1,train.Length + 1), freq = 4)
# Find the naive forecast
salesNaive <- snaive(sales.Train,h=valid.Length)
yrange = range(salests)
plot(c(1,7), yrange, type = "n", xlab = "Year", ylab = "Sales $ (in thousands)", bty = "l", xaxt = "n", yaxt = "n")
# Add the original time series
lines(salests, bty = "l")
# Add x and y axes
axis(1, at=seq(1,7,1), labels=format(seq(1,7,1)))
axis(2, at=seq(40000,120000,20000), labels=format(seq(40,120,20)), las=2)
# Add the "visual breakpoints"
abline(v=6)
arrows(1, 95000, 6, 95000, code=3, length=0.1)
text(3.5, 100000, "Training", cex = 0.75)
abline(v=7)
arrows(6, 95000, 7, 95000, code=3, length=0.1)
text(6.5, 100000, "Validation", cex = 0.75)
# Add trailing moving average and seasonal naive forecasts
lines(trailingForecasts, col = "red", lty = 2)
lines(salesNaive$mean, col = "blue", lty = 3)
# Add legend
legend(1,94000, c("Actual", "Trailing MA", "Naive"),lwd=2, lty=c(1,2,3), col=c("black", "red", "blue"), bty="n", cex = 0.75)
accuracy(trailingForecasts, sales.Valid)
## ME RMSE MAE MPE MAPE ACF1 Theil's U
## Test set 6482.25 17815.72 13658.5 4.553724 16.09045 0.1722954 1.147438
accuracy(salesNaive, sales.Valid)
## ME RMSE MAE MPE MAPE MASE
## Training set 2925.25 3878.784 3116.50 4.344358 4.737739 1.000000
## Test set 6482.25 7032.176 6482.25 8.170540 8.170540 2.079978
## ACF1 Theil's U
## Training set 0.54855812 NA
## Test set 0.01334402 0.4705349
Problem 8: Forecasting Australian Wine Sales
AusWines <- read.csv("AustralianWines.csv", stringsAsFactors = FALSE)
Which smoothing method would you choose if you had to choose the same method for forecasting all series? Why? Since each series is different and having trend and seasonality differs between them I would choose the Holt-Winter’s method, which accounts for seasonality and trend. If one or neither is present, the model can be adjusted to account for this (i.e. use of smoothing constants alph, beta and gamma).
Produce as accurate a forecast as possible for the next two months for fortified wine sales.
fortifiedTS <- ts(AusWines$Fortified, start=c(1980,1), frequency=12)
yrange <- range(fortifiedTS)
xrange <- range(AusWines$Month)
plot(fortifiedTS, xlab = "Year", ylab = "Fortified Wine Sales")
valid.Length <- 12
train.Length <- length(fortifiedTS) - valid.Length
fortified.Train <- window(fortifiedTS,end=c(1980,train.Length))
fortified.Valid <- window(fortifiedTS,start=c(1980,train.Length+1),end=c(1980,train.Length+valid.Length))
sesFortified <- ets(fortified.Train,model="ZZZ",restrict=FALSE, allow.multiplicative.trend = TRUE)
sesFortified
## ETS(M,M,M)
##
## Call:
## ets(y = fortified.Train, model = "ZZZ", restrict = FALSE, allow.multiplicative.trend = TRUE)
##
## Smoothing parameters:
## alpha = 0.0203
## beta = 1e-04
## gamma = 1e-04
##
## Initial states:
## l = 4039.3422
## b = 0.9965
## s=1.1306 1.0327 0.8852 0.9611 1.2728 1.3569
## 1.1372 1.1133 0.9631 0.862 0.6926 0.5926
##
## sigma: 0.0876
##
## AIC AICc BIC
## 2892.668 2896.541 2946.566
sesFortifiedOpt <- ets(fortified.Train, model = "MMM", alpha = 0.02)
sesFortifiedOpt
## ETS(M,M,M)
##
## Call:
## ets(y = fortified.Train, model = "MMM", alpha = 0.02)
##
## Smoothing parameters:
## alpha = 0.02
## beta = 1e-04
## gamma = 1e-04
##
## Initial states:
## l = 4081.2317
## b = 0.9964
## s=1.1253 1.044 0.8885 0.9475 1.2629 1.3737
## 1.1428 1.118 0.9565 0.8577 0.699 0.5841
##
## sigma: 0.0872
##
## AIC AICc BIC
## 2889.229 2892.650 2939.957
sesFortifiedPredictions <- forecast(sesFortifiedOpt,h=valid.Length,level=0)
sesFortifiedPredictions$mean[1:2]
## [1] 2051.842 1917.207
plot(sesFortifiedPredictions,bty="l")
accuracy(sesFortifiedPredictions$mean[1:2],fortified.Valid[1:2])
## ME RMSE MAE MPE MAPE
## Test set 231.4753 266.806 231.4753 9.986598 9.986598
plot(sesFortifiedPredictions$residuals,xlab="Year",ylab="Forecast Error",bty="n",xaxt="n")
axis(1, at=seq(1980,1995,1), labels=format(seq(1980,1995,1)))
abline(h = 0)
title("Exp Smoothing Forecasting Errors (Training Data)")
Based on this plot, which of the following statements are reasonable? The first three statements are reasonable. The last is not because the Holt-Winter’s method accounts for seasonality.
How can you handle the above effect with exponential smoothing? You can try adjusting the three smoothing constants to optimize the forecasts.
The double-seasonal Holt-Winter’s method by Taylor could be used if there are more than two seasonal cycles.