Relationship between Moving Average and Exponential Smoothing: Assume that we apply a moving average to a series, using a very short window span. If we wanted to achieve an equivalent result using simple exponential smoothing, what value should the smoothing constant take?
In order to get an equivalent result between a moving average with a short window span and a simple exponential smoothing forecast, the SES needs an \(\alpha\) value very close to 1. A useful rule of thumb equates a trailing moving average of window width \(w\) with SES using \(\alpha = 2/(w+1)\), since both then base the forecast on data of the same average age; as the window shrinks, \(\alpha\) approaches 1. A high \(\alpha\) value translates to a fast-learning model: the most recent values have the most impact on the forecast and older values matter less.
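The short sketch below (no data required; the window widths are just illustrative) evaluates this rule of thumb for a few small windows:
# Rule-of-thumb alpha that equates a trailing moving average of width w
# with simple exponential smoothing (same average age of data used)
w <- c(1, 2, 3, 5, 10)
data.frame(window = w, alpha = round(2 / (w + 1), 2))
# window = 1 gives alpha = 1; short windows keep alpha close to 1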
The pair of plots below shows the trailing moving average with a window of 2 in red. The blue line in each plot is the simple exponential smoothing forecast, using a very high and a very low \(\alpha\) value respectively, superimposed on the MA line. The left plot shows less red, demonstrating that the SES fit with the higher smoothing constant mirrors the moving average more closely.
library(forecast)  # ets(), forecast()
library(zoo)       # rollmean()
nValid <- 12
nTrain <- length(ridership.ts) - nValid
train.ts <- window(ridership.ts, start = c(1991, 1), end = c(1991, nTrain))
valid.ts <- window(ridership.ts, start = c(1991, nTrain + 1), end = c(1991, nTrain + nValid))
ma.trailing <- rollmean(train.ts, k = 2, align = "right")
forecast.ma.trailing <- forecast(ma.trailing, h = nValid, level = 0)
# Fix a high and a low smoothing constant for comparison
ses.ridership <- ets(train.ts, model = "ZZZ", alpha = .98)
ses.pred <- forecast(ses.ridership, h = nValid, level = 0)
ses.ridershipsmallAlpha <- ets(train.ts, model = "ZZZ", alpha = .02)
ses.predSmallAlpha <- forecast(ses.ridershipsmallAlpha, h = nValid, level = 0)
par(mfrow = c(1,2))
yrange <- range(ridership.ts)
plot(c(1991, 2004), yrange, type = "n", xlab = "Year", ylab = "Ridership", bty = "l", xaxt = "n", yaxt = "n", main = "Alpha = 0.98")
axis(1, at = seq(1991, 2004,2), labels = format(seq(1991, 2004, 2)))
# Format the y axis directly; las = 2 turns the labels perpendicular to the axis
axis(2, at = seq(1000, 3000, 300), labels = format(seq(1000, 3000, 300)), las = 2)
lines(forecast.ma.trailing$fitted , col = "red")
lines(ses.pred$fitted, col = "blue")
#lines(ses.predSmallAlpha$fitted, col = "green")
legend(1991, 2300, c(" ", "trailing moving average", "SES alpha = 0.98"), lty = c(1, 1, 1), col = c("white","red", "blue"), bty = "n")
plot(c(1991, 2004), yrange, type = "n", xlab = "Year", ylab = "Ridership", bty = "l", xaxt = "n", yaxt = "n", main = "Alpha = 0.02")
axis(1, at = seq(1991, 2004, 2), labels = format(seq(1991, 2004, 2)))
axis(2, at = seq(1000, 3000, 300), labels = format(seq(1000, 3000, 300)), las = 2)
lines(forecast.ma.trailing$fitted , col = "red")
#lines(ses.pred$fitted, col = "yellow")
lines(ses.predSmallAlpha$fitted, col = "blue")
legend(1991, 2300, c(" ", "trailing moving average", "SES alpha = 0.02"), lty = c(1, 1, 1), col = c("white", "red", "blue"), bty = "n")
Forecasting Department Store Sales
The best forecasting model can only be determined by considering the components of the Department Store Sales series, namely error, level, trend, and seasonality. While all series have error and level, not all have the latter two components. The first step in settling on the best model is to evaluate the series for trend and seasonality.
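Before fitting anything, a quick classical decomposition (a minimal sketch, assuming DeptSales.ts is already a quarterly ts object) gives a first look at both components:
# Split the series into trend, seasonal, and remainder components
plot(decompose(DeptSales.ts))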
Trend
validLength <- 4
trainLength <- length(DeptSales.ts) - validLength
salesTrain <- window(DeptSales.ts, end = c(1, trainLength))
salesValid <- window(DeptSales.ts, start = c(1, trainLength + 1))
#linear trend line
DeptSales.Linear <- tslm(DeptSales.ts ~ trend)
sales.lm.pred <- forecast(DeptSales.Linear, h = validLength, level = 0)
#quadratic trend line
DeptSales.quad <- tslm(DeptSales.ts ~ trend + I(trend^2))
sales.quad <- forecast(DeptSales.quad, h = validLength, level = 0)
DeptSales.Poly <- tslm(DeptSales.ts ~ poly(trend,2))
yrange <- range(DeptSales.ts)
plot(c(1,7), yrange, type = "n", xlab = "Year", ylab = "Dept Store Sales (thousands)", bty = "l", xaxt = "n", yaxt = "n", lwd = 2, main = "Trends in Department Store Sales \nQuarterly Data")
axis(1, at = seq(1, 7, 1), labels = format(seq(1, 7, 1)))
axis(2, at = seq(40000, 105000, 10000), labels = format(seq(40, 105, 10)), las = 2)
lines(DeptSales.ts, lwd = 2)
lines(DeptSales.Linear$fitted , col = "blue", lwd = 2)
lines(DeptSales.Poly$fitted, col = "red", lwd = 2)
legend(1, 100000, c("Dept Store Sales", "Linear Trend, R-sq = 0.3054", "2nd Order Poly Trend, R-sq = 0.3398"), lty = c(1,1,1), col = c("black", "blue", "red"), lwd = c(2,2,2), bty = "n")
Trend in the same series with the data aggregated annually
The second-order polynomial trend line closely follows the annually aggregated data.
# Aggregate by year and plot it
SalesYearly <- aggregate(DeptSales.ts, nfrequency=1, FUN=sum)
YRange <- range(SalesYearly)
plot(c(1,6), YRange, type = "n", xlab = "Year", ylab = "Dept Store Sales (thousands)", bty = "l", xaxt = "n", yaxt = "n", lwd = 2, main = "Trends in Department Store Sales \nAggregated Data")
axis(1, at = seq(1, 7, 1), labels = format(seq(1, 7, 1)))
axis(2, at = seq(0, 500000, 10000), labels = format(seq(0, 500, 10)), las = 2)
lines(SalesYearly, bty = "l", lwd = 2, lty = 1)
YrSales.Linear <- tslm(SalesYearly ~ trend)
YrSales.Poly <- tslm(SalesYearly ~ poly(trend,2))
lines(YrSales.Linear$fitted , col = "blue", lty = 1, lwd = 2)
lines(YrSales.Poly$fitted, col = "Red", lwd = 2, lty = 3)
legend(1, 300000, c("Dept Store Sales Aggregated", "Linear Trend, R-sq = 0.3054", "2nd Order Poly Trend, R-sq = 0.3398"), lty = c(1,1,3), col = c("black", "blue", "red"), lwd = c(2,2,2), bty = "n")
Looking at how each quarter behaves over time, there appears to be a trend in each plot. However, the R-squared values for the trend lines are only about 0.30, which makes the presence of this component ambiguous. Below is another plot where each line represents a quarter. Each quarter, and each year, shows growth, which indicates some trend, perhaps weak, in the Department Store Sales series. Without firm evidence of trend, I would evaluate candidate forecast models both with and without an adjustment for trend.
par(oma = c(0, 0, 0, 2))
xrange <- c(1,6)
yrange <- range(DeptSales.ts/1000)
plot(xrange, yrange, type="n", xlab="Year", ylab="Quarterly Sales", bty="l", las=1)
colors <- c("violet", "red", "green", "blue")
linetype <- c(1,1,1,1)
plotchar <- c(1:4)
axis(1, at=seq(1,6,1), labels=format(seq(1,6,1)))
for (i in 1:4) {
currentQu <- subset(DeptSales.ts/1000, cycle(DeptSales.ts/1000)==i)
lines(seq(1, 1 +length(currentQu)-1,1), currentQu, type="b", lwd=1,
lty=linetype[i], col=colors[i], pch=plotchar[i])
}
title("Sales Broken Out by Quarter")
legend(6, 90, 1:4, cex=0.8, col=colors, pch=plotchar, lty=linetype, title="Quarter", xpd=NA)
Seasonality
There is indisputably a seasonal component to the Dept. Store Sales series: each quarter behaves consistently from year to year. The dramatic quarterly fluctuations in sales within each year make it essential that the forecasting model accommodate seasonality.
library(ggplot2)
bold.text <- element_text(face = "bold", color = "black", size = 10)
bold.italic <- element_text(face = "bold.italic", color = "black", size = 11)
ggseasonplot(DeptSales.ts/1000, ylab = "Dept Store Sales (thousands)", xlab = "Quarter", main = "Seasonal Plot for Dept Store Sales") +
theme_bw() +
theme(panel.border = element_blank(),
panel.grid.major = element_blank(),
panel.grid.minor = element_blank(),
panel.background = element_blank(),
axis.line = element_line(colour = "black")) +
guides(fill = guide_legend(reverse = TRUE)) +
# axis.text.x would target the x axis labels only
theme(axis.text = bold.text) +
theme(title = bold.italic, axis.title = bold.italic) +
geom_line(size = 1) +
theme(plot.title = element_text(hjust = 0.5))
Which of the following methods would not be suitable for forecasting this series? Explain why or why not for each one.
**Moving average of raw series:**
This is not a suitable method for forecasting the raw Dept. Store Sales series because, without adjustments to the data, a moving average cannot accommodate trend or seasonality, two components present in this series.
trailingMA <- rollmean(salesTrain, k=4, align="right")
maForecastedValue <- tail(trailingMA, 1)
# Make this value a time series object for the validation period
maForecasts <- ts(rep(maForecastedValue, validLength), start=c(1,trainLength+1), freq=4)
#Generic Plot
yrange <- range(DeptSales.ts)
plot(c(1,7), yrange, type = "n", xlab = "Year", ylab = "Dept Store Sales (thousands)", bty = "l", xaxt = "n", yaxt = "n", lwd = 2, main = "Dept Store Sales and Moving Average Forecast")
axis(1, at = seq(1, 7, 1), labels = format(seq(1, 7, 1)))
axis(2, at = seq(40000, 105000, 10000), labels = format(seq(40, 105, 10)), las = 2)
lines(DeptSales.ts, lwd = 2)
lines(trailingMA, col="red", lty=2)
lines(maForecasts, lwd=2, col="red", lty=2)
legend(1, 105000, c("Dept Store Sales", "Trailing Moving Average", "Moving Average Forecast"), lty = c(1,2,2), col = c("black", "red", "red"), lwd = c(2,1,2), bty = "n")
**Moving average of deseasonalized series:**
This method is suitable for forecasting the Dept. Store Sales series. Moving average forecasting is meant for stationary data: the series should have neither trend nor seasonality. The six-year series of quarterly department store sales has a seasonal component; a trend component is less clear. Deseasonalizing the series is therefore essential for this method. To decide whether this particular series also needs detrending in addition to deseasonalizing with differencing, we can compare how well each version performs against the actual sales.
validLength <- 4
trainLength <- length(DeptSales.ts) - validLength
salesTrain <- window(DeptSales.ts, end = c(1, trainLength))
salesValid <- window(DeptSales.ts, start = c(1, trainLength + 1))
# Deseasonalize with lag-4 differencing, then take a trailing moving average
MAwithSeason <- rollmean(diff(salesTrain, lag = 4), k = 4, align = "right")
MASeasForecast <- forecast(MAwithSeason, h = validLength)
# Undo the lag-4 differencing: add each forecast back to the actual value
# from four quarters earlier
realForecasts <- vector()
for (i in 1:validLength){
realForecasts[i] <- MASeasForecast$mean[i] + salesTrain[(trainLength+i)-validLength]
}
The plot below shows the two forecasts against the actual series. The deseasonalized moving average forecast has a lower MAPE than the double-differenced moving average forecast.
validLength <- 4
trainLength <- length(DeptSales.ts) - validLength
salesTrain <- window(DeptSales.ts, end = c(1, trainLength))
salesValid <- window(DeptSales.ts, start = c(1, trainLength + 1))
# Deseasonalize (lag-4) and detrend (lag-1), then take a trailing moving average
MAwithSeasonTrend <- rollmean(diff(diff(salesTrain, lag = 4), lag = 1), k = 4, align = "right")
MASeasTrForecast <- forecast(MAwithSeasonTrend, h = validLength)
# Undo both differences: add back the value from four quarters earlier
# plus the most recent year-over-year change
realForecasts1 <- vector()
for (i in 1:validLength){
if(i == 1){
realForecasts1[i] <- MASeasTrForecast$mean[i] + salesTrain[(trainLength+i)-validLength] + (salesTrain[trainLength] - salesTrain[trainLength - validLength])
} else{
realForecasts1[i] <- MASeasTrForecast$mean[i] + salesTrain[(trainLength+i)-validLength] + (realForecasts1[i-1] - salesTrain[trainLength+i-1-validLength])
}
}
yrange <- range(DeptSales.ts)
plot(c(1,7), yrange, type = "n", xlab = "Year", ylab = "Dept Store Sales (thousands)", bty = "l", xaxt = "n", yaxt = "n", lwd = 2, main = "Actual and Moving Average Forecasts")
axis(1, at = seq(1, 7, 1), labels = format(seq(1, 7, 1)))
axis(2, at = seq(40000, 105000, 10000), labels = format(seq(40, 105, 10)), las = 2)
lines(DeptSales.ts, bty = "l", lwd = 2)
lines(seq(6, 7-1/validLength, 1/validLength), realForecasts, col="red", lwd=2, lty=2)
lines(seq(6, 7-1/validLength, 1/validLength), realForecasts1, col="blue", lwd=2, lty=3)
legend(1, 100000, c("Actual Sales", "De-Seasonalized, MAPE = 2.38916", "Double Differenced, MAPE = 3.908718"), lty = c(1,2,3), col = c("black", "red", "blue"), lwd = c(2,2,2), bty = "n")
**Simple exponential smoothing of the raw series:**
SES on the raw Dept Store Sales data is not an appropriate forecasting method. Simple exponential smoothing, like the moving average, is for time series with no trend or seasonality. However, if the series is first deseasonalized with lag-4 differencing, simple exponential smoothing may be a decent model.
To illustrate that the raw Dept Store Sales data is not adequate for simple exponential smoothing, we can run the ets() function specifying the “ANN” model (no trend, no seasonality) and let it determine the optimal alpha value. The resulting MAPE = 15.54219 is far worse than those of the other models we have tested.
validLength <- 4
trainLength <- length(DeptSales.ts) - validLength
salesTrain <- window(DeptSales.ts, end = c(1, trainLength))
salesValid <- window(DeptSales.ts, start = c(1, trainLength + 1))
yrange <- range(DeptSales.ts)
RawData <- ets(salesTrain, model = "ANN")
RawForecast <- forecast(RawData, h = validLength)
summary(RawData)
## ETS(A,N,N)
##
## Call:
## ets(y = salesTrain, model = "ANN")
##
## Smoothing parameters:
## alpha = 0.1206
##
## Initial states:
## l = 59167.5632
##
## sigma: 13229.28
##
## AIC AICc BIC
## 445.5222 447.0222 448.5094
##
## Training set error measures:
## ME RMSE MAE MPE MAPE MASE
## Training set 3046.15 13229.28 10409.41 1.213967 15.54219 3.340095
## ACF1
## Training set -0.05805453
If we let ets() decide which model and which \(\alpha\) value work best for the Dept Store Sales data, it chooses a forecasting model with multiplicative seasonality and no trend, and it also yields a much better MAPE. Since SES corresponds to model = “ANN”, we can conclude the raw series is not suitable for it.
SESRaw1 <- ets(salesTrain, model = "ZZZ")
summary(SESRaw1)
## ETS(M,N,M)
##
## Call:
## ets(y = salesTrain, model = "ZZZ")
##
## Smoothing parameters:
## alpha = 0.9907
## gamma = 1e-04
##
## Initial states:
## l = 59697.8836
## s=1.3045 0.9951 0.8621 0.8383
##
## sigma: 0.0255
##
## AIC AICc BIC
## 367.7322 377.0655 374.7023
##
## Training set error measures:
## ME RMSE MAE MPE MAPE MASE
## Training set 546.4286 1597.297 1338.208 0.819708 2.126239 0.4293945
## ACF1
## Training set -0.234507
To use SES for the Dept Store Sales data, the series would have to be deseasonalized. Even after deseasonalizing with lag-4 differencing and running SES (“ANN”), the MAPE is worse than many we have seen. The plot below shows the actual series with the SES forecasts using the raw and the deseasonalized data.
# Deseasonalize dept store sales with lag-4 differencing, then fit SES ("ANN")
SESDiffSales <- ets(diff(salesTrain, lag = 4), model = "ANN")
SESDiffSalesForecast <- forecast(SESDiffSales, h = validLength, level = 0)
# Undo the differencing: add each forecast back to the value four quarters earlier
realForecastsSES <- vector()
for (i in 1:validLength){
realForecastsSES[i] <- SESDiffSalesForecast$mean[i] + salesTrain[(trainLength+i)-validLength]
}
yrange <- range(DeptSales.ts)
plot(c(1,7), yrange, type = "n", xlab = "Year", ylab = "Dept Store Sales (thousands)", bty = "l", xaxt = "n", yaxt = "n", lwd = 2, main = "Actual Sales and SES Forecast")
axis(1, at = seq(1, 7, 1), labels = format(seq(1, 7, 1)))
axis(2, at = seq(40000, 105000, 10000), labels = format(seq(40, 105, 10)), las = 2)
lines(DeptSales.ts, lwd = 2)
lines(seq(6, 7-1/validLength, 1/validLength), realForecastsSES, col="red", lwd=2, lty=2)
lines(RawForecast$mean, col = "blue", lty = 2, lwd = 2)
legend(1, 100000, c("Actual Sales", "SES lag-4, MAPE = 8.02", "SES Raw Data, MAPE = 15.28"), lty = c(1,2, 2), col = c("black", "red", "blue"), lwd = c(2,2, 2), bty = "n")
**Double exponential smoothing of the raw series:**
Also known as Holt’s linear trend model, double exponential smoothing is not a suitable method to forecast the Dept Store Sales data. Strictly, double exponential smoothing is for data with a trend component and no seasonality, i.e., ets models such as “AAN” or “MMN”. As we learned from running the ets() function, the Dept Store Sales data has multiplicative seasonality and no significant trend; the function identifies “MNM” as the best model. Double exponential smoothing is therefore not an optimal choice for the raw series. If we force the series through the “MMN” (trend, no seasonality) model, we get a forecast with a much higher MAPE.
yrange <- range(DeptSales.ts)
hSales <- ets(salesTrain, model = "MMN")
hSales.pred <- forecast(hSales, h = validLength, level = 0)
plot(c(1,7), yrange, type = "n", xlab = "Year", ylab = "Dept Store Sales (thousands)", bty = "l", xaxt = "n", yaxt = "n", lwd = 2, main = "Dept Store Sales and Holt Linear Trend Model")
axis(1, at = seq(1, 7, 1), labels = format(seq(1, 7, 1)))
axis(2, at = seq(40000, 105000, 10000), labels = format(seq(40, 105, 10)), las = 2)
lines(DeptSales.ts)
lines(hSales.pred$mean, lwd = 2, lty = 2, col = "blue")
legend(1, 100000, c("Actual Sales", "Holt's Linear Model 'MMN' \nMAPE = 15.67145"), lty = c(1,2), col = c("black","blue"), lwd = c(2,2), bty = "n")
**Holt-Winters’ exponential smoothing of the raw series:**
The Holt-Winters method is suitable for the Dept Store Sales data. An extension of double exponential smoothing, it accommodates series that have trend and/or seasonality components.
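For reference, the standard multiplicative Holt-Winters updating equations (with season length \(M = 4\) for quarterly data) are: level \(\ell_t = \alpha \frac{y_t}{s_{t-M}} + (1-\alpha)(\ell_{t-1} + b_{t-1})\), trend \(b_t = \beta(\ell_t - \ell_{t-1}) + (1-\beta)b_{t-1}\), and seasonal index \(s_t = \gamma \frac{y_t}{\ell_t} + (1-\gamma)s_{t-M}\), with the \(k\)-step-ahead forecast \(\hat{y}_{t+k} = (\ell_t + k\,b_t)\,s_{t+k-M}\) (for \(k \le M\)). In the “MNM” model chosen below, the trend term drops out.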
hwinSales <- ets(salesTrain, model = "MNM")
summary(hwinSales)
## ETS(M,N,M)
##
## Call:
## ets(y = salesTrain, model = "MNM")
##
## Smoothing parameters:
## alpha = 0.9907
## gamma = 1e-04
##
## Initial states:
## l = 59697.8836
## s=1.3045 0.9951 0.8621 0.8383
##
## sigma: 0.0255
##
## AIC AICc BIC
## 367.7322 377.0655 374.7023
##
## Training set error measures:
## ME RMSE MAE MPE MAPE MASE
## Training set 546.4286 1597.297 1338.208 0.819708 2.126239 0.4293945
## ACF1
## Training set -0.234507
hwSalesforecast <- forecast(hwinSales, h = validLength)
accuracy(hwSalesforecast, salesValid)
## ME RMSE MAE MPE MAPE MASE
## Training set 546.4286 1597.297 1338.208 0.819708 2.126239 0.4293945
## Test set 5829.9855 6830.684 5829.985 7.023908 7.023908 1.8706836
## ACF1 Theil's U
## Training set -0.2345070 NA
## Test set 0.2139347 0.4780883
hwinSales$par
## alpha gamma l s0 s1
## 9.907060e-01 1.000180e-04 5.969788e+04 1.304452e+00 9.951355e-01
## s2
## 8.621125e-01
yrange <- range(DeptSales.ts +1)
plot(c(1,7), yrange, type = "n", xlab = "Year", ylab = "Dept Store Sales (thousands)", bty = "l", xaxt = "n", yaxt = "n", lwd = 2, main = "Dept Store Sales and Holt Winter's Forecast 'MNM'")
axis(1, at = seq(1, 7, 1), labels = format(seq(1, 7, 1)))
axis(2, at = seq(40000, 105000, 10000), labels = format(seq(40, 105, 10)), las = 2)
lines(DeptSales.ts)
lines(hwinSales$fitted, lwd = 2, col = "blue")
lines(hwSalesforecast$mean, lwd = 2, lty = 2, col = "blue")
legend(1, 100000, c("Actual Sales", "HW fit MAPE = 2.12", "HW Forecast MAPE = 7.02"), lty = c(1,1,2), col = c("black","blue", "blue"), lwd = c(2,2,2), bty = "n")
5b. A forecaster was tasked to generate forecasts for 4 quarters ahead. She therefore partitioned the data so that the last 4 quarters were designated as the validation period. The forecaster approached the forecasting task by using multiplicative Holt-Winters’ exponential smoothing. Specifically, you should call the ets function with the parameters restrict = FALSE, model = “ZZZ” and use the smoothing constants \(\alpha = 0.2\), \(\beta = 0.15\), and \(\gamma = 0.05\).
validLength <- 4
trainLength <- length(DeptSales.ts) - validLength
salesTrain <- window(DeptSales.ts, end = c(1, trainLength))
salesValid <- window(DeptSales.ts, start = c(1, trainLength + 1))
HWfn <- ets(salesTrain, model = "ZZZ", alpha = 0.2, beta = 0.15, gamma = 0.05, restrict=FALSE)
i. Run this method on the data. Request the forecasts on the validation period. (Note that the forecasted values for the validation set will be different than what the book shows.)
HWforecast <- forecast(HWfn, h = validLength)
HWforecast
## Point Forecast Lo 80 Hi 80 Lo 95 Hi 95
## 6 Q1 62115.16 60268.28 63962.04 59290.60 64939.72
## 6 Q2 65371.60 63317.68 67425.53 62230.40 68512.81
## 6 Q3 77076.87 74420.49 79733.24 73014.29 81139.45
## 6 Q4 102937.73 98935.69 106939.77 96817.13 109058.33
ii. Using the forecasts for the validation set that you came up with in i. above, **compute the MAPE values for the forecasts of quarters 21-22.**
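As a reminder, the accuracy() function reports \( \mathrm{MAPE} = \frac{100}{n} \sum_{t=1}^{n} \left| \frac{y_t - \hat{y}_t}{y_t} \right| \); here \(n = 2\) for quarters 21-22.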
library(forecast)
#windows for dept sales
timeS <- time(DeptSales.ts)
nvalidS <- 4
ntrainS <- length(DeptSales.ts) - nvalidS
trainWindowDS <- window(DeptSales.ts, start = timeS[1], end = timeS[ntrainS])
# Validation window restricted to quarters 21-22
validWindowDS <- window(DeptSales.ts, start = timeS[ntrainS +1], end = timeS[ntrainS + nvalidS - 2])
validWindowDS
## Qtr1 Qtr2
## 6 60800 64900
twowindowValid <- forecast(HWfn, h = 2)
twowindowValid
## Point Forecast Lo 80 Hi 80 Lo 95 Hi 95
## 6 Q1 62115.16 60268.28 63962.04 59290.6 64939.72
## 6 Q2 65371.60 63317.68 67425.53 62230.4 68512.81
accuracy(twowindowValid, validWindowDS)
## ME RMSE MAE MPE MAPE MASE
## Training set 609.5633 1447.180 1240.9947 0.9315256 1.985092 0.3982014
## Test set -893.3803 987.939 893.3803 -1.4448745 1.444875 0.2866614
## ACF1 Theil's U
## Training set 0.008503999 NA
## Test set -0.500000000 0.1150254
#This is the forecast for the 4 period validation
HWforecast
## Point Forecast Lo 80 Hi 80 Lo 95 Hi 95
## 6 Q1 62115.16 60268.28 63962.04 59290.60 64939.72
## 6 Q2 65371.60 63317.68 67425.53 62230.40 68512.81
## 6 Q3 77076.87 74420.49 79733.24 73014.29 81139.45
## 6 Q4 102937.73 98935.69 106939.77 96817.13 109058.33
accuracy(HWforecast, salesValid)
## ME RMSE MAE MPE MAPE MASE
## Training set 609.5633 1447.1798 1240.9947 0.9315256 1.9850921 0.3982014
## Test set -366.8394 727.6403 566.4741 -0.6517749 0.8449629 0.1817661
## ACF1 Theil's U
## Training set 0.008503999 NA
## Test set 0.183050115 0.02380343
5c. The fit and the residuals were displayed in the book.
**Please reproduce them with R code. Using all the information from (b) and your generated figures, is this model suitable for forecasting quarters 21 and 22?**
yrange <- range(DeptSales.ts)
plot(c(1,7), yrange, type = "n", xlab = "Year", ylab = "Dept Store Sales (thousands)", bty = "l", xaxt = "n", yaxt = "n", lwd = 2, main = "Actual Sales and Holt Winters Model")
axis(1, at = seq(1, 7, 1), labels = format(seq(1, 7, 1)))
axis(2, at = seq(40000, 105000, 10000), labels = format(seq(40, 105, 10)), las = 2)
lines(DeptSales.ts, bty = "l", lwd = 2)
lines(HWfn$fitted, col = "blue", lty = 3, lwd = 2)
lines(HWforecast$mean, col = "blue", lty = 2, lwd = 2)
legend(1, 100000, c("Actual Sales", "Holt Winters Fitted", "Holt Winters Forecast"), lty = c(1,3,2), col = c("black", "blue", "blue"), lwd = c(2,2,2), bty = "n")
plot(HWfn$residuals, xlab = "Year", ylab = "Dept Store Sales Residuals", bty = "l", xaxt = "n", yaxt = "n", lwd = 2, main = "Dept Store Sales Residuals")
axis(1, at = seq(1, 6, 1), labels = format(seq(1, 6, 1)), pos = 0)
axis(2, at = seq(-0.04, 0.06, .01), labels = format(seq(-4000, 6000, 1000)), las = 2)
5d. Another analyst decided to take a much simpler approach, and instead of using exponential smoothing he used differencing. Use differencing to remove the trend and seasonal pattern. Which order works better: first removing trend and then seasonality or the opposite order?
There is no difference: differencing is a linear operation, so the two orders commute and produce an identical double-differenced series.
par(mfrow = c(1,2))
#double differenced
doubleDiff <- diff(diff(DeptSales.ts, lag = 1), lag = 4)
#plot double differenced
plot(doubleDiff, ylab="Lag-1, then Lag-4", xlab="Year", bty="l", main="Dept Sales Twice-Differenced \nDe-Trended \nDe-Seasonalized")
#double differenced
doubleDiff1 <- diff(diff(DeptSales.ts, lag = 4), lag = 1)
#plot double differenced
plot(doubleDiff1, ylab="Lag-4, then Lag-1", xlab="Year", bty="l", main="Dept Sales Twice-Differenced \nDe-Seasonalized \nDe-Trended")
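As a quick sanity check (a minimal sketch reusing the two series computed above), the two orderings can be compared directly:
# TRUE: lag-1-then-lag-4 equals lag-4-then-lag-1
all.equal(doubleDiff, doubleDiff1)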
Show the progression of time plots as you difference the data and each final series to provide evidence in support of your answer.
#set up a plot with 2 rows and 2 columns
par(mfrow = c(2,2))
#plot the original
plot(DeptSales.ts, ylab = "Dept Store Sales", xlab = "Year", bty = "l", main = "A. Dept Store Sales")
#lag 4 differencing
lag4diff <- diff(DeptSales.ts, lag=4)
#plot the lag-4 difference
plot(lag4diff, ylab="Lag-4", xlab="Year", bty="l", main="B. Lag-4 Differencing \nDe-Seasonalized")
#lag-1 differencing for detrending
lag1diff <- diff(DeptSales.ts, lag=1)
#plot the lag-1 difference for detrending
plot(lag1diff, ylab="Lag-1", xlab="Year", bty="l", main="C. Lag-1 Differencing \nDe-trended")
doubleDiff <- diff(diff(DeptSales.ts, lag = 4), lag = 1)
#plot double differenced
plot(doubleDiff, ylab="Lag-4, then Lag-1", xlab="Year", bty="l", main="D. Dept Sales Twice-Differenced \nDe-Seasonalized \nDe-Trended")
5e. Forecast quarters 21-22 using the average of the double-differenced series from (d). Remember to use only the training period (until quarter 20), and to adjust back for the trend and seasonal pattern.
validLength <- 4
trainLength <- length(DeptSales.ts) - validLength
salesTrain <- window(DeptSales.ts, end = c(1, trainLength))
salesValid <- window(DeptSales.ts, start = c(1, trainLength + 1))
# Use the mean of the double-differenced training series as the forecast
MAwithSeasonTrend <- meanf(diff(diff(salesTrain, lag = 4), lag = 1))
MASeasTrForecast <- forecast(MAwithSeasonTrend, h = validLength)
# Adjust back for the seasonal pattern (add the value from four quarters
# earlier) and for the trend (add the most recent year-over-year change)
realForecasts1 <- vector()
for (i in 1:validLength){
if(i == 1){
realForecasts1[i] <- MASeasTrForecast$mean[i] + salesTrain[(trainLength+i)-validLength] + (salesTrain[trainLength] - salesTrain[trainLength - validLength])
} else{
realForecasts1[i] <- MASeasTrForecast$mean[i] + salesTrain[(trainLength+i)-validLength] + (realForecasts1[i-1] - salesTrain[trainLength+i-1-validLength])
}
}
realForecasts1
## [1] 63982.2 68177.4 80201.6 101467.8
yrange <- range(DeptSales.ts)
plot(c(1,7), yrange, type = "n", xlab = "Year", ylab = "Dept Store Sales (thousands)", bty = "l", xaxt = "n", yaxt = "n", lwd = 2, main = "Actual and Moving Average Forecasts")
axis(1, at = seq(1, 7, 1), labels = format(seq(1, 7, 1)))
axis(2, at = seq(40000, 105000, 10000), labels = format(seq(40, 105, 10)), las = 2)
lines(DeptSales.ts, bty = "l", lwd = 2)
lines(seq(6, 7-1/validLength, 1/validLength), realForecasts1, col="blue", lwd=2, lty=3)
lines(HWforecast$mean, col = "red", lty = 2, lwd = 2)
legend(1, 100000, c("Actual Sales", "Double Differenced, Mean Forecast MAPE = 4.07", "Holt Winter's Forecast, MAPE = 0.8449629"), lty = c(1,3,2), col = c("black", "blue", "red"), lwd = c(2,2,2), bty = "n")
realForecasts1
## [1] 63982.2 68177.4 80201.6 101467.8
HWforecast
## Point Forecast Lo 80 Hi 80 Lo 95 Hi 95
## 6 Q1 62115.16 60268.28 63962.04 59290.60 64939.72
## 6 Q2 65371.60 63317.68 67425.53 62230.40 68512.81
## 6 Q3 77076.87 74420.49 79733.24 73014.29 81139.45
## 6 Q4 102937.73 98935.69 106939.77 96817.13 109058.33
HWforecast$mean
## Qtr1 Qtr2 Qtr3 Qtr4
## 6 62115.16 65371.60 77076.87 102937.73
frange <- range(salesValid)
frange
## [1] 60800 103337
plot(c(1,4), frange, type = "n", xlab = "Quarter", ylab = "Dept Store Sales (thousands)", bty = "l", xaxt = "n", yaxt = "n", lwd = 2, main = "Validation Period: Actual Sales and Forecasts")
axis(1, at = seq(1, 4, 1), labels = format(seq(1, 4, 1)))
axis(2, at = seq(60000, 105000, 10000), labels = format(seq(60, 105, 10)), las = 2)
lines(as.numeric(salesValid), lwd = 2)
lines(realForecasts1, col = "blue", lty = 3, lwd = 2)
lines(as.numeric(HWforecast$mean), col = "red", lty = 2, lwd = 2)
legend(1, 100000, c("Actual Sales", "Double-Differenced Forecast", "Holt-Winters Forecast"), lty = c(1,3,2), col = c("black", "blue", "red"), lwd = c(2,2,2), bty = "n")
5f. Compare the forecasts from (e) to the exponential smoothing forecasts found in (b). Which of the two forecasting methods would you choose? Explain.
Accuracy of the mean forecast on the double-differenced series:
accuracy(realForecasts1, salesValid)
## ME RMSE MAE MPE MAPE ACF1 Theil's U
## Test set -1948.75 2942.41 2883.35 -3.159236 4.063656 -0.07650809 0.1948234
Accuracy of the Holt-Winters forecast:
accuracy(HWforecast, salesValid)
## ME RMSE MAE MPE MAPE MASE
## Training set 609.5633 1447.1798 1240.9947 0.9315256 1.9850921 0.3982014
## Test set -366.8394 727.6403 566.4741 -0.6517749 0.8449629 0.1817661
## ACF1 Theil's U
## Training set 0.008503999 NA
## Test set 0.183050115 0.02380343
Based on the accuracy in the validation period and the training-period fit, one might intuitively choose the smoothing method because its MAPE scores are so good. However, I would balance this against the danger of overfitting: the smoothing method tracks the actual sales very tightly in both the training and validation periods. If I chose the smoothing model, I would continue to evaluate it against future sales and forecasts.
5g. What is an even simpler approach that should be compared as a baseline? Complete that comparison.
A naive forecast should always be run as a baseline. It is very simple, and it may not give the best predictive accuracy, but it provides the benchmark any candidate model must beat. Because the Dept Store Sales series has a seasonal component, we use the snaive() function, which repeats the most recent observation from the same season.
validLength <- 4
trainLength <- length(DeptSales.ts) - validLength
salesTrain <- window(DeptSales.ts, end = c(1, trainLength))
salesValid <- window(DeptSales.ts, start = c(1, trainLength + 1))
NaiveSales <- snaive(salesTrain, h = validLength)
NaiveSales
## Point Forecast Lo 80 Hi 80 Lo 95 Hi 95
## 6 Q1 56405 51434.14 61375.86 48802.72 64007.28
## 6 Q2 60031 55060.14 65001.86 52428.72 67633.28
## 6 Q3 71486 66515.14 76456.86 63883.72 79088.28
## 6 Q4 92183 87212.14 97153.86 84580.72 99785.28
tail(DeptSales, 8)
## Yr_Qtr Sales
## 17 Year 5 Q-5 56405
## 18 Year 5 Q-2 60031
## 19 Year 5 Q-3 71486
## 20 Year 5 Q-4 92183
## 21 Year 6 Q-1 60800
## 22 Year 6 Q-2 64900
## 23 Year 6 Q-3 76997
## 24 Year 6 Q-4 103337
yrange <- range(DeptSales.ts +1)
plot(c(1,7), yrange, type = "n", xlab = "Year", ylab = "Dept Store Sales (thousands)", bty = "l", xaxt = "n", yaxt = "n", lwd = 2, main = "Dept Store Sales and Naive Modeling")
axis(1, at = seq(1, 7, 1), labels = format(seq(1, 7, 1)))
axis(2, at = seq(40000, 105000, 10000), labels = format(seq(40, 105, 10)), las = 2)
lines(DeptSales.ts)
lines(NaiveSales$fitted, lwd = 2, col = "blue")
lines(NaiveSales$mean, lwd = 2, lty = 2, col = "blue")
legend(1, 100000, c("Actual Sales", "Naive Fit MAPE = 4.74", "Naive Forecast MAPE = 8.17"), lty = c(1,1,2), col = c("black","blue", "blue"), lwd = c(2,2,2), bty = "n")
accuracy(NaiveSales, salesValid)
## ME RMSE MAE MPE MAPE MASE
## Training set 2925.25 3878.784 3116.50 4.344358 4.737739 1.000000
## Test set 6482.25 7032.176 6482.25 8.170540 8.170540 2.079978
## ACF1 Theil's U
## Training set 0.54855812 NA
## Test set 0.01334402 0.4705349
Forecasting Australian Wine Sales: Figure 5.14 shows time plots of monthly sales of six types of Australian wines (red, rose, sweet white, dry white, sparkling, and fortified) for 1980-1994. Data available in AustralianWines.xls. The units are thousands of liters. You are hired to obtain short-term forecasts (2-3 months ahead) for each of the six series, and this task will be repeated every month.
wine <- read.csv("AustralianWines.csv")
head(wine)
## Month Fortified Red Rose sparkling Sweet.white Dry.white
## 1 Jan-80 2585 464 112 1686 85 1954
## 2 Feb-80 3368 675 118 1591 89 2302
## 3 Mar-80 3210 703 129 2304 109 3054
## 4 Apr-80 3111 887 99 1712 95 2414
## 5 May-80 3756 1139 116 1471 91 2226
## 6 Jun-80 4216 1077 168 1377 95 2725
dim(wine)
## [1] 188 7
names(wine)
## [1] "Month" "Fortified" "Red" "Rose" "sparkling"
## [6] "Sweet.white" "Dry.white"
(a) Which smoothing method would you choose if you had to choose the same method for forecasting all series? Why?
If I had to choose a single smoothing method for short-term forecasts of all the wine styles, I would choose Holt-Winters. Of the smoothing methods we have covered, it is the only one suitable for series with trend and/or seasonality components.
I ran the ets() function with model = “ZZZ” and let it choose the \(\alpha\), \(\beta\), and \(\gamma\) values; the results are shown at the end of the report. Every wine style has a seasonal component, and all but Sparkling and Sweet white also have a trend component. According to our text, the Holt-Winters method is very versatile with these components: “Being an adaptive method, Holt-Winter’s exponential smoothing allows the level, trend, and seasonality patterns to change over time. The three components are estimated and updated as new information arrives.” This is important given that the wine styles all have varying seasonal patterns that might otherwise not be recognized using simple smoothing with differencing. A compact way to run this check over all six series is sketched below.
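The sketch below reconstructs what I ran (an assumption: it uses the wine data frame loaded earlier and holds out the last 12 months of each series, matching the individual fits at the end of the report):
# Fit ets(model = "ZZZ") to each wine series and report the chosen model
wineCols <- c("Fortified", "Red", "Rose", "sparkling", "Sweet.white", "Dry.white")
fits <- lapply(wineCols, function(col) {
  y <- ts(wine[[col]], start = c(1980, 1), frequency = 12)
  ets(window(y, end = c(1980, length(y) - 12)), model = "ZZZ")
})
sapply(fits, function(f) f$method)  # e.g. "ETS(M,A,M)" for Fortified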
(b) Fortified wine has the largest market share of the six types of wine. You are asked to focus on fortified wine sales alone and produce as accurate a forecast as possible for the next two months.
Start by partitioning the data using the period until Dec-1993 as the training period, then apply Holt-Winters’ exponential smoothing (with multiplicative seasonality) to sales.
Fortified.ts <- ts(wine$Fortified, start = c(1980, 1), frequency = 12)
FortifiedvalidLength <- 12
FortifiedtrainLength <- length(Fortified.ts) - FortifiedvalidLength
FortifiedsalesTrain <- window(Fortified.ts, end = c(1980, FortifiedtrainLength))
FortifiedsalesValid <- window(Fortified.ts, start = c(1980, FortifiedtrainLength + 1))
library(forecast)
FortifiedSmooth <- ets(FortifiedsalesTrain, model = "ZZM")
FortifiedSmooth
## ETS(M,A,M)
##
## Call:
## ets(y = FortifiedsalesTrain, model = "ZZM")
##
## Smoothing parameters:
## alpha = 0.0496
## beta = 9e-04
## gamma = 1e-04
##
## Initial states:
## l = 4041.6833
## b = -7.1612
## s=1.1334 1.0383 0.8836 0.9461 1.2608 1.3697
## 1.1428 1.1137 0.9614 0.8588 0.7053 0.5862
##
## sigma: 0.0884
##
## AIC AICc BIC
## 2897.314 2901.188 2951.212
FortifiedForecast <- forecast(FortifiedSmooth, h = FortifiedvalidLength)
accuracy(FortifiedForecast, FortifiedsalesValid)
## ME RMSE MAE MPE MAPE MASE
## Training set -19.28044 289.5123 225.3113 -1.195219 7.36695 0.8230731
## Test set 328.28969 389.1661 328.2897 13.003315 13.00332 NA
## ACF1 Theil's U
## Training set 0.0638394 NA
## Test set -0.7537966 0.730869
ForecastSmooth2Mos <- ets(Fortified.ts, model = "ZZM")
ForecastZZM2Mos <- forecast(ForecastSmooth2Mos, h = 2)
ForecastZZM2Mos
## Point Forecast Lo 80 Hi 80 Lo 95 Hi 95
## Jan 1995 1252.988 1106.713 1399.263 1029.280 1476.696
## Feb 1995 1499.512 1324.309 1674.715 1231.563 1767.462
(c) Create a plot for the residuals from the Holt-Winter’s exponential smoothing.
plot(FortifiedSmooth$residuals, xlab = "Year", ylab = "Fortified Wine Sales Residuals", bty = "l", xaxt = "n", yaxt = "n", lwd = 2, main = "Fortified Wine Sales Residuals")
axis(1, at = seq(1980, 1995, 1), labels = format(seq(1980, 1995, 1)))
axis(2, at = seq(-0.4, 0.4, .1), labels = format(seq(-4000, 4000, 1000)), las = 2)
i. Based on this plot, which of the following statements are reasonable? Refer to the residuals plot for this question.
Decembers (month 12) are not captured well by the model. This is difficult to determine from the full residual series alone; we would need to zoom in and look at each month separately (see the month-by-month sketch after this list).
There is a strong correlation between sales on the same calendar month. This is not a reasonable statement to make from this plot, because the residual plot does not show correlation; to look for such a relationship we would need something like an autocorrelation or regression analysis.
The model does not capture the seasonality well. The residual plot is not intended to reveal seasonality directly; the time series plot and plots that decompose the series into its components would be needed to evaluate this.
We should first deseasonalize the data and then apply Holt-Winters’ exponential smoothing. This is not a reasonable statement because the Holt-Winters model already accommodates seasonality, however complex the seasonal cycles. Deseasonalized data is only necessary for models that assume stationarity.
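To examine the first statement about the Decembers, a quick sketch (reusing the FortifiedSmooth fit from above) groups the residuals by calendar month:
# One box per calendar month; a box centered away from zero suggests
# that month is systematically over- or under-forecast
boxplot(FortifiedSmooth$residuals ~ cycle(FortifiedSmooth$residuals), xlab = "Month", ylab = "Residual")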
ii. How can you handle the above effect with exponential smoothing?
Large or frequent residuals can be minimized with a well-tuned forecasting model. Exponential smoothing models are versatile because they can fit a variety of components, namely trend and seasonality, and their smoothing constants control how quickly each component adapts to new data.
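One hedged illustration (a sketch, not a definitive fix; it reuses FortifiedsalesTrain from above): raising \(\alpha\) makes the level react faster to recent observations, which can reduce systematic month-specific errors if the pattern is drifting. The appendix compares several \(\alpha\) values this way.
# Same model family as before, but with a faster-adapting level
FortifiedFast <- ets(FortifiedsalesTrain, model = "ZZM", alpha = 0.2)
accuracy(FortifiedFast)  # training-set error measures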
After performing any forecast, it is important to explore the errors and residuals. For this Fortified Wine series, the residuals look approximately normally distributed; we can see this in the histogram and the normal Q-Q plot.
# Plot the histogram and store it to use later
myhist <- hist(FortifiedSmooth$residuals, ylab="Frequency", xlab="Forecast Error", main="", bty="l")
# Use stored hist object to help set up density curve
multiplier <- myhist$counts / myhist$density
# Need to ignore NA from 1985
mydensity <- density(FortifiedSmooth$residuals, na.rm=TRUE)
mydensity$y <- mydensity$y * multiplier[1]
# Add the density curve
lines(mydensity)
length(FortifiedSmooth$residuals)
## [1] 176
qqnorm(FortifiedSmooth$residuals[1:176])
qqline(FortifiedSmooth$residuals[1:176])
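A formal test can complement the visual check (a minimal sketch using the same residuals):
# Shapiro-Wilk normality test; a large p-value is consistent with
# approximately normal residuals
shapiro.test(as.numeric(FortifiedSmooth$residuals))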
plot(Fortified.ts, xlab = "Year", ylab = "Sales (thousands)", col = "black", main = "Fortified Wine Sales", lwd = 2)
lines(FortifiedSmooth$fitted, col = "blue", lty = 2, lwd = 2)
lines(FortifiedForecast$mean, lty = 3, lwd = 3, col = "blue")
legend(1980, 2400, c("Actual Sales", "HW Fitted MAPE = 7.37", "Forecast MAPE = 13"), lty = c(1,2,3), col = c("black", "blue", "blue"), lwd = c(2,2,3), bty = "n")
abline(v=1994)
arrows(1980, 5000, 1994, 5000, code=3, length=0.1)
text(1985, 5300, "Training")
abline(v=1995)
arrows(1994, 5000, 1995, 5000, code=3, length=0.1)
text(1994.5, 5300, "Val")
Information pertaining to Q.5
Linear Trend Summary
summary(DeptSales.Linear)
##
## Call:
## tslm(formula = DeptSales.ts ~ trend)
##
## Residuals:
## Min 1Q Median 3Q Max
## -13897 -10094 -2577 6421 25132
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 50141 5372 9.334 4.17e-09 ***
## trend 1169 376 3.110 0.0051 **
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 12750 on 22 degrees of freedom
## Multiple R-squared: 0.3054, Adjusted R-squared: 0.2738
## F-statistic: 9.674 on 1 and 22 DF, p-value: 0.005101
2nd Order Polynomial Trend Summary
summary(DeptSales.Poly)
##
## Call:
## tslm(formula = DeptSales.ts ~ poly(trend, 2))
##
## Residuals:
## Min 1Q Median 3Q Max
## -15444 -8297 -4950 5517 20416
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 64757 2597 24.937 < 2e-16 ***
## poly(trend, 2)1 39654 12722 3.117 0.00522 **
## poly(trend, 2)2 13313 12722 1.046 0.30724
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 12720 on 21 degrees of freedom
## Multiple R-squared: 0.3398, Adjusted R-squared: 0.277
## F-statistic: 5.405 on 2 and 21 DF, p-value: 0.01277
5a, bullet 2
Point Forecast Deseasonalized Moving Average
MASeasForecast
## Point Forecast Lo 80 Hi 80 Lo 95 Hi 95
## 6 Q1 5872.463 4909.405 6835.520 4399.593 7345.332
## 6 Q2 5872.463 4510.562 7234.364 3789.615 7955.311
## 6 Q3 5872.463 4204.509 7540.416 3321.548 8423.377
## 6 Q4 5872.463 3946.492 7798.433 2926.945 8817.981
Accuracy of Deseasonalized Moving Average Forecast for Dept. Store Sales
accuracy(realForecasts, salesValid)
## ME RMSE MAE MPE MAPE ACF1 Theil's U
## Test set 609.7874 2793.525 2030.981 0.1663322 2.38916 0.01334402 0.1790003
Point Forecast Deseasonalized and Detrended Moving Average
MASeasTrForecast
## Point Forecast Lo 80 Hi 80 Lo 95 Hi 95
## 6 Q1 569.2 -2116.935 3255.335 -3714.114 4852.514
## 6 Q2 569.2 -2116.935 3255.335 -3714.114 4852.514
## 6 Q3 569.2 -2116.935 3255.335 -3714.114 4852.514
## 6 Q4 569.2 -2116.935 3255.335 -3714.114 4852.514
Accuracy of Deseasonalized and Detrended Moving Average Forecast for Dept. Store Sales
accuracy(realForecasts1, salesValid)
## ME RMSE MAE MPE MAPE ACF1 Theil's U
## Test set -1948.75 2942.41 2883.35 -3.159236 4.063656 -0.07650809 0.1948234
5a, bullet 3
Accuracy of the deseasonalized (lag-4) SES “ANN” forecast for the Dept Store Sales data.
accuracy(realForecastsSES, salesValid)
## ME RMSE MAE MPE MAPE ACF1 Theil's U
## Test set 6374.794 6933.249 6374.794 8.024077 8.024077 0.01334402 0.4639576
Accuracy of the SES “ANN” forecast on the raw Dept Store Sales data.
accuracy(RawForecast, salesValid)
## ME RMSE MAE MPE MAPE MASE
## Training set 3046.150 13229.28 10409.41 1.213967 15.54219 3.340095
## Test set 9992.464 19370.85 13658.50 9.338171 15.28388 4.382641
## ACF1 Theil's U
## Training set -0.05805453 NA
## Test set 0.17229544 1.278413
5a, bullet 4
Holt’s linear trend model (double exponential smoothing) is not a suitable model for forecasting the Dept. Store Sales data because it is not built to recognize seasonality, and this particular data set has a multiplicative seasonal component.
accuracy(hSales.pred, salesValid)
## ME RMSE MAE MPE MAPE MASE
## Training set -628.9547 11502.29 9730.394 -4.2059461 15.31484 3.122218
## Test set 2386.1109 15706.02 12638.761 -0.7391866 15.67145 4.055434
## ACF1 Theil's U
## Training set -0.1328986 NA
## Test set 0.1630194 0.9830232
summary(hSales)
## ETS(M,M,N)
##
## Call:
## ets(y = salesTrain, model = "MMN")
##
## Smoothing parameters:
## alpha = 1e-04
## beta = 1e-04
##
## Initial states:
## l = 54381.329
## b = 1.0139
##
## sigma: 0.1805
##
## AIC AICc BIC
## 443.3564 447.6421 448.3350
##
## Training set error measures:
## ME RMSE MAE MPE MAPE MASE
## Training set -628.9547 11502.29 9730.394 -4.205946 15.31484 3.122218
## ACF1
## Training set -0.1328986
Fortified
Fortified.ts <- ts(wine$Fortified, start = c(1980, 1), frequency = 12)
FortifiedvalidLength <- 12
FortifiedtrainLength <- length(Fortified.ts) - FortifiedvalidLength
FortifiedsalesTrain <- window(Fortified.ts, end = c(1980, FortifiedtrainLength))
FortifiedsalesValid <- window(Fortified.ts, start = c(1980, FortifiedtrainLength + 1))
plot(Fortified.ts, xlab = "Year", ylab = "Sales (thousands)", col = "blue", main = "Fortified Wine Sales")
Fortified <- ets(FortifiedsalesTrain, model = "ZZZ")
Fortified
## ETS(M,A,M)
##
## Call:
## ets(y = FortifiedsalesTrain, model = "ZZZ")
##
## Smoothing parameters:
## alpha = 0.0496
## beta = 9e-04
## gamma = 1e-04
##
## Initial states:
## l = 4041.6833
## b = -7.1612
## s=1.1334 1.0383 0.8836 0.9461 1.2608 1.3697
## 1.1428 1.1137 0.9614 0.8588 0.7053 0.5862
##
## sigma: 0.0884
##
## AIC AICc BIC
## 2897.314 2901.188 2951.212
Red
Red.ts <- ts(wine$Red, start = c(1980, 1), frequency = 12)
RedvalidLength <- 12
RedtrainLength <- length(Red.ts) - RedvalidLength
RedsalesTrain <- window(Red.ts, end = c(1980, RedtrainLength))
RedsalesValid <- window(Red.ts, start = c(1980, RedtrainLength + 1))
plot(Red.ts, xlab = "Year", ylab = "Sales (thousands)", col = "blue", main = "Red Wine Sales")
Red <- ets(RedsalesTrain, model = "ZZZ")
Red
## ETS(M,A,M)
##
## Call:
## ets(y = RedsalesTrain, model = "ZZZ")
##
## Smoothing parameters:
## alpha = 0.1449
## beta = 1e-04
## gamma = 1e-04
##
## Initial states:
## l = 932.1379
## b = 6.938
## s=1.1144 1.0194 0.9387 1.0434 1.361 1.343
## 1.0962 1.0763 0.9302 0.8465 0.7041 0.5267
##
## sigma: 0.1064
##
## AIC AICc BIC
## 2728.253 2732.127 2782.152
Rose
Rose.ts <- ts(wine$Rose, start = c(1980, 1), frequency = 12)
RosevalidLength <- 12
RosetrainLength <- length(Rose.ts) - RosevalidLength
RosesalesTrain <- window(Rose.ts, end = c(1980, RosetrainLength))
RosesalesValid <- window(Rose.ts, start = c(1980, RosetrainLength + 1))
plot(Rose.ts, xlab = "Year", ylab = "Sales (thousands)", col = "blue", main = "Rose Sales")
Rose <- ets(RosesalesTrain, model = "ZZZ")
Rose
## ETS(A,A,A)
##
## Call:
## ets(y = RosesalesTrain, model = "ZZZ")
##
## Smoothing parameters:
## alpha = 0.0019
## beta = 0.0019
## gamma = 1e-04
##
## Initial states:
## l = 24.9058
## b = 0.3992
## s=-14.6118 -17.8191 -7.9477 -2.2117 -7.6001 7.2329
## 7.541 -0.5594 24.8803 1.4364 2.4008 7.2584
##
## sigma: 25.0584
##
## AIC AICc BIC
## 2077.871 2081.745 2131.770
Sparkling
Sparkling.ts <- ts(wine$sparkling, start = c(1980, 1), frequency = 12)
SparklingvalidLength <- 12
SparklingtrainLength <- length(Sparkling.ts) - SparklingvalidLength
SparklingsalesTrain <- window(Sparkling.ts, end = c(1980, SparklingtrainLength))
SparklingsalesValid <- window(Sparkling.ts, start = c(1980, SparklingtrainLength + 1))
plot(Sparkling.ts, xlab = "Year", ylab = "Sales (thousands)", col = "blue", main = "Sparkling Sales")
Sparkling <- ets(SparklingsalesTrain, model = "ZZZ")
Sparkling
## ETS(M,N,M)
##
## Call:
## ets(y = SparklingsalesTrain, model = "ZZZ")
##
## Smoothing parameters:
## alpha = 0.0636
## gamma = 1e-04
##
## Initial states:
## l = 2321.7212
## s=2.3704 1.6783 1.2386 0.8789 0.9682 0.7987
## 0.5945 0.6527 0.7302 0.7709 0.6554 0.6632
##
## sigma: 0.1413
##
## AIC AICc BIC
## 2954.800 2957.800 3002.357
Sweet
Sweet.white.ts <- ts(wine$Sweet.white, start = c(1980, 1), frequency = 12)
SweetvalidLength <- 12
SweettrainLength <- length(Sweet.white.ts) - SweetvalidLength
SweetsalesTrain <- window(Sweet.white.ts, end = c(1980, SweettrainLength))
SweetsalesValid <- window(Sweet.white.ts, start = c(1980, SweettrainLength + 1))
plot(Sweet.white.ts, xlab = "Year", ylab = "Sales (thousands)", col = "blue", main = "Sweet Wine Sales")
Sweet <- ets(SweetsalesTrain, model = "ZZZ")
Sweet
## ETS(M,N,M)
##
## Call:
## ets(y = SweetsalesTrain, model = "ZZZ")
##
## Smoothing parameters:
## alpha = 0.4288
## gamma = 1e-04
##
## Initial states:
## l = 123.0858
## s=1.4785 1.2792 1.0335 1.012 1.0886 1.0413
## 0.7662 0.7838 0.9097 0.8779 0.8651 0.8642
##
## sigma: 0.1826
##
## AIC AICc BIC
## 2236.288 2239.288 2283.845
Dry
Dry.white.ts <- ts(wine$Dry.white, start = c(1980, 1), frequency = 12)
DryvalidLength <- 12
DrytrainLength <- length(Dry.white.ts) - DryvalidLength
DrysalesTrain <- window(Dry.white.ts, end = c(1980, DrytrainLength))
DrysalesValid <- window(Dry.white.ts, start = c(1980, DrytrainLength + 1))
plot(Dry.white.ts, xlab = "Year", ylab = "Sales (thousands)", col = "blue", main = "Dry Wine Sales")
Dry <- ets(DrysalesTrain, model = "ZZZ")
Dry
## ETS(M,A,A)
##
## Call:
## ets(y = DrysalesTrain, model = "ZZZ")
##
## Smoothing parameters:
## alpha = 0.0599
## beta = 1e-04
## gamma = 1e-04
##
## Initial states:
## l = 2936.1051
## b = 4.5617
## s=1242.073 829.3723 109.0329 -94.204 391.6654 88.7661
## -383.1171 -262.2344 -385.946 -159.6419 -449.8025 -925.9641
##
## sigma: 0.0998
##
## AIC AICc BIC
## 2968.520 2972.394 3022.418
Some things I looked at to answer question i above, out of curiosity.
Comparison of Fortified Wine Residuals from models with different alpha values
## ETS(M,A,M)
##
## Call:
## ets(y = FortifiedsalesTrain, model = "ZZM", alpha = 0.01)
##
## Smoothing parameters:
## alpha = 0.01
## beta = 0.001
## gamma = 1e-04
##
## Initial states:
## l = 3999.9022
## b = -12.6402
## s=1.1299 1.0399 0.8838 0.9487 1.2602 1.3725
## 1.1439 1.1118 0.9603 0.858 0.7048 0.5863
##
## sigma: 0.0887
##
## AIC AICc BIC
## 2891.799 2895.221 2942.527
## ME RMSE MAE MPE MAPE MASE
## Training set 22.27415 283.9810 219.2211 0.09590192 7.100508 0.8008257
## Test set 300.90845 364.6416 300.9084 11.87303212 11.873032 NA
## ACF1 Theil's U
## Training set 0.06948922 NA
## Test set -0.75241127 0.6889046
## Point Forecast Lo 80 Hi 80 Lo 95 Hi 95
## Jan 1995 1252.988 1106.713 1399.263 1029.280 1476.696
## Feb 1995 1499.512 1324.309 1674.715 1231.563 1767.462
## ETS(M,A,M)
##
## Call:
## ets(y = FortifiedsalesTrain, model = "ZZM", alpha = 0.2)
##
## Smoothing parameters:
## alpha = 0.2
## beta = 0.0011
## gamma = 1e-04
##
## Initial states:
## l = 4039.8137
## b = -5.3673
## s=1.1192 1.0468 0.8893 0.96 1.2638 1.3653
## 1.1326 1.1144 0.9631 0.8574 0.7012 0.5869
##
## sigma: 0.092
##
## AIC AICc BIC
## 2909.647 2913.069 2960.375
## ME RMSE MAE MPE MAPE MASE
## Training set -17.57194 296.5439 230.0785 -1.303048 7.588995 0.8404881
## Test set 327.08136 380.8848 327.0814 12.951081 12.951081 NA
## ACF1 Theil's U
## Training set -0.006807277 NA
## Test set -0.758065433 0.7205477
## Point Forecast Lo 80 Hi 80 Lo 95 Hi 95
## Jan 1995 1252.988 1106.713 1399.263 1029.280 1476.696
## Feb 1995 1499.512 1324.309 1674.715 1231.563 1767.462
## ETS(M,N,M)
##
## Call:
## ets(y = FortifiedsalesTrain, model = "ZZM", alpha = 0.6)
##
## Smoothing parameters:
## alpha = 0.6
## gamma = 1e-04
##
## Initial states:
## l = 4209.5721
## s=1.1196 1.0385 0.8869 0.9488 1.2687 1.3684
## 1.1495 1.1255 0.9632 0.8586 0.6923 0.5799
##
## sigma: 0.1023
##
## AIC AICc BIC
## 2943.293 2945.902 2987.680
## ME RMSE MAE MPE MAPE MASE
## Training set -22.83443 325.7266 255.8295 -1.505787 8.554785 0.9345579
## Test set 504.23155 544.7288 504.2315 20.305433 20.305433 NA
## ACF1 Theil's U
## Training set -0.200693 NA
## Test set -0.772165 0.9865081
## Point Forecast Lo 80 Hi 80 Lo 95 Hi 95
## Jan 1995 1252.988 1106.713 1399.263 1029.280 1476.696
## Feb 1995 1499.512 1324.309 1674.715 1231.563 1767.462
## ETS(M,N,M)
##
## Call:
## ets(y = FortifiedsalesTrain, model = "ZZM", alpha = 0.98)
##
## Smoothing parameters:
## alpha = 0.98
## gamma = 1e-04
##
## Initial states:
## l = 4209.2442
## s=1.1154 1.0358 0.8828 0.9507 1.2616 1.3678
## 1.156 1.132 0.9624 0.8503 0.6962 0.589
##
## sigma: 0.1173
##
## AIC AICc BIC
## 2989.718 2992.327 3034.105
## ME RMSE MAE MPE MAPE MASE
## Training set -13.17981 366.9950 288.7133 -1.217969 9.807884 1.054684
## Test set 609.76039 643.3313 609.7604 24.691127 24.691127 NA
## ACF1 Theil's U
## Training set -0.4168382 NA
## Test set -0.7661006 1.156751
## Point Forecast Lo 80 Hi 80 Lo 95 Hi 95
## Jan 1995 1252.988 1106.713 1399.263 1029.280 1476.696
## Feb 1995 1499.512 1324.309 1674.715 1231.563 1767.462
Do residuals behave the same for each month of the year?
par(oma = c(0, 0, 0, 2))
xrange <- c(1,18)
yrange <- range(FortifiedSmooth$residuals)
plot(xrange, yrange, type="n", xlab="Year", ylab="Residuals", bty="n", las=1, xaxt = "n", yaxt = "n")
colors <- rainbow(15)
axis(1, at=seq(1,15,1), labels=format(seq(1,15,1)))
axis(2, at = seq(-0.2, 0.4, 0.05), labels = format(seq(-2000, 4000, 500)))
for (i in 1:12) {
currentQu <- subset(FortifiedSmooth$residuals, cycle(FortifiedSmooth$residuals)==i)
lines(seq(1, 1 +length(currentQu)-1,1), currentQu, type="l", lwd=1, col=colors[i])
}
title("Fortified Wine Sales Broken out by Month")
legend(16, .3, legend = 1:12, cex = 0.8, col = colors, lty = 1, title = "Month", xpd = NA)
Looking at Fortified Wine Residuals one year at a time.
# One barplot of residuals per year, alternating blue and red
for (start in seq(1, 169, by = 12)) {
end <- min(start + 11, length(FortifiedSmooth$residuals))
barplot(FortifiedSmooth$residuals[start:end], col = ifelse(((start - 1) / 12) %% 2 == 0, "blue", "red"))
}
Tracking Fortified Wine Residuals by month
ggseasonplot(FortifiedSmooth$residuals)