Relationship between Moving Average and Exponential Smoothing: Assume that we apply a moving average to a series, using a very short window span. If we wanted to achieve an equivalent result using simple exponential smoothing, what value should the smoothing constant take?
In order to get an equivalent result between a moving average with a short window span and a simple exponential smoothing forecast, the SES needs an \(\alpha\) value very close to 1. A useful rule of thumb equates a trailing moving average of window width \(w\) with SES using \(\alpha = 2/(w+1)\), since both then base the forecast on data of the same average age; as the window shrinks, \(\alpha\) approaches 1. A high \(\alpha\) value translates to a fast-learning model: the most recent values have the most impact on the forecast and older values matter less.
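The short sketch below (no data required; the window widths are just illustrative) evaluates this rule of thumb for a few small windows:
# Rule-of-thumb alpha that equates a trailing moving average of width w
# with simple exponential smoothing (same average age of data used)
w <- c(1, 2, 3, 5, 10)
data.frame(window = w, alpha = round(2 / (w + 1), 2))
# window = 1 gives alpha = 1; short windows keep alpha close to 1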
The pair of plots below shows the trailing moving average with a window of 2 in red. The blue line in each plot is the simple exponential smoothing forecast, using a very high and a very low \(\alpha\) value respectively, superimposed on the MA line. The left plot shows less red, demonstrating that the SES fit with the higher smoothing constant mirrors the moving average more closely.
library(forecast)  # ets(), forecast()
library(zoo)       # rollmean()
nValid <- 12
nTrain <- length(ridership.ts) - nValid
train.ts <- window(ridership.ts, start = c(1991, 1), end = c(1991, nTrain))
valid.ts <- window(ridership.ts, start = c(1991, nTrain + 1), end = c(1991, nTrain + nValid))
ma.trailing <- rollmean(train.ts, k = 2, align = "right")
forecast.ma.trailing <- forecast(ma.trailing, h = nValid, level = 0)
# Fix a high and a low smoothing constant for comparison
ses.ridership <- ets(train.ts, model = "ZZZ", alpha = .98)
ses.pred <- forecast(ses.ridership, h = nValid, level = 0)
ses.ridershipsmallAlpha <- ets(train.ts, model = "ZZZ", alpha = .02)
ses.predSmallAlpha <- forecast(ses.ridershipsmallAlpha, h = nValid, level = 0)
par(mfrow = c(1,2))
yrange <- range(ridership.ts)
plot(c(1991, 2004), yrange, type = "n", xlab = "Year", ylab = "Ridership", bty = "l", xaxt = "n", yaxt = "n", main = "Alpha = 0.98")
axis(1, at = seq(1991, 2004,2), labels = format(seq(1991, 2004, 2)))
# Format the y axis directly; las = 2 turns the labels perpendicular to the axis
axis(2, at = seq(1000, 3000, 300), labels = format(seq(1000, 3000, 300)), las = 2)
lines(forecast.ma.trailing$fitted , col = "red")
lines(ses.pred$fitted, col = "blue")
#lines(ses.predSmallAlpha$fitted, col = "green")
legend(1991, 2300, c(" ", "trailing moving average", "SES alpha = 0.98"), lty = c(1, 1, 1), col = c("white","red", "blue"), bty = "n")
plot(c(1991, 2004), yrange, type = "n", xlab = "Year", ylab = "Ridership", bty = "l", xaxt = "n", yaxt = "n", main = "Alpha = 0.02")
axis(1, at = seq(1991, 2004, 2), labels = format(seq(1991, 2004, 2)))
axis(2, at = seq(1000, 3000, 300), labels = format(seq(1000, 3000, 300)), las = 2)
lines(forecast.ma.trailing$fitted , col = "red")
#lines(ses.pred$fitted, col = "yellow")
lines(ses.predSmallAlpha$fitted, col = "blue")
legend(1991, 2300, c(" ", "trailing moving average", "SES alpha = 0.02"), lty = c(1, 1, 1), col = c("white", "red", "blue"), bty = "n")
Forecasting Department Store Sales
The best forecasting model can only be determined by considering the components of the Department Store Sales series, namely error, level, trend, and seasonality. While all series have error and level, not all have the latter two components. The first step in settling on the best model is to evaluate the series for trend and seasonality.
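Before fitting anything, a quick classical decomposition (a minimal sketch, assuming DeptSales.ts is already a quarterly ts object) gives a first look at both components:
# Split the series into trend, seasonal, and remainder components
plot(decompose(DeptSales.ts))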
Trend
validLength <- 4
trainLength <- length(DeptSales.ts) - validLength
salesTrain <- window(DeptSales.ts, end = c(1, trainLength))
salesValid <- window(DeptSales.ts, start = c(1, trainLength + 1))
#linear trend line
DeptSales.Linear <- tslm(DeptSales.ts ~ trend)
sales.lm.pred <- forecast(DeptSales.Linear, h = validLength, level = 0)
#quadratic trend line
DeptSales.quad <- tslm(DeptSales.ts ~ trend + I(trend^2))
sales.quad <- forecast(DeptSales.quad, h = validLength, level = 0)
DeptSales.Poly <- tslm(DeptSales.ts ~ poly(trend,2))
yrange <- range(DeptSales.ts)
plot(c(1,7), yrange, type = "n", xlab = "Year", ylab = "Dept Store Sales (thousands)", bty = "l", xaxt = "n", yaxt = "n", lwd = 2, main = "Trends in Department Store Sales \nQuarterly Data")
axis(1, at = seq(1, 7, 1), labels = format(seq(1, 7, 1)))
axis(2, at = seq(40000, 105000, 10000), labels = format(seq(40, 105, 10)), las = 2)
lines(DeptSales.ts, lwd = 2)
lines(DeptSales.Linear$fitted , col = "blue", lwd = 2)
lines(DeptSales.Poly$fitted, col = "red", lwd = 2)
legend(1, 100000, c("Dept Store Sales", "Linear Trend, R-sq = 0.3054", "2nd Order Poly Trend, R-sq = 0.3398"), lty = c(1,1,1), col = c("black", "blue", "red"), lwd = c(2,2,2), bty = "n")
Trend in the same series with the data aggregated annually
The second-order polynomial trend line closely follows the annually aggregated data.
# Aggregate by year and plot it
SalesYearly <- aggregate(DeptSales.ts, nfrequency=1, FUN=sum)
YRange <- range(SalesYearly)
plot(c(1,6), YRange, type = "n", xlab = "Year", ylab = "Dept Store Sales (thousands)", bty = "l", xaxt = "n", yaxt = "n", lwd = 2, main = "Trends in Department Store Sales \nAggregated Data")
axis(1, at = seq(1, 7, 1), labels = format(seq(1, 7, 1)))
axis(2, at = seq(0, 500000, 10000), labels = format(seq(0, 500, 10)), las = 2)
lines(SalesYearly, bty = "l", lwd = 2, lty = 1)
YrSales.Linear <- tslm(SalesYearly ~ trend)
YrSales.Poly <- tslm(SalesYearly ~ poly(trend,2))
lines(YrSales.Linear$fitted , col = "blue", lty = 1, lwd = 2)
lines(YrSales.Poly$fitted, col = "Red", lwd = 2, lty = 3)
legend(1, 300000, c("Dept Store Sales Aggregated", "Linear Trend, R-sq = 0.3054", "2nd Order Poly Trend, R-sq = 0.3398"), lty = c(1,1,3), col = c("black", "blue", "red"), lwd = c(2,2,2), bty = "n")
Looking at how each quarter behaves over time, there appears to be a trend in each plot. However, the R-squared values for the trend lines are only about 0.30, which makes the presence of this component ambiguous. Below is another plot where each line represents a quarter. Each quarter, and each year, shows growth, which indicates some trend, perhaps weak, in the Department Store Sales series. Without firm evidence of trend, I would evaluate candidate forecast models both with and without an adjustment for trend.
par(oma = c(0, 0, 0, 2))
xrange <- c(1,6)
yrange <- range(DeptSales.ts/1000)
plot(xrange, yrange, type="n", xlab="Year", ylab="Quarterly Sales", bty="l", las=1)
colors <- c("violet", "red", "green", "blue")
linetype <- c(1,1,1,1)
plotchar <- c(1:4)
axis(1, at=seq(1,6,1), labels=format(seq(1,6,1)))
for (i in 1:4) {
currentQu <- subset(DeptSales.ts/1000, cycle(DeptSales.ts/1000)==i)
lines(seq(1, 1 +length(currentQu)-1,1), currentQu, type="b", lwd=1,
lty=linetype[i], col=colors[i], pch=plotchar[i])
}
title("Sales Broken Out by Quarter")
legend(6, 90, 1:4, cex=0.8, col=colors, pch=plotchar, lty=linetype, title="Quarter", xpd=NA)
Seasonality
There is indisputably a seasonal component to the Dept. Store Sales series: each quarter behaves consistently from year to year. The dramatic quarterly fluctuations in sales within each year make it essential that the forecasting model accommodate seasonality.
library(ggplot2)
bold.text <- element_text(face = "bold", color = "black", size = 10)
bold.italic <- element_text(face = "bold.italic", color = "black", size = 11)
ggseasonplot(DeptSales.ts/1000, ylab = "Dept Store Sales (thousands)", xlab = "Quarter", main = "Seasonal Plot for Dept Store Sales") +
theme_bw() +
theme(panel.border = element_blank(),
panel.grid.major = element_blank(),
panel.grid.minor = element_blank(),
panel.background = element_blank(),
axis.line = element_line(colour = "black")) +
guides(fill = guide_legend(reverse = TRUE)) +
# axis.text.x would target the x axis labels only
theme(axis.text = bold.text) +
theme(title = bold.italic, axis.title = bold.italic) +
geom_line(size = 1) +
theme(plot.title = element_text(hjust = 0.5))
Which of the following methods would not be suitable for forecasting this series? Explain why or why not for each one.
**Moving average of raw series:**
This is not a suitable method for forecasting the raw Dept. Store Sales series because, without adjustments to the data, a moving average cannot accommodate trend or seasonality, two components present in this series.
trailingMA <- rollmean(salesTrain, k=4, align="right")
maForecastedValue <- tail(trailingMA, 1)
# Make this value a time series object for the validation period
maForecasts <- ts(rep(maForecastedValue, validLength), start=c(1,trainLength+1), freq=4)
#Generic Plot
yrange <- range(DeptSales.ts)
plot(c(1,7), yrange, type = "n", xlab = "Year", ylab = "Dept Store Sales (thousands)", bty = "l", xaxt = "n", yaxt = "n", lwd = 2, main = "Dept Store Sales and Moving Average Forecast")
axis(1, at = seq(1, 7, 1), labels = format(seq(1, 7, 1)))
axis(2, at = seq(40000, 105000, 10000), labels = format(seq(40, 105, 10)), las = 2)
lines(DeptSales.ts, lwd = 2)
lines(trailingMA, col="red", lty=2)
lines(maForecasts, lwd=2, col="red", lty=2)
legend(1, 105000, c("Dept Store Sales", "Trailing Moving Average", "Moving Average Forecast"), lty = c(1,2,2), col = c("black", "red", "red"), lwd = c(2,1,2), bty = "n")
**Moving average of deseasonalized series:**
This method is suitable for forecasting the Dept. Store Sales series. Moving average forecasting is meant for stationary data: the series should have neither trend nor seasonality. The six-year series of quarterly department store sales has a seasonal component; a trend component is less clear. Deseasonalizing the series is therefore essential for this method. To decide whether this particular series also needs detrending in addition to deseasonalizing with differencing, we can compare how well each version performs against the actual sales.
validLength <- 4
trainLength <- length(DeptSales.ts) - validLength
salesTrain <- window(DeptSales.ts, end = c(1, trainLength))
salesValid <- window(DeptSales.ts, start = c(1, trainLength + 1))
# Deseasonalize with lag-4 differencing, then take a trailing moving average
MAwithSeason <- rollmean(diff(salesTrain, lag = 4), k = 4, align = "right")
MASeasForecast <- forecast(MAwithSeason, h = validLength)
# Undo the lag-4 differencing: add each forecast back to the actual value
# from four quarters earlier
realForecasts <- vector()
for (i in 1:validLength){
realForecasts[i] <- MASeasForecast$mean[i] + salesTrain[(trainLength+i)-validLength]
}
The plot below shows the two forecasts against the actual series. The deseasonalized moving average forecast has a lower MAPE than the double-differenced moving average forecast.
validLength <- 4
trainLength <- length(DeptSales.ts) - validLength
salesTrain <- window(DeptSales.ts, end = c(1, trainLength))
salesValid <- window(DeptSales.ts, start = c(1, trainLength + 1))
# Deseasonalize (lag-4) and detrend (lag-1), then take a trailing moving average
MAwithSeasonTrend <- rollmean(diff(diff(salesTrain, lag = 4), lag = 1), k = 4, align = "right")
MASeasTrForecast <- forecast(MAwithSeasonTrend, h = validLength)
# Undo both differences: add back the value from four quarters earlier
# plus the most recent year-over-year change
realForecasts1 <- vector()
for (i in 1:validLength){
if(i == 1){
realForecasts1[i] <- MASeasTrForecast$mean[i] + salesTrain[(trainLength+i)-validLength] + (salesTrain[trainLength] - salesTrain[trainLength - validLength])
} else{
realForecasts1[i] <- MASeasTrForecast$mean[i] + salesTrain[(trainLength+i)-validLength] + (realForecasts1[i-1] - salesTrain[trainLength+i-1-validLength])
}
}
yrange <- range(DeptSales.ts)
plot(c(1,7), yrange, type = "n", xlab = "Year", ylab = "Dept Store Sales (thousands)", bty = "l", xaxt = "n", yaxt = "n", lwd = 2, main = "Actual and Moving Average Forecasts")
axis(1, at = seq(1, 7, 1), labels = format(seq(1, 7, 1)))
axis(2, at = seq(40000, 105000, 10000), labels = format(seq(40, 105, 10)), las = 2)
lines(DeptSales.ts, bty = "l", lwd = 2)
lines(seq(6, 7-1/validLength, 1/validLength), realForecasts, col="red", lwd=2, lty=2)
lines(seq(6, 7-1/validLength, 1/validLength), realForecasts1, col="blue", lwd=2, lty=3)
legend(1, 100000, c("Actual Sales", "De-Seasonalized, MAPE = 2.38916", "Double Differenced, MAPE = 3.908718"), lty = c(1,2,3), col = c("black", "red", "blue"), lwd = c(2,2,2), bty = "n")
**Simple exponential smoothing of the raw series:**
SES on the raw Dept Store Sales data is not an appropriate forecasting method. Simple exponential smoothing, like the moving average, is for time series with no trend or seasonality. However, if the series is first deseasonalized with lag-4 differencing, simple exponential smoothing may be a decent model.
To illustrate that the raw Dept Store Sales data is not adequate for simple exponential smoothing, we can run the ets() function specifying the “ANN” model (no trend, no seasonality) and let it determine the optimal alpha value. The resulting MAPE = 15.54219 is far worse than those of the other models we have tested.
validLength <- 4
trainLength <- length(DeptSales.ts) - validLength
salesTrain <- window(DeptSales.ts, end = c(1, trainLength))
salesValid <- window(DeptSales.ts, start = c(1, trainLength + 1))
yrange <- range(DeptSales.ts)
RawData <- ets(salesTrain, model = "ANN")
RawForecast <- forecast(RawData, h = validLength)
summary(RawData)
## ETS(A,N,N)
##
## Call:
## ets(y = salesTrain, model = "ANN")
##
## Smoothing parameters:
## alpha = 0.1206
##
## Initial states:
## l = 59167.5632
##
## sigma: 13229.28
##
## AIC AICc BIC
## 445.5222 447.0222 448.5094
##
## Training set error measures:
## ME RMSE MAE MPE MAPE MASE
## Training set 3046.15 13229.28 10409.41 1.213967 15.54219 3.340095
## ACF1
## Training set -0.05805453
If we let ets() decide which model and which \(\alpha\) value work best for the Dept Store Sales data, it chooses a forecasting model with multiplicative seasonality and no trend, and it also yields a much better MAPE. Since SES corresponds to model = “ANN”, we can conclude the raw series is not suitable for it.
SESRaw1 <- ets(salesTrain, model = "ZZZ")
summary(SESRaw1)
## ETS(M,N,M)
##
## Call:
## ets(y = salesTrain, model = "ZZZ")
##
## Smoothing parameters:
## alpha = 0.9907
## gamma = 1e-04
##
## Initial states:
## l = 59697.8836
## s=1.3045 0.9951 0.8621 0.8383
##
## sigma: 0.0255
##
## AIC AICc BIC
## 367.7322 377.0655 374.7023
##
## Training set error measures:
## ME RMSE MAE MPE MAPE MASE
## Training set 546.4286 1597.297 1338.208 0.819708 2.126239 0.4293945
## ACF1
## Training set -0.234507
To use SES for the Dept Store Sales data, the series would have to be deseasonalized. Even after deseasonalizing with lag-4 differencing and running SES (“ANN”), the MAPE is worse than many we have seen. The plot below shows the actual series with the SES forecasts using the raw and the deseasonalized data.
# Deseasonalize dept store sales with lag-4 differencing, then fit SES ("ANN")
SESDiffSales <- ets(diff(salesTrain, lag = 4), model = "ANN")
SESDiffSalesForecast <- forecast(SESDiffSales, h = validLength, level = 0)
# Undo the differencing: add each forecast back to the value four quarters earlier
realForecastsSES <- vector()
for (i in 1:validLength){
realForecastsSES[i] <- SESDiffSalesForecast$mean[i] + salesTrain[(trainLength+i)-validLength]
}
yrange <- range(DeptSales.ts)
plot(c(1,7), yrange, type = "n", xlab = "Year", ylab = "Dept Store Sales (thousands)", bty = "l", xaxt = "n", yaxt = "n", lwd = 2, main = "Actual Sales and SES Forecast")
axis(1, at = seq(1, 7, 1), labels = format(seq(1, 7, 1)))
axis(2, at = seq(40000, 105000, 10000), labels = format(seq(40, 105, 10)), las = 2)
lines(DeptSales.ts, lwd = 2)
lines(seq(6, 7-1/validLength, 1/validLength), realForecastsSES, col="red", lwd=2, lty=2)
lines(RawForecast$mean, col = "blue", lty = 2, lwd = 2)
legend(1, 100000, c("Actual Sales", "SES lag-4, MAPE = 8.02", "SES Raw Data, MAPE = 15.28"), lty = c(1,2, 2), col = c("black", "red", "blue"), lwd = c(2,2, 2), bty = "n")
**Double exponential smoothing of the raw series:**
Also known as Holt’s linear trend model, double exponential smoothing is not a suitable method to forecast the Dept Store Sales data. Strictly, double exponential smoothing is for data with a trend component and no seasonality, i.e., ets models such as “AAN” or “MMN”. As we learned from running the ets() function, the Dept Store Sales data has multiplicative seasonality and no significant trend; the function identifies “MNM” as the best model. Double exponential smoothing is therefore not an optimal choice for the raw series. If we force the series through the “MMN” (trend, no seasonality) model, we get a forecast with a much higher MAPE.
yrange <- range(DeptSales.ts)
hSales <- ets(salesTrain, model = "MMN")
hSales.pred <- forecast(hSales, h = validLength, level = 0)
plot(c(1,7), yrange, type = "n", xlab = "Year", ylab = "Dept Store Sales (thousands)", bty = "l", xaxt = "n", yaxt = "n", lwd = 2, main = "Dept Store Sales and Holt Linear Trend Model")
axis(1, at = seq(1, 7, 1), labels = format(seq(1, 7, 1)))
axis(2, at = seq(40000, 105000, 10000), labels = format(seq(40, 105, 10)), las = 2)
lines(DeptSales.ts)
lines(hSales.pred$mean, lwd = 2, lty = 2, col = "blue")
legend(1, 100000, c("Actual Sales", "Holt's Linear Model 'MMN' \nMAPE = 15.67145"), lty = c(1,2), col = c("black","blue"), lwd = c(2,2), bty = "n")
**Holt-Winters’ exponential smoothing of the raw series:**
The Holt-Winters method is suitable for the Dept Store Sales data. An extension of double exponential smoothing, it accommodates series that have trend and/or seasonality components.
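For reference, the standard multiplicative Holt-Winters updating equations (with season length \(M = 4\) for quarterly data) are: level \(\ell_t = \alpha \frac{y_t}{s_{t-M}} + (1-\alpha)(\ell_{t-1} + b_{t-1})\), trend \(b_t = \beta(\ell_t - \ell_{t-1}) + (1-\beta)b_{t-1}\), and seasonal index \(s_t = \gamma \frac{y_t}{\ell_t} + (1-\gamma)s_{t-M}\), with the \(k\)-step-ahead forecast \(\hat{y}_{t+k} = (\ell_t + k\,b_t)\,s_{t+k-M}\) (for \(k \le M\)). In the “MNM” model chosen below, the trend term drops out.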
hwinSales <- ets(salesTrain, model = "MNM")
summary(hwinSales)
## ETS(M,N,M)
##
## Call:
## ets(y = salesTrain, model = "MNM")
##
## Smoothing parameters:
## alpha = 0.9907
## gamma = 1e-04
##
## Initial states:
## l = 59697.8836
## s=1.3045 0.9951 0.8621 0.8383
##
## sigma: 0.0255
##
## AIC AICc BIC
## 367.7322 377.0655 374.7023
##
## Training set error measures:
## ME RMSE MAE MPE MAPE MASE
## Training set 546.4286 1597.297 1338.208 0.819708 2.126239 0.4293945
## ACF1
## Training set -0.234507
hwSalesforecast <- forecast(hwinSales, h = validLength)
accuracy(hwSalesforecast, salesValid)
## ME RMSE MAE MPE MAPE MASE
## Training set 546.4286 1597.297 1338.208 0.819708 2.126239 0.4293945
## Test set 5829.9855 6830.684 5829.985 7.023908 7.023908 1.8706836
## ACF1 Theil's U
## Training set -0.2345070 NA
## Test set 0.2139347 0.4780883
hwinSales$par
## alpha gamma l s0 s1
## 9.907060e-01 1.000180e-04 5.969788e+04 1.304452e+00 9.951355e-01
## s2
## 8.621125e-01
yrange <- range(DeptSales.ts +1)
plot(c(1,7), yrange, type = "n", xlab = "Year", ylab = "Dept Store Sales (thousands)", bty = "l", xaxt = "n", yaxt = "n", lwd = 2, main = "Dept Store Sales and Holt Winter's Forecast 'MNM'")
axis(1, at = seq(1, 7, 1), labels = format(seq(1, 7, 1)))
axis(2, at = seq(40000, 105000, 10000), labels = format(seq(40, 105, 10)), las = 2)
lines(DeptSales.ts)
lines(hwinSales$fitted, lwd = 2, col = "blue")
lines(hwSalesforecast$mean, lwd = 2, lty = 2, col = "blue")
legend(1, 100000, c("Actual Sales", "HW fit MAPE = 2.12", "HW Forecast MAPE = 7.02"), lty = c(1,1,2), col = c("black","blue", "blue"), lwd = c(2,2,2), bty = "n")
5b. A forecaster was tasked to generate forecasts for 4 quarters ahead. She therefore partitioned the data so that the last 4 quarters were designated as the validation period. The forecaster approached the forecasting task by using multiplicative Holt-Winters’ exponential smoothing. Specifically, you should call the ets function with the parameters restrict = FALSE, model = “ZZZ” and use the smoothing constants \(\alpha = 0.2\), \(\beta = 0.15\), and \(\gamma = 0.05\).
validLength <- 4
trainLength <- length(DeptSales.ts) - validLength
salesTrain <- window(DeptSales.ts, end = c(1, trainLength))
salesValid <- window(DeptSales.ts, start = c(1, trainLength + 1))
HWfn <- ets(salesTrain, model = "ZZZ", alpha = 0.2, beta = 0.15, gamma = 0.05, restrict=FALSE)
i. Run this method on the data. Request the forecasts on the validation period. (Note that the forecasted values for the validation set will be different than what the book shows.)
HWforecast <- forecast(HWfn, h = validLength)
HWforecast
## Point Forecast Lo 80 Hi 80 Lo 95 Hi 95
## 6 Q1 62115.16 60268.28 63962.04 59290.60 64939.72
## 6 Q2 65371.60 63317.68 67425.53 62230.40 68512.81
## 6 Q3 77076.87 74420.49 79733.24 73014.29 81139.45
## 6 Q4 102937.73 98935.69 106939.77 96817.13 109058.33
ii. Using the forecasts for the validation set that you came up with in i. above, **compute the MAPE values for the forecasts of quarters 21-22.**
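As a reminder, the accuracy() function reports \( \mathrm{MAPE} = \frac{100}{n} \sum_{t=1}^{n} \left| \frac{y_t - \hat{y}_t}{y_t} \right| \); here \(n = 2\) for quarters 21-22.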
library(forecast)
#windows for dept sales
timeS <- time(DeptSales.ts)
nvalidS <- 4
ntrainS <- length(DeptSales.ts) - nvalidS
trainWindowDS <- window(DeptSales.ts, start = timeS[1], end = timeS[ntrainS])
# Validation window restricted to quarters 21-22
validWindowDS <- window(DeptSales.ts, start = timeS[ntrainS +1], end = timeS[ntrainS + nvalidS - 2])
validWindowDS
## Qtr1 Qtr2
## 6 60800 64900
twowindowValid <- forecast(HWfn, h = 2)
twowindowValid
## Point Forecast Lo 80 Hi 80 Lo 95 Hi 95
## 6 Q1 62115.16 60268.28 63962.04 59290.6 64939.72
## 6 Q2 65371.60 63317.68 67425.53 62230.4 68512.81
accuracy(twowindowValid, validWindowDS)
## ME RMSE MAE MPE MAPE MASE
## Training set 609.5633 1447.180 1240.9947 0.9315256 1.985092 0.3982014
## Test set -893.3803 987.939 893.3803 -1.4448745 1.444875 0.2866614
## ACF1 Theil's U
## Training set 0.008503999 NA
## Test set -0.500000000 0.1150254
#This is the forecast for the 4 period validation
HWforecast
## Point Forecast Lo 80 Hi 80 Lo 95 Hi 95
## 6 Q1 62115.16 60268.28 63962.04 59290.60 64939.72
## 6 Q2 65371.60 63317.68 67425.53 62230.40 68512.81
## 6 Q3 77076.87 74420.49 79733.24 73014.29 81139.45
## 6 Q4 102937.73 98935.69 106939.77 96817.13 109058.33
accuracy(HWforecast, salesValid)
## ME RMSE MAE MPE MAPE MASE
## Training set 609.5633 1447.1798 1240.9947 0.9315256 1.9850921 0.3982014
## Test set -366.8394 727.6403 566.4741 -0.6517749 0.8449629 0.1817661
## ACF1 Theil's U
## Training set 0.008503999 NA
## Test set 0.183050115 0.02380343
5c. The fit and the residuals were displayed in the book.
**Please reproduce them with R code. Using all the information from (b) and your generated figures, is this model suitable for forecasting quarters 21 and 22?**
yrange <- range(DeptSales.ts)
plot(c(1,7), yrange, type = "n", xlab = "Year", ylab = "Dept Store Sales (thousands)", bty = "l", xaxt = "n", yaxt = "n", lwd = 2, main = "Actual Sales and Holt Winters Model")
axis(1, at = seq(1, 7, 1), labels = format(seq(1, 7, 1)))
axis(2, at = seq(40000, 105000, 10000), labels = format(seq(40, 105, 10)), las = 2)
lines(DeptSales.ts, bty = "l", lwd = 2)
lines(HWfn$fitted, col = "blue", lty = 3, lwd = 2)
lines(HWforecast$mean, col = "blue", lty = 2, lwd = 2)
legend(1, 100000, c("Actual Sales", "Holt Winters Fitted", "Holt Winters Forecast"), lty = c(1,3,2), col = c("black", "blue", "blue"), lwd = c(2,2,2), bty = "n")
plot(HWfn$residuals, xlab = "Year", ylab = "Dept Store Sales Residuals", bty = "l", xaxt = "n", yaxt = "n", lwd = 2, main = "Dept Store Sales Residuals")
axis(1, at = seq(1, 6, 1), labels = format(seq(1, 6, 1)), pos = 0)
axis(2, at = seq(-0.04, 0.06, .01), labels = format(seq(-4000, 6000, 1000)), las = 2)
5d. Another analyst decided to take a much simpler approach, and instead of using exponential smoothing he used differencing. Use differencing to remove the trend and seasonal pattern. Which order works better: first removing trend and then seasonality or the opposite order?
There is no difference: differencing is a linear operation, so the two orders commute and produce an identical double-differenced series.
par(mfrow = c(1,2))
#double differenced
doubleDiff <- diff(diff(DeptSales.ts, lag = 1), lag = 4)
#plot double differenced
plot(doubleDiff, ylab="Lag-1, then Lag-4", xlab="Year", bty="l", main="Dept Sales Twice-Differenced \nDe-Trended \nDe-Seasonalized")
#double differenced
doubleDiff1 <- diff(diff(DeptSales.ts, lag = 4), lag = 1)
#plot double differenced
plot(doubleDiff1, ylab="Lag-4, then Lag-1", xlab="Year", bty="l", main="Dept Sales Twice-Differenced \nDe-Seasonalized \nDe-Trended")
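As a quick sanity check (a minimal sketch reusing the two series computed above), the two orderings can be compared directly:
# TRUE: lag-1-then-lag-4 equals lag-4-then-lag-1
all.equal(doubleDiff, doubleDiff1)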
Show the progression of time plots as you difference the data and each final series to provide evidence in support of your answer.
#set up a plot with 2 rows and 2 columns
par(mfrow = c(2,2))
#plot the original
plot(DeptSales.ts, ylab = "Dept Store Sales", xlab = "Year", bty = "l", main = "A. Dept Store Sales")
#lag 4 differencing
lag4diff <- diff(DeptSales.ts, lag=4)
#plot the lag-4 difference
plot(lag4diff, ylab="Lag-4", xlab="Year", bty="l", main="B. Lag-4 Differencing \nDe-Seasonalized")
#lag-1 differencing for detrending
lag1diff <- diff(DeptSales.ts, lag=1)
#plot the lag-1 difference for detrending
plot(lag1diff, ylab="Lag-1", xlab="Year", bty="l", main="C. Lag-1 Differencing \nDe-trended")
doubleDiff <- diff(diff(DeptSales.ts, lag = 4), lag = 1)
#plot double differenced
plot(doubleDiff, ylab="Lag-4, then Lag-1", xlab="Year", bty="l", main="D. Dept Sales Twice-Differenced \nDe-Seasonalized \nDe-Trended")
5e. Forecast quarters 21-22 using the average of the double-differenced series from (d). Remember to use only the training period (until quarter 20), and to adjust back for the trend and seasonal pattern.
validLength <- 4
trainLength <- length(DeptSales.ts) - validLength
salesTrain <- window(DeptSales.ts, end = c(1, trainLength))
salesValid <- window(DeptSales.ts, start = c(1, trainLength + 1))
# Use the mean of the double-differenced training series as the forecast
MAwithSeasonTrend <- meanf(diff(diff(salesTrain, lag = 4), lag = 1))
MASeasTrForecast <- forecast(MAwithSeasonTrend, h = validLength)
# Adjust back for the seasonal pattern (add the value from four quarters
# earlier) and for the trend (add the most recent year-over-year change)
realForecasts1 <- vector()
for (i in 1:validLength){
if(i == 1){
realForecasts1[i] <- MASeasTrForecast$mean[i] + salesTrain[(trainLength+i)-validLength] + (salesTrain[trainLength] - salesTrain[trainLength - validLength])
} else{
realForecasts1[i] <- MASeasTrForecast$mean[i] + salesTrain[(trainLength+i)-validLength] + (realForecasts1[i-1] - salesTrain[trainLength+i-1-validLength])
}
}
realForecasts1
## [1] 63982.2 68177.4 80201.6 101467.8
yrange <- range(DeptSales.ts)
plot(c(1,7), yrange, type = "n", xlab = "Year", ylab = "Dept Store Sales (thousands)", bty = "l", xaxt = "n", yaxt = "n", lwd = 2, main = "Actual and Moving Average Forecasts")
axis(1, at = seq(1, 7, 1), labels = format(seq(1, 7, 1)))
axis(2, at = seq(40000, 105000, 10000), labels = format(seq(40, 105, 10)), las = 2)
lines(DeptSales.ts, bty = "l", lwd = 2)
lines(seq(6, 7-1/validLength, 1/validLength), realForecasts1, col="blue", lwd=2, lty=3)
lines(HWforecast$mean, col = "red", lty = 2, lwd = 2)
legend(1, 100000, c("Actual Sales", "Double Differenced, Mean Forecast MAPE = 4.07", "Holt Winter's Forecast, MAPE = 0.8449629"), lty = c(1,3,2), col = c("black", "blue", "red"), lwd = c(2,2,2), bty = "n")
realForecasts1
## [1] 63982.2 68177.4 80201.6 101467.8
HWforecast
## Point Forecast Lo 80 Hi 80 Lo 95 Hi 95
## 6 Q1 62115.16 60268.28 63962.04 59290.60 64939.72
## 6 Q2 65371.60 63317.68 67425.53 62230.40 68512.81
## 6 Q3 77076.87 74420.49 79733.24 73014.29 81139.45
## 6 Q4 102937.73 98935.69 106939.77 96817.13 109058.33
HWforecast$mean
## Qtr1 Qtr2 Qtr3 Qtr4
## 6 62115.16 65371.60 77076.87 102937.73
frange <- range(salesValid)
frange
## [1] 60800 103337
plot(c(1,4), frange, type = "n", xlab = "Quarter", ylab = "Dept Store Sales (thousands)", bty = "l", xaxt = "n", yaxt = "n", lwd = 2, main = "Validation Period: Actual Sales and Forecasts")
axis(1, at = seq(1, 4, 1), labels = format(seq(1, 4, 1)))
axis(2, at = seq(60000, 105000, 10000), labels = format(seq(60, 105, 10)), las = 2)
lines(as.numeric(salesValid), lwd = 2)
lines(realForecasts1, col = "blue", lty = 3, lwd = 2)
lines(as.numeric(HWforecast$mean), col = "red", lty = 2, lwd = 2)
legend(1, 100000, c("Actual Sales", "Double-Differenced Forecast", "Holt-Winters Forecast"), lty = c(1,3,2), col = c("black", "blue", "red"), lwd = c(2,2,2), bty = "n")
5f. Compare the forecasts from (e) to the exponential smoothing forecasts found in (b). Which of the two forecasting methods would you choose? Explain.
Accuracy of the mean forecast on the double-differenced series:
accuracy(realForecasts1, salesValid)
## ME RMSE MAE MPE MAPE ACF1 Theil's U
## Test set -1948.75 2942.41 2883.35 -3.159236 4.063656 -0.07650809 0.1948234
Accuracy of the Holt-Winters forecast:
accuracy(HWforecast, salesValid)
## ME RMSE MAE MPE MAPE MASE
## Training set 609.5633 1447.1798 1240.9947 0.9315256 1.9850921 0.3982014
## Test set -366.8394 727.6403 566.4741 -0.6517749 0.8449629 0.1817661
## ACF1 Theil's U
## Training set 0.008503999 NA
## Test set 0.183050115 0.02380343
Based on the accuracy in the validation period and the training-period fit, one might intuitively choose the smoothing method because its MAPE scores are so good. However, I would balance this against the danger of overfitting: the smoothing method tracks the actual sales very tightly in both the training and validation periods. If I chose the smoothing model, I would continue to evaluate it against future sales and forecasts.
5g. What is an even simpler approach that should be compared as a baseline? Complete that comparison.
A naive forecast should always be run as a baseline. It is very simple, and it may not give the best predictive accuracy, but it provides the benchmark any candidate model must beat. Because the Dept Store Sales series has a seasonal component, we use the snaive() function, which repeats the most recent observation from the same season.
validLength <- 4
trainLength <- length(DeptSales.ts) - validLength
salesTrain <- window(DeptSales.ts, end = c(1, trainLength))
salesValid <- window(DeptSales.ts, start = c(1, trainLength + 1))
NaiveSales <- snaive(salesTrain, h = validLength)
NaiveSales
## Point Forecast Lo 80 Hi 80 Lo 95 Hi 95
## 6 Q1 56405 51434.14 61375.86 48802.72 64007.28
## 6 Q2 60031 55060.14 65001.86 52428.72 67633.28
## 6 Q3 71486 66515.14 76456.86 63883.72 79088.28
## 6 Q4 92183 87212.14 97153.86 84580.72 99785.28
tail(DeptSales, 8)
## Yr_Qtr Sales
## 17 Year 5 Q-5 56405
## 18 Year 5 Q-2 60031
## 19 Year 5 Q-3 71486
## 20 Year 5 Q-4 92183
## 21 Year 6 Q-1 60800
## 22 Year 6 Q-2 64900
## 23 Year 6 Q-3 76997
## 24 Year 6 Q-4 103337
yrange <- range(DeptSales.ts +1)
plot(c(1,7), yrange, type = "n", xlab = "Year", ylab = "Dept Store Sales (thousands)", bty = "l", xaxt = "n", yaxt = "n", lwd = 2, main = "Dept Store Sales and Naive Modeling")
axis(1, at = seq(1, 7, 1), labels = format(seq(1, 7, 1)))
axis(2, at = seq(40000, 105000, 10000), labels = format(seq(40, 105, 10)), las = 2)
lines(DeptSales.ts)
lines(NaiveSales$fitted, lwd = 2, col = "blue")
lines(NaiveSales$mean, lwd = 2, lty = 2, col = "blue")
legend(1, 100000, c("Actual Sales", "Naive Fit MAPE = 4.74", "Naive Forecast MAPE = 8.17"), lty = c(1,1,2), col = c("black","blue", "blue"), lwd = c(2,2,2), bty = "n")
accuracy(NaiveSales, salesValid)
## ME RMSE MAE MPE MAPE MASE
## Training set 2925.25 3878.784 3116.50 4.344358 4.737739 1.000000
## Test set 6482.25 7032.176 6482.25 8.170540 8.170540 2.079978
## ACF1 Theil's U
## Training set 0.54855812 NA
## Test set 0.01334402 0.4705349
Forecasting Australian Wine Sales: Figure 5.14 shows time plots of monthly sales of six types of Australian wines (red, rose, sweet white, dry white, sparkling, and fortified) for 1980-1994. Data available in AustralianWines.xls. The units are thousands of liters. You are hired to obtain short-term forecasts (2-3 months ahead) for each of the six series, and this task will be repeated every month.
wine <- read.csv("AustralianWines.csv")
head(wine)
## Month Fortified Red Rose sparkling Sweet.white Dry.white
## 1 Jan-80 2585 464 112 1686 85 1954
## 2 Feb-80 3368 675 118 1591 89 2302
## 3 Mar-80 3210 703 129 2304 109 3054
## 4 Apr-80 3111 887 99 1712 95 2414
## 5 May-80 3756 1139 116 1471 91 2226
## 6 Jun-80 4216 1077 168 1377 95 2725
dim(wine)
## [1] 188 7
names(wine)
## [1] "Month" "Fortified" "Red" "Rose" "sparkling"
## [6] "Sweet.white" "Dry.white"
(a) Which smoothing method would you choose if you had to choose the same method for forecasting all series? Why?
If I had to choose a single smoothing method for short-term forecasts of all the wine styles, I would choose Holt-Winters. Of the smoothing methods we have covered, it is the only one suitable for series with trend and/or seasonality components.
I ran the ets() function with model = “ZZZ” and let it choose the \(\alpha\), \(\beta\), and \(\gamma\) values; the results are shown at the end of the report. Every wine style has a seasonal component, and all but Sparkling and Sweet white also have a trend component. According to our text, the Holt-Winters method is very versatile with these components: “Being an adaptive method, Holt-Winter’s exponential smoothing allows the level, trend, and seasonality patterns to change over time. The three components are estimated and updated as new information arrives.” This is important given that the wine styles all have varying seasonal patterns that might otherwise not be recognized using simple smoothing with differencing. A compact way to run this check over all six series is sketched below.
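The sketch below reconstructs what I ran (an assumption: it uses the wine data frame loaded earlier and holds out the last 12 months of each series, matching the individual fits at the end of the report):
# Fit ets(model = "ZZZ") to each wine series and report the chosen model
wineCols <- c("Fortified", "Red", "Rose", "sparkling", "Sweet.white", "Dry.white")
fits <- lapply(wineCols, function(col) {
  y <- ts(wine[[col]], start = c(1980, 1), frequency = 12)
  ets(window(y, end = c(1980, length(y) - 12)), model = "ZZZ")
})
sapply(fits, function(f) f$method)  # e.g. "ETS(M,A,M)" for Fortified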
(b) Fortified wine has the largest market share of the six types of wine. You are asked to focus on fortified wine sales alone and produce as accurate a forecast as possible for the next two months.
Start by partitioning the data using the period until Dec-1993 as the training period, then apply Holt-Winters’ exponential smoothing (with multiplicative seasonality) to sales.
Fortified.ts <- ts(wine$Fortified, start = c(1980, 1), frequency = 12)
FortifiedvalidLength <- 12
FortifiedtrainLength <- length(Fortified.ts) - FortifiedvalidLength
FortifiedsalesTrain <- window(Fortified.ts, end = c(1980, FortifiedtrainLength))
FortifiedsalesValid <- window(Fortified.ts, start = c(1980, FortifiedtrainLength + 1))
library(forecast)
FortifiedSmooth <- ets(FortifiedsalesTrain, model = "ZZM")
FortifiedSmooth
## ETS(M,A,M)
##
## Call:
## ets(y = FortifiedsalesTrain, model = "ZZM")
##
## Smoothing parameters:
## alpha = 0.0496
## beta = 9e-04
## gamma = 1e-04
##
## Initial states:
## l = 4041.6833
## b = -7.1612
## s=1.1334 1.0383 0.8836 0.9461 1.2608 1.3697
## 1.1428 1.1137 0.9614 0.8588 0.7053 0.5862
##
## sigma: 0.0884
##
## AIC AICc BIC
## 2897.314 2901.188 2951.212
FortifiedForecast <- forecast(FortifiedSmooth, h = FortifiedvalidLength)
accuracy(FortifiedForecast, FortifiedsalesValid)
## ME RMSE MAE MPE MAPE MASE
## Training set -19.28044 289.5123 225.3113 -1.195219 7.36695 0.8230731
## Test set 328.28969 389.1661 328.2897 13.003315 13.00332 NA
## ACF1 Theil's U
## Training set 0.0638394 NA
## Test set -0.7537966 0.730869
ForecastSmooth2Mos <- ets(Fortified.ts, model = "ZZM")
ForecastZZM2Mos <- forecast(ForecastSmooth2Mos, h = 2)
ForecastZZM2Mos
## Point Forecast Lo 80 Hi 80 Lo 95 Hi 95
## Jan 1995 1252.988 1106.713 1399.263 1029.280 1476.696
## Feb 1995 1499.512 1324.309 1674.715 1231.563 1767.462
(c) Create a plot for the residuals from the Holt-Winter’s exponential smoothing.
plot(FortifiedSmooth$residuals, xlab = "Year", ylab = "Fortified Wine Sales Residuals", bty = "l", xaxt = "n", yaxt = "n", lwd = 2, main = "Fortified Wine Sales Residuals")
axis(1, at = seq(1980, 1995, 1), labels = format(seq(1980, 1995, 1)))
axis(2, at = seq(-0.4, 0.4, .1), labels = format(seq(-4000, 4000, 1000)), las = 2)
i. Based on this plot, which of the following statements are reasonable? Refer to the residuals plot for this question.
Decembers (month 12) are not captured well by the model. This is difficult to determine from the full residual series alone; we would need to zoom in and look at each month separately (see the month-by-month sketch after this list).
There is a strong correlation between sales on the same calendar month. This is not a reasonable statement to make from this plot, because the residual plot does not show correlation; to look for such a relationship we would need something like an autocorrelation or regression analysis.
The model does not capture the seasonality well. The residual plot is not intended to reveal seasonality directly; the time series plot and plots that decompose the series into its components would be needed to evaluate this.
We should first deseasonalize the data and then apply Holt-Winters’ exponential smoothing. This is not a reasonable statement because the Holt-Winters model already accommodates seasonality, however complex the seasonal cycles. Deseasonalized data is only necessary for models that assume stationarity.
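To examine the first statement about the Decembers, a quick sketch (reusing the FortifiedSmooth fit from above) groups the residuals by calendar month:
# One box per calendar month; a box centered away from zero suggests
# that month is systematically over- or under-forecast
boxplot(FortifiedSmooth$residuals ~ cycle(FortifiedSmooth$residuals), xlab = "Month", ylab = "Residual")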
ii. How can you handle the above effect with exponential smoothing?
Large or frequent residuals can be minimized with a well-tuned forecasting model. Exponential smoothing models are versatile because they can fit a variety of components, namely trend and seasonality, and their smoothing constants control how quickly each component adapts to new data.
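One hedged illustration (a sketch, not a definitive fix; it reuses FortifiedsalesTrain from above): raising \(\alpha\) makes the level react faster to recent observations, which can reduce systematic month-specific errors if the pattern is drifting. The appendix compares several \(\alpha\) values this way.
# Same model family as before, but with a faster-adapting level
FortifiedFast <- ets(FortifiedsalesTrain, model = "ZZM", alpha = 0.2)
accuracy(FortifiedFast)  # training-set error measures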
After performing any forecast, it is important to explore the errors and residuals. For this Fortified Wine series, the residuals look approximately normally distributed; we can see this in the histogram and the normal Q-Q plot.
# Plot the histogram and store it to use later
myhist <- hist(FortifiedSmooth$residuals, ylab="Frequency", xlab="Forecast Error", main="", bty="l")
# Use stored hist object to help set up density curve
multiplier <- myhist$counts / myhist$density
# Need to ignore NA from 1985
mydensity <- density(FortifiedSmooth$residuals, na.rm=TRUE)
mydensity$y <- mydensity$y * multiplier[1]
# Add the density curve
lines(mydensity)
length(FortifiedSmooth$residuals)
## [1] 176
qqnorm(FortifiedSmooth$residuals[1:176])
qqline(FortifiedSmooth$residuals[1:176])
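A formal test can complement the visual check (a minimal sketch using the same residuals):
# Shapiro-Wilk normality test; a large p-value is consistent with
# approximately normal residuals
shapiro.test(as.numeric(FortifiedSmooth$residuals))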
plot(Fortified.ts, xlab = "Year", ylab = "Sales (thousands)", col = "black", main = "Fortified Wine Sales", lwd = 2)
lines(FortifiedSmooth$fitted, col = "blue", lty = 2, lwd = 2)
lines(FortifiedForecast$mean, lty = 3, lwd = 3, col = "blue")
legend(1980, 2400, c("Actual Sales", "HW Fitted MAPE = 7.37", "Forecast MAPE = 13"), lty = c(1,2,3), col = c("black", "blue", "blue"), lwd = c(2,2,3), bty = "n")
abline(v=1994)
arrows(1980, 5000, 1994, 5000, code=3, length=0.1)
text(1985, 5300, "Training")
abline(v=1995)
arrows(1994, 5000, 1995, 5000, code=3, length=0.1)
text(1994.5, 5300, "Val")
Information pertaining to Q.5
Linear Trend Summary
summary(DeptSales.Linear)
##
## Call:
## tslm(formula = DeptSales.ts ~ trend)
##
## Residuals:
## Min 1Q Median 3Q Max
## -13897 -10094 -2577 6421 25132
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 50141 5372 9.334 4.17e-09 ***
## trend 1169 376 3.110 0.0051 **
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 12750 on 22 degrees of freedom
## Multiple R-squared: 0.3054, Adjusted R-squared: 0.2738
## F-statistic: 9.674 on 1 and 22 DF, p-value: 0.005101
2nd Order Polynomial Trend Summary
summary(DeptSales.Poly)
##
## Call:
## tslm(formula = DeptSales.ts ~ poly(trend, 2))
##
## Residuals:
## Min 1Q Median 3Q Max
## -15444 -8297 -4950 5517 20416
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 64757 2597 24.937 < 2e-16 ***
## poly(trend, 2)1 39654 12722 3.117 0.00522 **
## poly(trend, 2)2 13313 12722 1.046 0.30724
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 12720 on 21 degrees of freedom
## Multiple R-squared: 0.3398, Adjusted R-squared: 0.277
## F-statistic: 5.405 on 2 and 21 DF, p-value: 0.01277
5a, bullet 2
Point Forecast Deseasonalized Moving Average
MASeasForecast
## Point Forecast Lo 80 Hi 80 Lo 95 Hi 95
## 6 Q1 5872.463 4909.405 6835.520 4399.593 7345.332
## 6 Q2 5872.463 4510.562 7234.364 3789.615 7955.311
## 6 Q3 5872.463 4204.509 7540.416 3321.548 8423.377
## 6 Q4 5872.463 3946.492 7798.433 2926.945 8817.981
Accuracy of Deseasonalized Moving Average Forecast for Dept. Store Sales
accuracy(realForecasts, salesValid)
## ME RMSE MAE MPE MAPE ACF1 Theil's U
## Test set 609.7874 2793.525 2030.981 0.1663322 2.38916 0.01334402 0.1790003
Point Forecast Deseasonalized and Detrended Moving Average
MASeasTrForecast
## Point Forecast Lo 80 Hi 80 Lo 95 Hi 95
## 6 Q1 569.2 -2116.935 3255.335 -3714.114 4852.514
## 6 Q2 569.2 -2116.935 3255.335 -3714.114 4852.514
## 6 Q3 569.2 -2116.935 3255.335 -3714.114 4852.514
## 6 Q4 569.2 -2116.935 3255.335 -3714.114 4852.514
Accuracy of Deseasonalized and Detrended Moving Average Forecast for Dept. Store Sales
accuracy(realForecasts1, salesValid)
## ME RMSE MAE MPE MAPE ACF1 Theil's U
## Test set -1948.75 2942.41 2883.35 -3.159236 4.063656 -0.07650809 0.1948234
5a, bullet 3
Accuracy of the deseasonalized (lag-4) SES “ANN” forecast for the Dept Store Sales data.
accuracy(realForecastsSES, salesValid)
## ME RMSE MAE MPE MAPE ACF1 Theil's U
## Test set 6374.794 6933.249 6374.794 8.024077 8.024077 0.01334402 0.4639576
Accuracy of the SES “ANN” forecast on the raw Dept Store Sales data.
accuracy(RawForecast, salesValid)
## ME RMSE MAE MPE MAPE MASE
## Training set 3046.150 13229.28 10409.41 1.213967 15.54219 3.340095
## Test set 9992.464 19370.85 13658.50 9.338171 15.28388 4.382641
## ACF1 Theil's U
## Training set -0.05805453 NA
## Test set 0.17229544 1.278413
5a, bullet 4
Holt’s linear trend model (double exponential smoothing) is not a suitable model for forecasting the Dept. Store Sales data because it is not built to recognize seasonality, and this particular data set has a multiplicative seasonal component.
accuracy(hSales.pred, salesValid)
## ME RMSE MAE MPE MAPE MASE
## Training set -628.9547 11502.29 9730.394 -4.2059461 15.31484 3.122218
## Test set 2386.1109 15706.02 12638.761 -0.7391866 15.67145 4.055434
## ACF1 Theil's U
## Training set -0.1328986 NA
## Test set 0.1630194 0.9830232
summary(hSales)
## ETS(M,M,N)
##
## Call:
## ets(y = salesTrain, model = "MMN")
##
## Smoothing parameters:
## alpha = 1e-04
## beta = 1e-04
##
## Initial states:
## l = 54381.329
## b = 1.0139
##
## sigma: 0.1805
##
## AIC AICc BIC
## 443.3564 447.6421 448.3350
##
## Training set error measures:
## ME RMSE MAE MPE MAPE MASE
## Training set -628.9547 11502.29 9730.394 -4.205946 15.31484 3.122218
## ACF1
## Training set -0.1328986
Fortified
Fortified.ts <- ts(wine$Fortified, start = c(1980, 1), frequency = 12)
FortifiedvalidLength <- 12
FortifiedtrainLength <- length(Fortified.ts) - FortifiedvalidLength
FortifiedsalesTrain <- window(Fortified.ts, end = c(1980, FortifiedtrainLength))
FortifiedsalesValid <- window(Fortified.ts, start = c(1980, FortifiedtrainLength + 1))
plot(Fortified.ts, xlab = "Year", ylab = "Sales (thousands)", col = "blue", main = "Fortified Wine Sales")
Fortified <- ets(FortifiedsalesTrain, model = "ZZZ")
Fortified
## ETS(M,A,M)
##
## Call:
## ets(y = FortifiedsalesTrain, model = "ZZZ")
##
## Smoothing parameters:
## alpha = 0.0496
## beta = 9e-04
## gamma = 1e-04
##
## Initial states:
## l = 4041.6833
## b = -7.1612
## s=1.1334 1.0383 0.8836 0.9461 1.2608 1.3697
## 1.1428 1.1137 0.9614 0.8588 0.7053 0.5862
##
## sigma: 0.0884
##
## AIC AICc BIC
## 2897.314 2901.188 2951.212
Red
Red.ts <- ts(wine$Red, start = c(1980, 1), frequency = 12)
RedvalidLength <- 12
RedtrainLength <- length(Red.ts) - RedvalidLength
RedsalesTrain <- window(Red.ts, end = c(1980, RedtrainLength))
RedsalesValid <- window(Red.ts, start = c(1980, RedtrainLength + 1))
plot(Red.ts, xlab = "Year", ylab = "Sales (thousands)", col = "blue", main = "Red Wine Sales")
Red <- ets(RedsalesTrain, model = "ZZZ")
Red
## ETS(M,A,M)
##
## Call:
## ets(y = RedsalesTrain, model = "ZZZ")
##
## Smoothing parameters:
## alpha = 0.1449
## beta = 1e-04
## gamma = 1e-04
##
## Initial states:
## l = 932.1379
## b = 6.938
## s=1.1144 1.0194 0.9387 1.0434 1.361 1.343
## 1.0962 1.0763 0.9302 0.8465 0.7041 0.5267
##
## sigma: 0.1064
##
## AIC AICc BIC
## 2728.253 2732.127 2782.152
Rose
Rose.ts <- ts(wine$Rose, start = c(1980, 1), frequency = 12)
RosevalidLength <- 12
RosetrainLength <- length(Rose.ts) - RosevalidLength
RosesalesTrain <- window(Rose.ts, end = c(1980, RosetrainLength))
RosesalesValid <- window(Rose.ts, start = c(1980, RosetrainLength + 1))
plot(Rose.ts, xlab = "Year", ylab = "Sales (thousands)", col = "blue", main = "Rose Sales")
Rose <- ets(RosesalesTrain, model = "ZZZ")
Rose
## ETS(A,A,A)
##
## Call:
## ets(y = RosesalesTrain, model = "ZZZ")
##
## Smoothing parameters:
## alpha = 0.0019
## beta = 0.0019
## gamma = 1e-04
##
## Initial states:
## l = 24.9058
## b = 0.3992
## s=-14.6118 -17.8191 -7.9477 -2.2117 -7.6001 7.2329
## 7.541 -0.5594 24.8803 1.4364 2.4008 7.2584
##
## sigma: 25.0584
##
## AIC AICc BIC
## 2077.871 2081.745 2131.770
Sparkling
Sparkling.ts <- ts(wine$sparkling, start = c(1980, 1), frequency = 12)
SparklingvalidLength <- 12
SparklingtrainLength <- length(Sparkling.ts) - SparklingvalidLength
SparklingsalesTrain <- window(Sparkling.ts, end = c(1980, SparklingtrainLength))
SparklingsalesValid <- window(Sparkling.ts, start = c(1980, SparklingtrainLength + 1))
plot(Sparkling.ts, xlab = "Year", ylab = "Sales (thousands)", col = "blue", main = "Sparkling Sales")
Sparkling <- ets(SparklingsalesTrain, model = "ZZZ")
Sparkling
## ETS(M,N,M)
##
## Call:
## ets(y = SparklingsalesTrain, model = "ZZZ")
##
## Smoothing parameters:
## alpha = 0.0636
## gamma = 1e-04
##
## Initial states:
## l = 2321.7212
## s=2.3704 1.6783 1.2386 0.8789 0.9682 0.7987
## 0.5945 0.6527 0.7302 0.7709 0.6554 0.6632
##
## sigma: 0.1413
##
## AIC AICc BIC
## 2954.800 2957.800 3002.357
Sweet
Sweet.white.ts <- ts(wine$Sweet.white, start = c(1980, 1), frequency = 12)
SweetvalidLength <- 12
SweettrainLength <- length(Sweet.white.ts) - SweetvalidLength
SweetsalesTrain <- window(Sweet.white.ts, end = c(1980, SweettrainLength))
SweetsalesValid <- window(Sweet.white.ts, start = c(1980, SweettrainLength + 1))
plot(Sweet.white.ts, xlab = "Year", ylab = "Sales (thousands)", col = "blue", main = "Sweet Wine Sales")
Sweet <- ets(SweetsalesTrain, model = "ZZZ")
Sweet
## ETS(M,N,M)
##
## Call:
## ets(y = SweetsalesTrain, model = "ZZZ")
##
## Smoothing parameters:
## alpha = 0.4288
## gamma = 1e-04
##
## Initial states:
## l = 123.0858
## s=1.4785 1.2792 1.0335 1.012 1.0886 1.0413
## 0.7662 0.7838 0.9097 0.8779 0.8651 0.8642
##
## sigma: 0.1826
##
## AIC AICc BIC
## 2236.288 2239.288 2283.845
Dry
Dry.white.ts <- ts(wine$Dry.white, start = c(1980, 1), frequency = 12)
DryvalidLength <- 12
DrytrainLength <- length(Dry.white.ts) - DryvalidLength
DrysalesTrain <- window(Dry.white.ts, end = c(1980, DrytrainLength))
DrysalesValid <- window(Dry.white.ts, start = c(1980, DrytrainLength + 1))
plot(Dry.white.ts, xlab = "Year", ylab = "Sales (thousands)", col = "blue", main = "Dry Wine Sales")
Dry <- ets(DrysalesTrain, model = "ZZZ")
Dry
## ETS(M,A,A)
##
## Call:
## ets(y = DrysalesTrain, model = "ZZZ")
##
## Smoothing parameters:
## alpha = 0.0599
## beta = 1e-04
## gamma = 1e-04
##
## Initial states:
## l = 2936.1051
## b = 4.5617
## s=1242.073 829.3723 109.0329 -94.204 391.6654 88.7661
## -383.1171 -262.2344 -385.946 -159.6419 -449.8025 -925.9641
##
## sigma: 0.0998
##
## AIC AICc BIC
## 2968.520 2972.394 3022.418
Some things I looked at to answer question i above, out of curiosity.
Comparison of Fortified Wine Residuals from models with different alpha values
## ETS(M,A,M)
##
## Call:
## ets(y = FortifiedsalesTrain, model = "ZZM", alpha = 0.01)
##
## Smoothing parameters:
## alpha = 0.01
## beta = 0.001
## gamma = 1e-04
##
## Initial states:
## l = 3999.9022
## b = -12.6402
## s=1.1299 1.0399 0.8838 0.9487 1.2602 1.3725
## 1.1439 1.1118 0.9603 0.858 0.7048 0.5863
##
## sigma: 0.0887
##
## AIC AICc BIC
## 2891.799 2895.221 2942.527
## ME RMSE MAE MPE MAPE MASE
## Training set 22.27415 283.9810 219.2211 0.09590192 7.100508 0.8008257
## Test set 300.90845 364.6416 300.9084 11.87303212 11.873032 NA
## ACF1 Theil's U
## Training set 0.06948922 NA
## Test set -0.75241127 0.6889046
## Point Forecast Lo 80 Hi 80 Lo 95 Hi 95
## Jan 1995 1252.988 1106.713 1399.263 1029.280 1476.696
## Feb 1995 1499.512 1324.309 1674.715 1231.563 1767.462
## ETS(M,A,M)
##
## Call:
## ets(y = FortifiedsalesTrain, model = "ZZM", alpha = 0.2)
##
## Smoothing parameters:
## alpha = 0.2
## beta = 0.0011
## gamma = 1e-04
##
## Initial states:
## l = 4039.8137
## b = -5.3673
## s=1.1192 1.0468 0.8893 0.96 1.2638 1.3653
## 1.1326 1.1144 0.9631 0.8574 0.7012 0.5869
##
## sigma: 0.092
##
## AIC AICc BIC
## 2909.647 2913.069 2960.375
## ME RMSE MAE MPE MAPE MASE
## Training set -17.57194 296.5439 230.0785 -1.303048 7.588995 0.8404881
## Test set 327.08136 380.8848 327.0814 12.951081 12.951081 NA
## ACF1 Theil's U
## Training set -0.006807277 NA
## Test set -0.758065433 0.7205477
## Point Forecast Lo 80 Hi 80 Lo 95 Hi 95
## Jan 1995 1252.988 1106.713 1399.263 1029.280 1476.696
## Feb 1995 1499.512 1324.309 1674.715 1231.563 1767.462
## ETS(M,N,M)
##
## Call:
## ets(y = FortifiedsalesTrain, model = "ZZM", alpha = 0.6)
##
## Smoothing parameters:
## alpha = 0.6
## gamma = 1e-04
##
## Initial states:
## l = 4209.5721
## s=1.1196 1.0385 0.8869 0.9488 1.2687 1.3684
## 1.1495 1.1255 0.9632 0.8586 0.6923 0.5799
##
## sigma: 0.1023
##
## AIC AICc BIC
## 2943.293 2945.902 2987.680
## ME RMSE MAE MPE MAPE MASE
## Training set -22.83443 325.7266 255.8295 -1.505787 8.554785 0.9345579
## Test set 504.23155 544.7288 504.2315 20.305433 20.305433 NA
## ACF1 Theil's U
## Training set -0.200693 NA
## Test set -0.772165 0.9865081
## Point Forecast Lo 80 Hi 80 Lo 95 Hi 95
## Jan 1995 1252.988 1106.713 1399.263 1029.280 1476.696
## Feb 1995 1499.512 1324.309 1674.715 1231.563 1767.462
## ETS(M,N,M)
##
## Call:
## ets(y = FortifiedsalesTrain, model = "ZZM", alpha = 0.98)
##
## Smoothing parameters:
## alpha = 0.98
## gamma = 1e-04
##
## Initial states:
## l = 4209.2442
## s=1.1154 1.0358 0.8828 0.9507 1.2616 1.3678
## 1.156 1.132 0.9624 0.8503 0.6962 0.589
##
## sigma: 0.1173
##
## AIC AICc BIC
## 2989.718 2992.327 3034.105
## ME RMSE MAE MPE MAPE MASE
## Training set -13.17981 366.9950 288.7133 -1.217969 9.807884 1.054684
## Test set 609.76039 643.3313 609.7604 24.691127 24.691127 NA
## ACF1 Theil's U
## Training set -0.4168382 NA
## Test set -0.7661006 1.156751
## Point Forecast Lo 80 Hi 80 Lo 95 Hi 95
## Jan 1995 1252.988 1106.713 1399.263 1029.280 1476.696
## Feb 1995 1499.512 1324.309 1674.715 1231.563 1767.462
Do residuals behave the same for each month of the year?
par(oma = c(0, 0, 0, 2))
xrange <- c(1,18)
yrange <- range(FortifiedSmooth$residuals)
plot(xrange, yrange, type="n", xlab="Year", ylab="Residuals", bty="n", las=1, xaxt = "n", yaxt = "n")
colors <- rainbow(15)
axis(1, at=seq(1,15,1), labels=format(seq(1,15,1)))
axis(2, at = seq(-0.2, 0.4, 0.05), labels = format(seq(-2000, 4000, 500)))
for (i in 1:12) {
currentQu <- subset(FortifiedSmooth$residuals, cycle(FortifiedSmooth$residuals)==i)
lines(seq(1, 1 +length(currentQu)-1,1), currentQu, type="l", lwd=1, col=colors[i])
}
title("Fortified Wine Sales Broken out by Month")
legend(16, .3, legend = 1:12, cex = 0.8, col = colors, lty = 1, title = "Month", xpd = NA)
Looking at Fortified Wine Residuals one year at a time.
# One barplot of residuals per year, alternating blue and red
for (start in seq(1, 169, by = 12)) {
end <- min(start + 11, length(FortifiedSmooth$residuals))
barplot(FortifiedSmooth$residuals[start:end], col = ifelse(((start - 1) / 12) %% 2 == 0, "blue", "red"))
}
Tracking Fortified Wine Residuals by month
ggseasonplot(FortifiedSmooth$residuals)