knitr::opts_chunk$set(echo = TRUE)

Assignment 4, Problem 2

Relationship between Moving Average and Exponential Smoothing:

Assume that we apply a moving average to a series, using a very short window span. If we wanted to achieve an equivalent result using simple exponential smoothing, what value would the smoothing constant take?

In section 5.4, the formula for determining the moving average window (w) from a simple exponential smoothing alpha is:

   w =  2/alpha - 1

Solving for alpha gives the formula:

  alpha = 2/(w+1)

It should also be noted that as the moving average window, w, gets smaller, the exponential smoothing value alpha gets closer to 1. Conversely, as the exponential smoothing value of alpha gets closer to 0, the moving average window, w, gets larger.

Assignment 4, Problem 5

Forecasting Department Store Sales

The file DepartmentStoreSales.xls contains data on the quarterly sales for a department store over a 6-year period.

#  Load libraries and set environment options
library(dplyr)
library(tidyr)
library(knitr)
library(readxl)
library(forecast)
library(ggplot2)
library(zoo)

#  Use this option to supress scientific notation in printing values
options(scipen = 10, digits = 2)

a. Which of the following methods would not be suitable for forecasting this series. Explain why or why not for each one.

We start by decomposing the time series to get a better understand the data.
First, plot the time series, then run decompose.

#  Decompose and plot the time series for initial analysis
dec_DSSales <- decompose(DSSales.ts)
yrange=range(DSSales.ts)
plot(c(1,6), yrange, type="n", xlab="YEAR", ylab="Sales in $10K", bty="l", yaxt="n", main = "Department Store Sales")
lines(DSSales.ts, bty="l")
axis(1, at=seq(1,6,1), labels=format(seq(1,6,1)))
axis(2, at=seq(50000, 110000, 20000), labels=format(seq(50, 110,20)), las=2)

#autoplot(DSSales.ts, bty="l")
autoplot(dec_DSSales)

Decomposition shows that the time series is additive, seasonal, and has an upward trend.

Now, consider each of the suggested forecasting methods:

Moving average of raw series
Moving average would not be a good choice for forecasting since it does not forecast seasonality. It would show the trend, but not seasonality.

Moving average of de-seasonalized series
As stated for the moving average of the raw series, de-seasonalizing the series would be better for showing the trend, but seasonality would be lost.

Simple exponential smoothing of the raw series
Simple exponential smoothing does not work with trends or seasonality, so it would not be a good choice.

Double exponential smoothing of the raw series
Double exponential smoothing, also known as Holt’s linear trend model, would better handle the upward trend, but not the seasonality.

Holt-Winter’s exponential smoothing of the raw series
Holt-Winter’s exponential smoothing would be the best choice of all those listed here. It can handle both trend and seasonality.

b. A forecaster was tasked to generate forecasts for 4 quarters ahead. She therefore partitioned the data so that the last 4 quarters were designated as the validation period. The forecaster approached the forecasting task by using multiplicative Holt-Winter’s exponential smoothing. Specifically, you should call the hw function with the parameter seasonal=“multiplicative”. Let the method pick the appropriate parameters for alpha, beta, and gamma.

i. Run this method on the data. Request the forecasts on the validation period. (Note that the forecasted values for the validation set will be different than what the book shows.)

ii. Using the forecasts for the validation set that you came up with in i. above, compute the MAPE values for the forecasts of quarters 21-22.

   Quarter  Actual    Forecast    Error  MAPE  
   21        60800      61335     -535    .88  
   22        64900      64971     -71     .11

#  Create Training files, then run naive, seasonal naive, and mean forecasts
valid_length<-4
train_length<-length(DSSales.ts)-valid_length
DSSales_train<-window(DSSales.ts, start=c(1,1), end=c(5,4))
DSSales_valid <- window(DSSales.ts, start=c(6,1), end=c(6,4))
DSSales_HWfc <- hw(DSSales_train, seasonal = "multiplicative", h=4)
summary(DSSales_HWfc)

## 
## Forecast method: Holt-Winters' multiplicative method
## 
## Model Information:
## Holt-Winters' multiplicative method 
## 
## Call:
##  hw(y = DSSales_train, h = 4, seasonal = "multiplicative") 
## 
##   Smoothing parameters:
##     alpha = 0.4032 
##     beta  = 0.1429 
##     gamma = 0.4549 
## 
##   Initial states:
##     l = 57401.8119 
##     b = 605.4045 
##     s=1.3 0.98 0.86 0.86
## 
##   sigma:  0.026
## 
##  AIC AICc  BIC 
##  372  390  381 
## 
## Error measures:
##               ME RMSE MAE  MPE MAPE MASE   ACF1
## Training set 247 1500 978 0.35  1.7 0.31 -0.079
## 
## Forecasts:
##      Point Forecast Lo 80  Hi 80 Lo 95  Hi 95
## 6 Q1          61335 59303  63367 58228  64442
## 6 Q2          64971 62529  67413 61237  68706
## 6 Q3          76718 73377  80059 71608  81828
## 6 Q4          99421 94372 104469 91700 107141

#head(DSSales_valid)
print("Accuracy from Holt-Winter's Forecast")

## [1] "Accuracy from Holt-Winter's Forecast"

accuracy(DSSales_HWfc, DSSales_valid)

##               ME RMSE  MAE  MPE MAPE MASE    ACF1 Theil's U
## Training set 247 1500  978 0.35  1.7 0.31 -0.0788        NA
## Test set     897 1982 1200 0.79  1.3 0.39  0.0095      0.13

c. The fit and the residuals were displayed in the book. Please reproduce them with R code. Using all the information from (b) and your generated figures, is this model suitable for forecasting quarters 21 and 22?

Holt-Winter’s forecast with fit

#  Plot the forecasts 
autoplot(DSSales_train,  xlab="Year - Quarter", ylab="Sales in $10K") + autolayer(DSSales_HWfc$fitted, series = "Fitted") + autolayer(DSSales_valid, series = "Validation") + autolayer(DSSales_HWfc, series = "Forecast")

### Holt-Winter’s forecast residuals

#  Plot the residuals
plot(DSSales_HWfc$residuals, col=2, lwd=2, bty="l",xant="n", xlab="Year - Quarter", ylab="Forecast Error", yaxt="n")
axis(2, las=2)

d. Another analyst decided to take a much simpler approach, and instead of using exponential smoothing he used differencing. Use differencing to remove the trend and seasonal pattern. Which order works better: first removing trend and then seasonality or the opposite order? Show the progression of time plots as you difference the data and each final series to provide evidence in support of your answer.

#  Use Differencing method
# Set up the plot window (2 x 2)
par(mfrow = c(2,3))
# Plot the original time series
plot(DSSales.ts, xlab="Year - Quarter", xant="n", ylab="Sales in $10K", yaxt="n", bty="l", main="Department Store Sales")
axis(1, at=seq(1,6,1), labels=format(seq(1,6,1)))
axis(2, at=seq(50000, 110000, 20000), labels=format(seq(50, 110,20)), las=2)

# Plot the lag-4 time series
plot(diff(DSSales.ts, lag=4), xlab="Year - Quarter", xant="n", ylab="LAG - 4", bty="l", main="Lag - 4 Difference")
axis(1, at=seq(1,6,1), labels=format(seq(1,6,1)))

# Plot the lag-1 difference
plot(diff(DSSales.ts, lag=1), xlab="Year - Quarter", xant="n", ylab="LAG - 4", bty="l", main="Lag - 1 Difference")
axis(1, at=seq(1,6,1), labels=format(seq(1,6,1)))

# Plot the lag-4, then lag-1 difference
DSSales_lag41 <- diff(diff(DSSales.ts, lag=4), lag=1)
plot(DSSales_lag41, xlab="Year - Quarter", xant="n", ylab="lag 4, then 1", bty="l", main="Lag 4, then Lag 1")
axis(1, at=seq(1,6,1), labels=format(seq(1,6,1)))

# Plot the lag 1, then lag 4 difference
DSSales_lag14 <- diff(diff(DSSales.ts, lag=1), lag=4)
plot(DSSales_lag14, xlab="Year - Quarter", xant="n", ylab="lag 1, then 4", bty="l", main="Lag 1, then Lag 4")
axis(1, at=seq(1,6,1), labels=format(seq(1,6,1)))

Examining the charts for lag 4, then lag 1 and lag 1, then lag 4, they are identical. It apears there is no difference in sequence.

e. Forecast quarters 21-22 using the average of the double-differenced series from (d). Remember to use only the training period (until quarter 20), and to adjust back for the trend and seasonal pattern.

Differenced forecast for quartes 21 & 22

# Create a forecast using the double differenced data
valid_length<-2
old_valid_length<-4
train_length<-length(DSSales.ts)-old_valid_length
DSSales_train<-window(DSSales.ts, start=c(1,1), end=c(5,4))
DSSales_valid2 <- window(DSSales.ts, start=c(6,1), end=c(6,2))

DSSales_dfc <- meanf(diff(diff(DSSales_train, lag=4), lag=1), h=2)

print("Differenced forecast for quartes 21 & 22")

## [1] "Differenced forecast for quartes 21 & 22"

DSSales_dfc

##      Point Forecast Lo 80 Hi 80 Lo 95 Hi 95
## 6 Q1            569 -2117  3255 -3714  4853
## 6 Q2            569 -2117  3255 -3714  4853

f. Compare the forecasts from (e) to the exponential smoothing forecasts found in (b). Which of the two forecasting methods would you choose? Explain.

De-differenced forecast for quartes 21 & 22

# De-difference the forecast

ddif_DSSales_dfc <-vector()
for (i in 1:valid_length) {
    if(i == 1) {
      ddif_DSSales_dfc[i] <- DSSales_dfc$mean[i] + DSSales_train[(train_length+i)-old_valid_length] + (DSSales_train[train_length] - DSSales_train[train_length - old_valid_length])
    }
  else {
     ddif_DSSales_dfc[i] <- DSSales_dfc$mean[i] + DSSales_train[(train_length+i)-old_valid_length] + (ddif_DSSales_dfc[i-1] - DSSales_train[train_length+i-1-old_valid_length])
  }
}

#print("De-differenced forecast for quartes 21 & 22")
ddif_DSSales_dfc

## [1] 63982 68177

print("Validation set for quartes 21 & 22")

## [1] "Validation set for quartes 21 & 22"

DSSales_valid2

##    Qtr1  Qtr2
## 6 60800 64900

print("Check accuracy of De-differenced forecast")

## [1] "Check accuracy of De-differenced forecast"

accuracy(ddif_DSSales_dfc, DSSales_valid2)

##             ME RMSE  MAE  MPE MAPE ACF1 Theil's U
## Test set -3230 3230 3230 -5.1  5.1 -0.5       0.8

The De-differenced forecast is NOT a good model for this time series.

g. What is an even simpler approach that should be compared as a baseline? Complete that comparison.

An even simler approach would be to use the ets function Ets will compute the correct model based on its determination of trend and seasonality.

# Create a histogram of the forecast errors
DSSales_ETS <- ets(DSSales_train)
summary(DSSales_ETS)

## ETS(M,N,M) 
## 
## Call:
##  ets(y = DSSales_train) 
## 
##   Smoothing parameters:
##     alpha = 0.9907 
##     gamma = 0.0001 
## 
##   Initial states:
##     l = 59697.8836 
##     s=1.3 1 0.86 0.84
## 
##   sigma:  0.025
## 
##  AIC AICc  BIC 
##  368  377  375 
## 
## Training set error measures:
##               ME RMSE  MAE  MPE MAPE MASE  ACF1
## Training set 546 1597 1338 0.82  2.1 0.43 -0.23

DSSales_ETSFC <- forecast(DSSales_ETS, h=4)
DSSales_ETSFC

##      Point Forecast Lo 80 Hi 80 Lo 95  Hi 95
## 6 Q1          59250 57310 61189 56284  62216
## 6 Q2          60933 58125 63741 56638  65227
## 6 Q3          70335 66370 74299 64272  76398
## 6 Q4          92197 86200 98194 83026 101368

accuracy(DSSales_ETSFC, DSSales_valid)

##                ME RMSE  MAE  MPE MAPE MASE  ACF1 Theil's U
## Training set  546 1597 1338 0.82  2.1 0.43 -0.23        NA
## Test set     5830 6831 5830 7.02  7.0 1.87  0.21      0.48

Not as good a forecast as the double exponential smoothing.

Assignment 4, Problem 8

Forecasting Australian Wine Sales

Figure 5.14 shows time plots of monthly sales of six types of Australian wines (red, rose, sweet white, dry white, sparkling, and fortified) for 1980-1994. Data available in AustralianWines.xls. The units are thousands of liters. You are hired to obtain short-term forecasts (2-3 months ahead) for each of the six series, and this task will be repeated every month.

a) Which smoothing method would you choose if you had to choose the same method for forecasting all series? Why?

Given that forecasting needs to be done on products that although similar (all wines), each could have very different trends and seasonality. If the forecasting has to be repeated every month, and a single method must be used for all, I would use the ETS method. The ETS function selects the “best fit” forecast model for the given data. Then the forecast can be run using that model.

b) Fortified wine has the largest market share of the six types of wine. You are asked to focus on fortified wine sales alone and produce as accurate a forecast as possible for the next two months.

i. Start by partitioning the data using the period until Dec - 1993 as the training period.

ii. Apply Hold-Winter’s exponential smoothing (with multiplicative seasonality) to sales.

First, create the fortified wine sales time series, and decompose the time series to get an initial look at the data.

Fortified wine Sales

#  Create a Time Series and Line Model of the data
WineSales.ts<-ts(WineSales$Fortified, start=c(1980,1),end=c(1984,4), frequency = 12)
#WineSales_lm<-tslm(WineSales.ts ~ trend+I(trend^2))

#  Decompose and plot the time series for initial analysis
dec_WineSales <- decompose(WineSales.ts)
yrange=range(WineSales.ts)
plot(c(1980, 1984), yrange, type="n", xlab="YEAR", xaxt="n", ylab="Sales in Thousans of Liters", bty="l", yaxt="n")
lines(WineSales.ts, bty="l", main="Fortified Wine Sales")

axis(1, at=seq(1980,1984, 1), labels=format(seq(1980,1984, 1)))
axis(2, at=seq(1500, 6000, 500), las=2)

autoplot(dec_WineSales)

Holt-Winter’s exponential smoothing forecast

#  Create Training files and validation files
valid_length<-4
train_length<-length(WineSales.ts)-valid_length
WineSales_train<-window(WineSales.ts, start=c(1080,1), end=c(1983,12))
WineSales_valid <- window(WineSales.ts, start=c(1984,1), end=c(1984,4))

#Apply Hold-Winter's exponential smoothing (with multiplicative seasonality) to sales
WineSales_HWfc <- hw(WineSales_train, seasonal = "multiplicative", h=4)
summary(WineSales_HWfc)

## 
## Forecast method: Holt-Winters' multiplicative method
## 
## Model Information:
## Holt-Winters' multiplicative method 
## 
## Call:
##  hw(y = WineSales_train, h = 4, seasonal = "multiplicative") 
## 
##   Smoothing parameters:
##     alpha = 0.0374 
##     beta  = 0.0001 
##     gamma = 0.0001 
## 
##   Initial states:
##     l = 3905.7063 
##     b = -7.0187 
##     s=0.99 1 0.9 1 1.4 1.4
##            1.2 1.1 0.93 0.82 0.75 0.62
## 
##   sigma:  0.072
## 
##  AIC AICc  BIC 
##  754  774  786 
## 
## Error measures:
##                ME RMSE MAE   MPE MAPE MASE ACF1
## Training set -8.4  279 218 -0.74  5.9 0.61  0.2
## 
## Forecasts:
##          Point Forecast Lo 80 Hi 80 Lo 95 Hi 95
## Jan 1984           2210  2007  2413  1899  2520
## Feb 1984           2668  2422  2913  2292  3043
## Mar 1984           2896  2629  3162  2488  3303
## Apr 1984           3261  2960  3561  2801  3720

head(WineSales_valid)

##       Jan  Feb  Mar  Apr
## 1984 1954 2604 3626 2836

accuracy(WineSales_HWfc, WineSales_valid)

##                ME RMSE MAE   MPE MAPE MASE  ACF1 Theil's U
## Training set -8.4  279 218 -0.74  5.9 0.61  0.20        NA
## Test set     -3.4  443 369 -2.59 12.7 1.03 -0.43      0.55

c) Create a plot for the residuals from the Hold-Winter’s exponential smoothing.

#  Plot the forecasts 
autoplot(WineSales_train,  xlab="Year", ylab="Sales in Thousans of Liters") + autolayer(WineSales_HWfc$fitted, series = "Fitted") + autolayer(WineSales_valid, series = "Validation") + autolayer(WineSales_HWfc, series = "Forecast")

#  Plot the residuals
plot(WineSales_HWfc$residuals, col=2, lwd=2, bty="l",xant="n", xlab="Year", ylab="Forecast Error", yaxt="n")
axis(2, las=2)

i. Based on this plot, which of the following statements are reasonable? . Decembers (month 12) are not captured well by the model. . There is a strong correlation between sales on the same calendar month. . The model does not capture the seasonality well. . We should first deseasonalize the data and then apply Hold-Winter’s exponential smoothing.

I would agree with the statements: - There is a strong correlation between sales in the same calendar month
- The model does not handle seasonality well

ii. How can you handle the above effect with exponential smoothing?

A couple of options, use diff and de-diff to create a forecast without seasonality and then add seasonality back in. Or, try using ETS.

MBA678 - Predictive Analytics

Helen Smith

February 26, 2018