Chapter 9

Question 1

I would consider usual neural networks for this task. Neural networks are helpful when you are hoping to just generate accurate forecasts and are not concerned with descriptive or explanatory methods; we are really just looking for forecasted values in this example. It also has monthly data for many years, which is relatively short-term - this is where neural networks seem to function best.

Question 2

To run a neural network on the data, I divided the data (as requested) into training and validation periods and then utilized nnetar to generate the neural network.

wine <- read.csv("~/MBA678/AustralianWines.csv", stringsAsFactors = FALSE)
winesalesTS <- ts(wine$Fortified,start=c(1980,1),frequency=12)
plot(winesalesTS)

validLength <- 12
trainLength <- length(winesalesTS) - validLength
wineTrain <- window(winesalesTS, end=c(1980,trainLength))
wineValid <- window(winesalesTS, start=c(1980,trainLength+1))

set.seed(8373493)
wineNN <- nnetar(wineTrain, p=11)

I then used this neural network to create a plot comparing the actual values to forecasted values over the training period. I also plotted the errors from the training period.

wineforecast <- forecast(wineNN,h=validLength)
plot(c(1980,1994),c(1000,6000), type="n",xlab="Year",ylab="Wine Sales (Thousands of Liters)",bty="l",xaxt="n",yaxt="n")

lines(wineTrain,bty="l")
lines(wineforecast$fitted,col="red")

axis(1,at=seq(1980,1994,1),labels=format(seq(1980,1994,1)))
axis(2,at=seq(1000,6000,500),labels=format(seq(1000,6000,500)),las=2)
legend(1989,6000,c("Actuals","Neural Network"),lty=c(1,2),col=c("black","red"),lwd=c(1,2),bty="n")

plot(wineforecast$residuals,bty="l",ylab="Residuals")

Based on the plots, it is easy to see the neural network was very similar to the actual training data.

I then used the neural network to predict sales in the validation period.

wineforecast$mean
##           Jan      Feb      Mar      Apr      May      Jun      Jul
## 1994 1319.548 1407.169 1855.154 2049.701 2132.175 2376.346 2758.660
##           Aug      Sep      Oct      Nov      Dec
## 1994 2581.042 2139.324 1798.984 2306.188 2608.601

Question 3

I ran the ets function to see which model it would fit to the training period of the fortified wine sales, by using model=“ZZZ”.

wineETS <- ets(wineTrain,model="ZZZ",restrict=FALSE)

wineETS
## ETS(M,A,M) 
## 
## Call:
##  ets(y = wineTrain, model = "ZZZ", restrict = FALSE) 
## 
##   Smoothing parameters:
##     alpha = 0.0555 
##     beta  = 9e-04 
##     gamma = 1e-04 
## 
##   Initial states:
##     l = 4040.0811 
##     b = -6.7983 
##     s=1.1316 1.0399 0.8877 0.9505 1.2722 1.3862
##            1.1463 1.1097 0.9345 0.8513 0.6996 0.5903
## 
##   sigma:  0.0859
## 
##      AIC     AICc      BIC 
## 2755.038 2759.118 2808.145

The model fit is shown above as ets(M,A,M), meaning multiplicative error and seasonality with additive trend. I used this to then forecast the validation period sales.

wineETSfore <- forecast(wineETS, h=validLength)

wineETSfore$mean
##           Jan      Feb      Mar      Apr      May      Jun      Jul
## 1994 1289.829 1521.475 1842.645 2013.011 2379.117 2445.906 2943.532
##           Aug      Sep      Oct      Nov      Dec
## 1994 2688.471 1998.782 1857.773 2165.635 2344.995

To compare the exponential smoothing model and the neural network, I wanted to compare the accuracy for both models, as well as plot them to visualize the differences. I did this for both the training and validation periods.

#Accuracy calculations for exponential smoothing model - Valid
accuracy(wineETSfore, wineValid)
##                     ME     RMSE      MAE       MPE      MAPE      MASE
## Training set -25.32466 287.8687 224.6507 -1.317643  7.229271 0.8073515
## Test set     125.56906 328.9246 256.3940  4.443793 10.858860 0.9214307
##                     ACF1 Theil's U
## Training set  0.05168201        NA
## Test set     -0.01105575 0.7140459
#Accuracy calculations for neural network - Valid
accuracy(wineforecast, wineValid)
##                        ME      RMSE       MAE        MPE      MAPE
## Training set  -0.09094802  75.71998  57.53445 -0.2129987  2.097739
## Test set     138.75901696 289.14325 245.23423  5.1737224 10.881000
##                   MASE         ACF1 Theil's U
## Training set 0.2067678 -0.009408954        NA
## Test set     0.8813246 -0.040116768 0.6427012

Using the accuracy output for both, you can see the neural network was a much better fit to the training data than the ets method, with lower error scores across the board. However on the test set (validation period), the two methods are more similar, with a MAPE of 10.8 for the ets and 11.3 for the neural network. In fact, the ME was slightly higher for the test set with the neural network.

I then plotted the training period data and in a seperate plot, the validation data in comparison to the two methods.

plot(c(1980,1994),c(1000,6000), type="n",xlab="Year",ylab="Wine Sales (Thousands of Liters)",bty="l",xaxt="n",yaxt="n")

lines(wineTrain,bty="l")
lines(wineforecast$fitted,col="red")
lines(wineETSfore$fitted, col="blue")

axis(1,at=seq(1980,1994,1),labels=format(seq(1980,1994,1)))
axis(2,at=seq(1000,6000,500),labels=format(seq(1000,6000,500)),las=2)
legend(1989,6000,c("Actuals","Neural Network", "ets Method"),lty=c(1,2,3),col=c("black","red","blue"),lwd=c(1,2,3),bty="n")

plot(c(1994,1995),c(1000,3500), type="n",xlab="Month",ylab="Wine Sales (Thousands of Liters)",bty="l",xaxt="n",yaxt="n")

lines(wineValid,bty="l")
lines(wineforecast$mean,col="red")
lines(wineETSfore$mean, col="blue")

axis(1, at=seq(1994,1995,1/11), labels=c("Jan","Feb","Mar","Apr","May","Jun","Jul", "Aug","Sep","Oct","Nov","Dec"))
axis(2,at=seq(1000,3500,500),labels=format(seq(1000,3500,500)),las=2)
legend(1994.6,1800,c("Actuals","Neural Network", "ets Method"),lty=c(1,2,3),col=c("black","red","blue"),lwd=c(1,2,3),bty="n")

Based on the plots, you can see they are both reasonably accurate, similar to the results we saw in the accuracy measures.