1. Would you consider neural networks for this task? Explain why.
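Yes. Even though neural network outputs are difficult to explain, I would consider using one because the goal here is forecasting rather than explanation. A neural network autoregression takes lagged sales as its inputs, so it can pick up nonlinear patterns in the series, and it can produce forecasts at different time horizons, which is what is being requested for the wine sales. Short-term forecasts in particular are straightforward to generate with a neural network.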
2. Use neural networks to forecast fortified wine sales, as follows:
- Partition the data using the period until December 1993 as the training period.
- Run a neural network using R’s nnetar with 11 nonseasonal lags (i.e., p = 11). Leave all other arguments at their default.
setwd("C:/Users/larms.LA-INSP5559/Documents/R/win-library/3.3/17_0429assignment8")
WineSales<- read.csv("awines2.csv", stringsAsFactors = FALSE)
head(WineSales)
## Month Fortified
## 1 1/1/1980 2585
## 2 2/1/1980 3368
## 3 3/1/1980 3210
## 4 4/1/1980 3111
## 5 5/1/1980 3756
## 6 6/1/1980 4216
str(WineSales)
## 'data.frame': 180 obs. of 2 variables:
## $ Month : chr "1/1/1980" "2/1/1980" "3/1/1980" "4/1/1980" ...
## $ Fortified: int 2585 3368 3210 3111 3756 4216 5225 4426 3932 3816 ...
tail(WineSales, 20)
## Month Fortified
## 161 5/1/1993 2329
## 162 6/1/1993 2660
## 163 7/1/1993 2923
## 164 8/1/1993 2626
## 165 9/1/1993 2132
## 166 10/1/1993 1772
## 167 11/1/1993 2526
## 168 12/1/1993 2755
## 169 1/1/1994 1154
## 170 2/1/1994 1568
## 171 3/1/1994 1965
## 172 4/1/1994 2659
## 173 5/1/1994 2354
## 174 6/1/1994 2592
## 175 7/1/1994 2714
## 176 8/1/1994 2294
## 177 9/1/1994 2416
## 178 10/1/1994 2016
## 179 11/1/1994 2799
## 180 12/1/1994 2467
library(forecast)
winesalests<- ts(WineSales$Fortified, start = c(1980, 1), frequency = 12)
#Partition the data
winevalidationlen<- 12
winetraininglen<- length(winesalests)-winevalidationlen
winesalestrain<- window(winesalests, end = c(1980, winetraininglen))
winesalesvalid<- window(winesalests, start = c(1980, winetraininglen + 1))
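The `end = c(1980, winetraininglen)` idiom works because `window()` converts a (year, period) pair into a time value, so period 168 counted from January 1980 lands on December 1993. A minimal sketch that checks this against explicit calendar dates, using a synthetic stand-in series (an assumption; these are placeholder values, not the wine data):

```r
# Synthetic monthly series with the same shape as the wine data:
# 180 observations starting in January 1980 (placeholder values).
y <- ts(seq_len(180), start = c(1980, 1), frequency = 12)

n_valid <- 12
n_train <- length(y) - n_valid

# Offset-based partition, as used in this report
train_a <- window(y, end = c(1980, n_train))
valid_a <- window(y, start = c(1980, n_train + 1))

# The same partition written with explicit calendar dates
train_b <- window(y, end = c(1993, 12))
valid_b <- window(y, start = c(1994, 1))

end(train_a)     # last period in the training set
start(valid_a)   # first period in the validation set
identical(as.numeric(train_a), as.numeric(train_b))
```

Either form gives the same split; the explicit dates are easier to read, while the offset form adapts automatically if the validation length changes.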
#plot the series
plot(winesalests, xlab = "Year", ylab = "Wine Sales (Thousands)", main = "Fortified Wine Sales")
#run a neural network
set.seed(8373493)
wineNN<- nnetar(winesalestrain, p=11)
wineNN
## Series: winesalestrain
## Model: NNAR(11,1,6)[12]
## Call: nnetar(y = winesalestrain, p = 11)
##
## Average of 20 networks, each of which is
## a 12-6-1 network with 85 weights
## options were - linear output units
##
## sigma^2 estimated as 5734
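NNAR(11,1,6)[12] means each fitted value is computed from 12 inputs: the 11 most recent observations plus the observation from 12 months earlier. A rough sketch of how such an input matrix can be laid out with base R's `embed()`, on a synthetic series (an assumption; `nnetar` builds and scales its inputs internally, and this is not its actual code):

```r
# Placeholder training series: 168 monthly values starting January 1980
y <- ts(seq_len(168), start = c(1980, 1), frequency = 12)

p <- 11                    # nonseasonal lags
P <- 1                     # seasonal lags
m <- frequency(y)          # 12 for monthly data
max_lag <- max(p, P * m)

# embed() puts y_t in column 1 and y_{t-1}, ..., y_{t-max_lag} in the rest
lagged <- embed(as.numeric(y), max_lag + 1)
target <- lagged[, 1]

lags_used <- c(seq_len(p), m * seq_len(P))        # lags 1..11 and lag 12
inputs <- lagged[, lags_used + 1, drop = FALSE]
colnames(inputs) <- paste0("lag", lags_used)

dim(inputs)   # rows = usable training cases, columns = network inputs
```

The first 12 months have no complete set of lags, so only 156 of the 168 training observations can be used as training cases.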
summary(wineNN$model[[1]])
## a 12-6-1 network with 85 weights
## options were - linear output units
## b->h1 i1->h1 i2->h1 i3->h1 i4->h1 i5->h1 i6->h1 i7->h1 i8->h1
## -0.14 1.60 -1.88 0.65 4.43 -2.46 -0.67 0.05 -3.61
## i9->h1 i10->h1 i11->h1 i12->h1
## -2.41 1.52 2.98 1.65
## b->h2 i1->h2 i2->h2 i3->h2 i4->h2 i5->h2 i6->h2 i7->h2 i8->h2
## 0.44 -0.83 1.07 0.71 0.73 0.00 0.66 0.38 -1.07
## i9->h2 i10->h2 i11->h2 i12->h2
## 0.09 0.00 0.09 -1.09
## b->h3 i1->h3 i2->h3 i3->h3 i4->h3 i5->h3 i6->h3 i7->h3 i8->h3
## 1.21 3.92 -0.71 1.73 -2.67 1.49 0.36 -0.03 3.92
## i9->h3 i10->h3 i11->h3 i12->h3
## 0.30 -3.39 1.96 -5.85
## b->h4 i1->h4 i2->h4 i3->h4 i4->h4 i5->h4 i6->h4 i7->h4 i8->h4
## -2.17 3.83 1.14 0.28 -1.49 -0.99 3.75 2.98 -4.88
## i9->h4 i10->h4 i11->h4 i12->h4
## -0.52 -1.34 0.72 -1.59
## b->h5 i1->h5 i2->h5 i3->h5 i4->h5 i5->h5 i6->h5 i7->h5 i8->h5
## -0.89 1.48 -1.48 -0.52 -1.33 0.44 -0.22 -0.45 0.73
## i9->h5 i10->h5 i11->h5 i12->h5
## -0.91 0.29 0.24 -0.83
## b->h6 i1->h6 i2->h6 i3->h6 i4->h6 i5->h6 i6->h6 i7->h6 i8->h6
## -0.65 -1.81 -0.12 0.84 0.43 -0.82 -2.19 -1.92 3.76
## i9->h6 i10->h6 i11->h6 i12->h6
## -0.03 -0.85 0.71 -0.97
## b->o h1->o h2->o h3->o h4->o h5->o h6->o
## 1.98 1.59 -2.75 1.39 -1.27 -1.82 -2.04
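The "85 weights" in the printout follows from the layer sizes: each hidden node has one weight per input plus a bias (the `b->h` and `i->h` entries), and the output node has one weight per hidden node plus a bias (the `b->o` and `h->o` entries). A short check of that arithmetic (the helper name `count_weights` is hypothetical):

```r
# Number of weights in a single-hidden-layer feed-forward network,
# counting one bias per hidden node and one per output node.
count_weights <- function(n_input, n_hidden, n_output = 1) {
  (n_input + 1) * n_hidden + (n_hidden + 1) * n_output
}

count_weights(n_input = 12, n_hidden = 6)   # the 12-6-1 network above
```

For the 12-6-1 network this is 13 x 6 = 78 weights into the hidden layer plus 7 into the output node.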
(a) Create a time plot for the actual and forecasted series over the training period. Create also a time plot of the forecast errors for the training period. Interpret what you see in the plots.
set.seed(8373493)
winepred<- forecast(wineNN, h = winevalidationlen)
winepred
## Jan Feb Mar Apr May Jun Jul
## 1994 1319.548 1407.169 1855.154 2049.701 2132.175 2376.346 2758.660
## Aug Sep Oct Nov Dec
## 1994 2581.042 2139.324 1798.984 2306.188 2608.601
yrange = range(winesalests)
plot(c(1980, 1994), yrange, type = "n", xlab = "Year", ylab = "Wine Sales", main = "Actual & Forecasted Wine Sales", bty = "l", xaxt = "n", yaxt = "n")
lines(winesalestrain, col = "blue", lwd = 2)
lines(wineNN$fitted, col ="red", lwd = 1)
axis(1, at = seq(1980, 1994, 1))
axis(2, at = seq(1000, 6000, 500), labels = format(seq(1000,6000, 500)))
plot(wineNN$residuals, main = "Residuals of NN Model", col = "purple")
abline(h = 0)
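The fitted values track the actual series very closely, and the residuals are small relative to the scale of the series (the training RMSE reported in part (b) is about 76, on monthly sales in the low thousands). A fit this tight suggests the network has overfit the training data.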
(b) Use the neural network to forecast sales for each month in the validation period (January 1994 to December 1994).
winepred
## Jan Feb Mar Apr May Jun Jul
## 1994 1319.548 1407.169 1855.154 2049.701 2132.175 2376.346 2758.660
## Aug Sep Oct Nov Dec
## 1994 2581.042 2139.324 1798.984 2306.188 2608.601
accuracy(winepred)
## ME RMSE MAE MPE MAPE MASE
## Training set -0.09094804 75.71998 57.53445 -0.2129987 2.097739 0.2067678
## ACF1
## Training set -0.009408954
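yrange = range(winesalests)
#extend the x-axis to 1995 so the validation period is visible
plot(c(1980, 1995), yrange, type = "n", xlab = "Year", ylab = "Wine Sales", main = "Actual & Forecasted Wine Sales", bty = "l", xaxt = "n", yaxt = "n")
lines(winesalestrain, col = "blue", lwd = 2)
lines(winesalesvalid, col = "blue", lwd = 2, lty = 2)
lines(winepred$fitted, col ="red", lwd = 1)
#forecasts for January 1994 to December 1994
lines(winepred$mean, col = "red", lwd = 2, lty = 2)
axis(1, at = seq(1980, 1995, 1))
axis(2, at = seq(1000, 6000, 500), labels = format(seq(1000,6000, 500)))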
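The validation-period error measures can also be worked out by hand from the 1994 actuals (taken from the `tail()` output) and the neural network forecasts printed above. A minimal sketch, assuming the usual convention that error = actual minus forecast:

```r
# 1994 actual sales and NN point forecasts, copied from the output above
actual_1994 <- c(1154, 1568, 1965, 2659, 2354, 2592,
                 2714, 2294, 2416, 2016, 2799, 2467)
nn_forecast <- c(1319.548, 1407.169, 1855.154, 2049.701, 2132.175, 2376.346,
                 2758.660, 2581.042, 2139.324, 1798.984, 2306.188, 2608.601)

err <- actual_1994 - nn_forecast      # forecast errors
pct_err <- 100 * err / actual_1994    # percentage errors

nn_measures <- c(
  ME   = mean(err),
  RMSE = sqrt(mean(err^2)),
  MAE  = mean(abs(err)),
  MPE  = mean(pct_err),
  MAPE = mean(abs(pct_err))
)
round(nn_measures, 3)
```

Up to rounding of the printed forecasts, these agree with the Test set row that `accuracy(winepred, winesalesvalid)` reports in part 3(c).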
3. Compare your neural network to an exponential smoothing model used to forecast fortified wine sales.
(a) Use R’s ets function to automatically select and fit an exponential smoothing model to the training period until December 1993. Which model did ets fit?
winesalesETS<- ets(winesalestrain, model = "ZZZ", restrict = FALSE)
winesalesETS
## ETS(M,A,M)
##
## Call:
## ets(y = winesalestrain, model = "ZZZ", restrict = FALSE)
##
## Smoothing parameters:
## alpha = 0.0555
## beta = 9e-04
## gamma = 1e-04
##
## Initial states:
## l = 4040.0811
## b = -6.7983
## s=1.1316 1.0399 0.8877 0.9505 1.2722 1.3862
## 1.1463 1.1097 0.9345 0.8513 0.6996 0.5903
##
## sigma: 0.0859
##
## AIC AICc BIC
## 2755.038 2759.118 2808.145
accuracy(winesalesETS)
## ME RMSE MAE MPE MAPE MASE
## Training set -25.32466 287.8687 224.6507 -1.317643 7.229271 0.8073515
## ACF1
## Training set 0.05168201
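ets selected ETS(M,A,M): multiplicative error, additive trend, and multiplicative seasonality. This is the multiplicative Holt-Winters method with multiplicative errors. The beta and gamma estimates are close to zero, so the trend and seasonal components change very little from their initial states.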
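Multiplicative seasonality means each month's value is the trend level scaled by a seasonal index. The sketch below takes the 12 initial seasonal states from the printout and maps them to calendar months. It assumes that `ets` lists these states in reverse time order (so the last value listed belongs to January, the first month of the series); that ordering is an assumption here, although it is consistent with January having the lowest 1994 forecast and July the highest.

```r
# Initial seasonal states, in the order printed by ets
s_printed <- c(1.1316, 1.0399, 0.8877, 0.9505, 1.2722, 1.3862,
               1.1463, 1.1097, 0.9345, 0.8513, 0.6996, 0.5903)

# Assumed reverse time order: reverse to get Jan, Feb, ..., Dec
seasonal_index <- setNames(rev(s_printed), month.abb)
seasonal_index

sum(seasonal_index)                # indices for 12 months sum to about 12
names(which.max(seasonal_index))   # peak month
names(which.min(seasonal_index))   # trough month

# Effect of the indices on a hypothetical trend level of 2000
round(2000 * seasonal_index)
```

Under that assumption, sales peak in July and bottom out in January.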
(b) Use this exponential smoothing model to forecast sales for each month in 1994.
winesalesETSpred<- forecast(winesalesETS, winevalidationlen)
winesalesETSpred
## Point Forecast Lo 80 Hi 80 Lo 95 Hi 95
## Jan 1994 1289.829 1147.913 1431.745 1072.788 1506.871
## Feb 1994 1521.475 1353.802 1689.148 1265.041 1777.909
## Mar 1994 1842.645 1639.237 2046.054 1531.559 2153.732
## Apr 1994 2013.011 1790.409 2235.614 1672.571 2353.452
## May 1994 2379.117 2115.554 2642.679 1976.033 2782.201
## Jun 1994 2445.906 2174.435 2717.376 2030.728 2861.083
## Jul 1994 2943.532 2616.195 3270.870 2442.913 3444.151
## Aug 1994 2688.471 2388.895 2988.047 2230.309 3146.633
## Sep 1994 1998.782 1775.592 2221.971 1657.443 2340.120
## Oct 1994 1857.773 1649.880 2065.666 1539.829 2175.717
## Nov 1994 2165.635 1922.749 2408.521 1794.173 2537.097
## Dec 1994 2344.995 2081.384 2608.606 1941.836 2748.153
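Because `forecast()` also returns prediction intervals for the ETS model, a quick check is to count how many of the 1994 actuals (from the `tail()` output) fall inside them. A rough sketch using the values printed above:

```r
# 1994 actual sales and ETS interval bounds, copied from the output above
actual_1994 <- c(1154, 1568, 1965, 2659, 2354, 2592,
                 2714, 2294, 2416, 2016, 2799, 2467)
lo80 <- c(1147.913, 1353.802, 1639.237, 1790.409, 2115.554, 2174.435,
          2616.195, 2388.895, 1775.592, 1649.880, 1922.749, 2081.384)
hi80 <- c(1431.745, 1689.148, 2046.054, 2235.614, 2642.679, 2717.376,
          3270.870, 2988.047, 2221.971, 2065.666, 2408.521, 2608.606)
lo95 <- c(1072.788, 1265.041, 1531.559, 1672.571, 1976.033, 2030.728,
          2442.913, 2230.309, 1657.443, 1539.829, 1794.173, 1941.836)
hi95 <- c(1506.871, 1777.909, 2153.732, 2353.452, 2782.201, 2861.083,
          3444.151, 3146.633, 2340.120, 2175.717, 2537.097, 2748.153)

# TRUE where the actual value lies inside the interval
in80 <- actual_1994 >= lo80 & actual_1994 <= hi80
in95 <- actual_1994 >= lo95 & actual_1994 <= hi95

data.frame(month = month.abb, actual = actual_1994, in80, in95)
c(covered80 = sum(in80), covered95 = sum(in95))
```

With only 12 months this is a rough sanity check rather than a formal calibration test.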
(c) How does the neural network compare to the exponential smoothing model in terms of predictive performance in the training period? In the validation period?
accuracy(winepred, winesalesvalid)
## ME RMSE MAE MPE MAPE
## Training set -0.09094804 75.71998 57.53445 -0.2129987 2.097739
## Test set 138.75901697 289.14325 245.23423 5.1737224 10.881000
## MASE ACF1 Theil's U
## Training set 0.2067678 -0.009408954 NA
## Test set 0.8813246 -0.040116767 0.6427012
accuracy(winesalesETSpred, winesalesvalid)
## ME RMSE MAE MPE MAPE MASE
## Training set -25.32466 287.8687 224.6507 -1.317643 7.229271 0.8073515
## Test set 125.56906 328.9246 256.3940 4.443793 10.858860 0.9214307
## ACF1 Theil's U
## Training set 0.05168201 NA
## Test set -0.01105575 0.7140459
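In the training period the neural network fits much better than the ETS model: MAPE is 2.10% versus 7.23%, and RMSE is 75.7 versus 287.9. In the validation period the two models perform about the same: MAPE is 10.88% for the neural network and 10.86% for ETS, while the neural network has a somewhat lower RMSE (289.1 versus 328.9) and MAE (245.2 versus 256.4). The neural network's MAPE grows roughly fivefold from training to validation, which is consistent with overfitting, whereas the ETS errors are more stable across the two periods.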
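One way to put the training and validation measures side by side is the ratio of validation error to training error for each model; a ratio far above 1 is a common sign of overfitting. A small sketch using the values reported by `accuracy()` above:

```r
# RMSE and MAPE copied from the accuracy() output for each model
acc <- data.frame(
  model      = c("NNAR", "ETS"),
  train_rmse = c(75.71998, 287.8687),
  test_rmse  = c(289.14325, 328.9246),
  train_mape = c(2.097739, 7.229271),
  test_mape  = c(10.881000, 10.858860)
)

# Validation error relative to training error
acc$rmse_ratio <- acc$test_rmse / acc$train_rmse
acc$mape_ratio <- acc$test_mape / acc$train_mape

acc[, c("model", "rmse_ratio", "mape_ratio")]
```

The two models end up at similar validation errors, but the neural network gets there from a much lower training error.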