\(\textit{P.1 Would you consider neural networks for this task?}\)

Yes, I will consider neural networking for this task.because the performance of neural networking is more beneficial with high-frequency series such as hourly, daily or weekly data, compared to low-frequency series and the task of forecasting Australian Wine Sales in figure 6.15 shows high-frequency performances because time plot is monthly sales of six type of wine.and another reason for consideration using this method is we asked to obtain short-term forecast (2-3) months the will give the best forecast.

=============================================================================================================== \(\textit{P.2 Use neural networks to forecast fortified wine sales, as follows:}\)

\(\textit{∙ Partition the data using the period until December 1993 as the training period.}\)

\(\textit{∙ Run a neural network using R’s 𝚗𝚗𝚎𝚝𝚊𝚛nnetar with 11 non-seasonal lags (i.e., p=11p=11). Leave all other arguments at their default.}\)

library(forecast)
## Warning: package 'forecast' was built under R version 3.2.5
## Loading required package: zoo
## Warning: package 'zoo' was built under R version 3.2.5
## 
## Attaching package: 'zoo'
## The following objects are masked from 'package:base':
## 
##     as.Date, as.Date.numeric
## Loading required package: timeDate
## This is forecast 7.3
setwd("/Users/yusufsultan/rWork")
AusWinSales <- read.csv("AustralianWines.csv")
str(AusWinSales)
## 'data.frame':    180 obs. of  7 variables:
##  $ Month      : Factor w/ 180 levels "Apr-80","Apr-81",..: 61 46 106 1 121 91 76 16 166 151 ...
##  $ Fortified  : int  2585 3368 3210 3111 3756 4216 5225 4426 3932 3816 ...
##  $ Red        : int  464 675 703 887 1139 1077 1318 1260 1120 963 ...
##  $ Rose       : Factor w/ 94 levels "*","100","101",..: 12 16 22 94 14 36 16 22 38 30 ...
##  $ sparkling  : int  1686 1591 2304 1712 1471 1377 1966 2453 1984 2596 ...
##  $ Sweet.white: int  85 89 109 95 91 95 96 128 124 111 ...
##  $ Dry.white  : int  1954 2302 3054 2414 2226 2725 2589 3470 2400 3180 ...
head(AusWinSales)
##    Month Fortified  Red Rose sparkling Sweet.white Dry.white
## 1 Jan-80      2585  464  112      1686          85      1954
## 2 Feb-80      3368  675  118      1591          89      2302
## 3 Mar-80      3210  703  129      2304         109      3054
## 4 Apr-80      3111  887   99      1712          95      2414
## 5 May-80      3756 1139  116      1471          91      2226
## 6 Jun-80      4216 1077  168      1377          95      2725
tail(AusWinSales)
##      Month Fortified  Red Rose sparkling Sweet.white Dry.white
## 175 Jul-94      2714 3670    *      2031         225      3905
## 176 Aug-94      2294 2665    *      1495         205      3670
## 177 Sep-94      2416 2639   46      2968         259      4221
## 178 Oct-94      2016 2226   51      3385         254      4404
## 179 Nov-94      2799 2586   63      3729         275      5086
## 180 Dec-94      2467 2684   84      5999         394      5725
# Create a time series object out of it
AusWinSalesTS <- ts(AusWinSales$Fortified,start=c(1980,1),frequency=12)
yrange = range(AusWinSalesTS)
# Set up the plot
plot(c(1980,1995),yrange,type="n",xlab="Year",ylab="Fortified Wine Sales (thousands of liters)",bty="l",xaxt="n",yaxt="n")
# Add the time series sales
lines(AusWinSalesTS,bty="l",lwd=2)
# Add the x-axis
axis(1,at=seq(1980,1995,1),labels=format(seq(1980,1995,1)))
# Add the y-axis
axis(2,at=seq(1000,6000,500),labels=format(seq(1000,6000,500)),las=2)

First, we need to partition the data into training and validation sets. Then we can fit a neural network to the training set. We can then use that fitted model to forecast for the validation period.The duration set is 12 months and the training set 14 years

# Set the validation and training period lengths
validLength <- 12
trainLength <- length(AusWinSalesTS) - validLength
# Partition the time series into training and validation periods.
AusWinSalesTrain <- window(AusWinSalesTS, end=c(1980,trainLength))
AusWinSalesValid <- window(AusWinSalesTS, start=c(1980,trainLength+1))

\(\textit{neural network using R’s 𝚗𝚗𝚎𝚝𝚊𝚛nnetar with 11 non-seasonl lags}\)

#set random number generator seed same as the R code in the textbook page 200 to get the same results if we run it again
set.seed(8373493)
# Use nnetar to fit the neural network.
# using the parameter p=11   
#This parameter will include 11 non-seasonal lags
# By default it will use P=1 (1 seasonal lag)
AusWinSalesNN <- nnetar(AusWinSalesTrain,p=11)
# A neural network with 11 non-seasonal lags is obtained to be
AusWinSalesNN
## Series: AusWinSalesTrain 
## Model:  NNAR(11,1,6)[12] 
## Call:   nnetar(y = AusWinSalesTrain, p = 11)
## 
## Average of 20 networks, each of which is
## a 12-6-1 network with 85 weights
## options were - linear output units 
## 
## sigma^2 estimated as 5734
# the weights for the first of the 20 neural networks
# By default repeats=20 in the call above
# Changing the 1 below to 2 would give the weights of the 2nd neural network 
summary(AusWinSalesNN$model[[1]])
## a 12-6-1 network with 85 weights
## options were - linear output units 
##   b->h1  i1->h1  i2->h1  i3->h1  i4->h1  i5->h1  i6->h1  i7->h1  i8->h1 
##   -0.14    1.60   -1.88    0.65    4.43   -2.46   -0.67    0.05   -3.61 
##  i9->h1 i10->h1 i11->h1 i12->h1 
##   -2.41    1.52    2.98    1.65 
##   b->h2  i1->h2  i2->h2  i3->h2  i4->h2  i5->h2  i6->h2  i7->h2  i8->h2 
##    0.44   -0.83    1.07    0.71    0.73    0.00    0.66    0.38   -1.07 
##  i9->h2 i10->h2 i11->h2 i12->h2 
##    0.09    0.00    0.09   -1.09 
##   b->h3  i1->h3  i2->h3  i3->h3  i4->h3  i5->h3  i6->h3  i7->h3  i8->h3 
##    1.21    3.92   -0.71    1.73   -2.67    1.49    0.36   -0.03    3.92 
##  i9->h3 i10->h3 i11->h3 i12->h3 
##    0.30   -3.39    1.96   -5.85 
##   b->h4  i1->h4  i2->h4  i3->h4  i4->h4  i5->h4  i6->h4  i7->h4  i8->h4 
##   -2.17    3.83    1.14    0.28   -1.49   -0.99    3.75    2.98   -4.88 
##  i9->h4 i10->h4 i11->h4 i12->h4 
##   -0.52   -1.34    0.72   -1.59 
##   b->h5  i1->h5  i2->h5  i3->h5  i4->h5  i5->h5  i6->h5  i7->h5  i8->h5 
##   -0.89    1.48   -1.48   -0.52   -1.33    0.44   -0.22   -0.45    0.73 
##  i9->h5 i10->h5 i11->h5 i12->h5 
##   -0.91    0.29    0.24   -0.83 
##   b->h6  i1->h6  i2->h6  i3->h6  i4->h6  i5->h6  i6->h6  i7->h6  i8->h6 
##   -0.65   -1.81   -0.12    0.84    0.43   -0.82   -2.19   -1.92    3.76 
##  i9->h6 i10->h6 i11->h6 i12->h6 
##   -0.03   -0.85    0.71   -0.97 
##  b->o h1->o h2->o h3->o h4->o h5->o h6->o 
##  1.98  1.59 -2.75  1.39 -1.27 -1.82 -2.04
#without the set.seed()
AusWinSalesNN <- nnetar(AusWinSalesTrain,p=11)
# A neural network with 11 non-seasonal lags is obtained to be
AusWinSalesNN
## Series: AusWinSalesTrain 
## Model:  NNAR(11,1,6)[12] 
## Call:   nnetar(y = AusWinSalesTrain, p = 11)
## 
## Average of 20 networks, each of which is
## a 12-6-1 network with 85 weights
## options were - linear output units 
## 
## sigma^2 estimated as 6002

sigmea estimated different

\(\textit{a. Create a time plot for the actual and forecasted series over the training period.}\) \(\textit{Create also a time plot of the forecast errors for the training period. Interpret what you see in the plots.}\)

=======

\(\textit{Graphically Compare the Two Methods}\)

#Make predictions for the next 12 months
AusWinSalesPred <- forecast(AusWinSalesNN,h=validLength)
#Set up the plot
plot(c(1980,1994),yrange,type="n",xlab="Year",ylab="Fortified Wine Sales (thousands of liters)",bty="l",xaxt="n",yaxt="n")

#Add the time series sales
lines(AusWinSalesTrain,bty="l",lwd=2)
#Add the forecasts from the neural network
lines(AusWinSalesPred$fitted,col="Red",lty=2,lwd=3)
#Add the x-axis
axis(1,at=seq(1980,1994,1),labels=format(seq(1980,1994,1)))
#Add the y-axis
axis(2,at=seq(1000,6000,500),labels=format(seq(1000,6000,500)),las=2)
#Add the legend
legend(1989,6000,c("Actuals","Neural Network"),lty=c(1,2),col=c("black","Red"),lwd=c(1,2),bty="n")

#The residual plot over the training period
plot(AusWinSalesPred$residuals,bty="l",ylab="Residuals",lwd=2)

\(\textit{The neural network forecasts fit the training period very well.}\)

=================

\(\textit{b. Use the neural network to forecast sales for each month in the validation period (January 1994 to December 1994).}\)

\(\textit{The neural network forecasts for the validation period.}\)

AusWinSalesPred$mean
##           Jan      Feb      Mar      Apr      May      Jun      Jul
## 1994 1528.648 1290.953 1712.994 2097.373 2228.495 2338.716 2530.600
##           Aug      Sep      Oct      Nov      Dec
## 1994 2491.237 2140.755 1711.550 2131.377 2455.862
# Generate the accuracy metrics
accuracy(AusWinSalesPred,AusWinSalesValid)
##                       ME      RMSE       MAE        MPE      MAPE
## Training set   0.1229099  77.46941  60.77207 -0.2297735  2.203368
## Test set     194.9531926 336.99451 290.26738  6.9342431 13.778098
##                   MASE         ACF1 Theil's U
## Training set 0.2184031 -0.007469794        NA
## Test set     1.0431651 -0.115409002 0.7651973
\(\textit{We see that the accuracy metrics MAPE is higher on both the training set and the validation set 2.1% and for validation is 10.9%}\)

==========================================================================================================

\(\textit{Compare your neural network to an exponential smoothing model used to forecast fortified wine sales.}\)

\(\textit{Use R’s 𝚎𝚝𝚜ets function to automatically select and fit an exponential smoothing model to the training period until}\) \(\textit{December 1993. Which model did 𝚎𝚝𝚜ets fit?}\)

AusWinSalesETS <- ets(AusWinSalesTrain,model="ZZZ",restrict=FALSE)
AusWinSalesETS
## ETS(M,A,M) 
## 
## Call:
##  ets(y = AusWinSalesTrain, model = "ZZZ", restrict = FALSE) 
## 
##   Smoothing parameters:
##     alpha = 0.0555 
##     beta  = 9e-04 
##     gamma = 1e-04 
## 
##   Initial states:
##     l = 4040.0811 
##     b = -6.7983 
##     s=1.1316 1.0399 0.8877 0.9505 1.2722 1.3862
##            1.1463 1.1097 0.9345 0.8513 0.6996 0.5903
## 
##   sigma:  0.0859
## 
##      AIC     AICc      BIC 
## 2755.038 2759.118 2808.145

\(\textit{We see that it found an ETS(M,A,M) model as the best fit.It is a multiplicative Holt-Winters’ method with multiplicative errors and additive trend}\)

======

\(\textit{b. Use this exponential smoothing model to forecast sales for each month in 1994.}\) \(\textit{The exponential smoothing model to forecast sales for each month in 1994.}\)

# Generate the forecasts
AusWinSalesForecast <- forecast(AusWinSalesETS,h=validLength)
AusWinSalesForecast
##          Point Forecast    Lo 80    Hi 80    Lo 95    Hi 95
## Jan 1994       1289.829 1147.913 1431.745 1072.788 1506.871
## Feb 1994       1521.475 1353.802 1689.148 1265.041 1777.909
## Mar 1994       1842.645 1639.237 2046.054 1531.559 2153.732
## Apr 1994       2013.011 1790.409 2235.614 1672.571 2353.452
## May 1994       2379.117 2115.554 2642.679 1976.033 2782.201
## Jun 1994       2445.906 2174.435 2717.376 2030.728 2861.083
## Jul 1994       2943.532 2616.195 3270.870 2442.913 3444.151
## Aug 1994       2688.471 2388.895 2988.047 2230.309 3146.633
## Sep 1994       1998.782 1775.592 2221.971 1657.443 2340.120
## Oct 1994       1857.773 1649.880 2065.666 1539.829 2175.717
## Nov 1994       2165.635 1922.749 2408.521 1794.173 2537.097
## Dec 1994       2344.995 2081.384 2608.606 1941.836 2748.153

\(\textit{c. How does neural network compare to the exponential smoothing model in terms of predictive }\) \(\textit{performance in the training period? In the validation period?}\)

# Accuracy metrics
accuracy(AusWinSalesForecast,AusWinSalesValid)
##                     ME     RMSE      MAE       MPE      MAPE      MASE
## Training set -25.32466 287.8687 224.6507 -1.317643  7.229271 0.8073515
## Test set     125.56906 328.9246 256.3940  4.443793 10.858860 0.9214307
##                     ACF1 Theil's U
## Training set  0.05168201        NA
## Test set     -0.01105575 0.7140459

If we take a look at the graph we can tell that the neural network forecasts for the training set outperform the exponential smoothing forecasts. the output of the neural network shows that the RMSE is 75.71998 ,compared to exponential smoothing RMSE of 287.8687, and a MAPE of 2.097739 compared to exponential smoothing MAPE of 7.229271. both methods the MAPE for the neural network and exponential smoothing to be nearly identical.

plot(c(1994,1995),c(0,3500),type="n",xlab="The Year 1994",ylab="Fortified Wine Sales (thousands of liters)",bty="l",xaxt="n",yaxt="n")
lines(AusWinSalesValid,bty="l",lwd=2)
lines(AusWinSalesPred$mean,col="blue",lwd=3)
lines(AusWinSalesForecast$mean,col="red",lty=2,lwd=2)
axis(1, at=seq(1994,1995,1/11), labels=c("Jan","Feb","Mar","Apr","May","Jun","Jul", "Aug","Sep","Oct","Nov","Dec"))
axis(2,at=seq(0,3500,500),labels=format(seq(0,3500,500)),las=2)
legend(1994.5,1500,c("Actuals","Neural Network","Exp Smoothing"),lty=c(1,1,2),col=c("black","blue","red"),lwd=c(1,2,1),bty="n")

From both the accuracy measure and the graphical method we can obtain some results .in accuracy measures we can see that there are some differents values for RMSE and MAPE, but it is not a big different those value with Graphically Compare the Two Methods both the Neural Network and expo Smoothing they look similar.