Business Problem

Quarterly beer sales data has been provided in the beer.csv file. Our aim is to model the data using the Holt-Winters method and predict sales for the next two years. We also need to model the data using the ARIMA method and predict for the next two years.

Load required libraries

library('ggplot2') # visualization
library('ggthemes') # visualization
library('scales') # visualization
library('forecast')
library('TSA')
library('tseries')
library('caret')
library('TeachingDemos')

Load the file for analysis and perform basic data type checks

beer_data <- read.csv('beer.csv')
str(beer_data)
## 'data.frame':    72 obs. of  1 variable:
##  $ OzBeer: num  284 213 227 308 262 ...
nrow(beer_data)
## [1] 72

Convert the data into a time series

The dataset does not have any column giving the year and quarter of data collection. A starting year of 2000 has been assumed, as this information was not shared as part of the group assignment.

Time Series Analysis

Time series analysis has the following key steps:

- Visualize the time series
- Stationarize the time series by extracting cycles and trend
- Plot the ACF/PACF charts and find the optimal parameters
- Build the ARIMA/Holt-Winters models
- Make predictions

Visualize the time series

beer_timeseries <- ts(beer_data, frequency = 4, start = c(2000,1))
beer_timeseries
##       Qtr1  Qtr2  Qtr3  Qtr4
## 2000 284.4 212.8 226.9 308.4
## 2001 262.0 227.9 236.1 320.4
## 2002 271.9 232.8 237.0 313.4
## 2003 261.4 226.8 249.9 314.3
## 2004 286.1 226.5 260.4 311.4
## 2005 294.7 232.6 257.2 339.2
## 2006 279.1 249.8 269.8 345.7
## 2007 293.8 254.7 277.5 363.4
## 2008 313.4 272.8 300.1 369.5
## 2009 330.8 287.8 305.9 386.1
## 2010 335.2 288.0 308.3 402.3
## 2011 352.8 316.1 324.9 404.8
## 2012 393.0 318.9 327.0 442.3
## 2013 383.1 331.6 361.4 445.9
## 2014 386.6 357.2 373.6 466.2
## 2015 409.6 369.8 378.6 487.0
## 2016 419.2 376.7 392.8 506.1
## 2017 458.4 387.4 426.9 525.0
summary(beer_timeseries)
##      OzBeer     
##  Min.   :212.8  
##  1st Qu.:272.6  
##  Median :317.5  
##  Mean   :329.9  
##  3rd Qu.:379.7  
##  Max.   :525.0
plot.ts(beer_timeseries, col = "blue", main = "Beer sales time series data between 2000 and 2017")

### Plotting of time series data

The data is then plotted as a time series with an abline to check whether it has any pattern. The abline is a simple linear regression line. Most time series patterns can be described in terms of two basic classes of components: trend and seasonality. The former represents a general systematic component that changes over time and does not repeat; the latter may have a formally similar nature, but it repeats itself in systematic intervals over time.

Key observations:

- The year-on-year trend clearly shows that beer sales have been increasing consistently.
- There is an increasing trend in beer sales.
- A cyclical effect appears to be present in beer sales.
- The yearly aggregated beer sales also confirm the increasing trend.

plot.ts(beer_timeseries, col = "blue", main = "Beer sales time series data between 2000 and 2017")
abline(reg=lm(beer_timeseries ~ time(beer_timeseries)),col = "lightgray")## plotting the trend line of linear regression

cycle(beer_timeseries)
##      Qtr1 Qtr2 Qtr3 Qtr4
## 2000    1    2    3    4
## 2001    1    2    3    4
## 2002    1    2    3    4
## 2003    1    2    3    4
## 2004    1    2    3    4
## 2005    1    2    3    4
## 2006    1    2    3    4
## 2007    1    2    3    4
## 2008    1    2    3    4
## 2009    1    2    3    4
## 2010    1    2    3    4
## 2011    1    2    3    4
## 2012    1    2    3    4
## 2013    1    2    3    4
## 2014    1    2    3    4
## 2015    1    2    3    4
## 2016    1    2    3    4
## 2017    1    2    3    4
Annual_beerSales = aggregate(beer_timeseries)
Annual_beerSales
## Time Series:
## Start = 2000 
## End = 2017 
## Frequency = 1 
##       OzBeer
##  [1,] 1032.5
##  [2,] 1046.4
##  [3,] 1055.1
##  [4,] 1052.4
##  [5,] 1084.4
##  [6,] 1123.7
##  [7,] 1144.4
##  [8,] 1189.4
##  [9,] 1255.8
## [10,] 1310.6
## [11,] 1333.8
## [12,] 1398.6
## [13,] 1481.2
## [14,] 1522.0
## [15,] 1583.6
## [16,] 1645.0
## [17,] 1694.8
## [18,] 1797.7
plot.ts(Annual_beerSales, col = "blue", main = "Yearly Beer sales time series data between 2000 and 2017")

A box plot across quarters gives us a sense of the seasonal effect. Key observations:

- The variance and the mean value in Q1 and Q4 are much higher than in the other quarters.
- Q2 sales appear lower compared to the other quarters.

boxplot(beer_timeseries~cycle(beer_timeseries))

We should also confirm the periodicity. It is vital to use a moving average of the same size as the seasonality. The following commands help us detect the seasonality. The output shows that the second-highest peak (105280.97) occurs at frequency 0.25, so the period here is 1/0.25 = 4.

beer_p= periodogram(beer_data)

freq_data <- beer_p$freq
spec_data <- beer_p$spec
beer_frequency_data <- data.frame(spec = spec_data, freq = freq_data)
beer_frequency_data[order(beer_frequency_data$spec,decreasing = TRUE),]
##            spec       freq
## 1  175272.30161 0.01388889
## 18 105280.96889 0.25000000
## 2   34268.64255 0.02777778
## 3   14843.72910 0.04166667
## 4    9816.18095 0.05555556
## 36   6647.04500 0.50000000
## 6    5389.78598 0.08333333
## 5    5274.61065 0.06944444
## 7    4352.18762 0.09722222
## 8    3307.81504 0.11111111
## 9    2256.01035 0.12500000
## 19   1765.39101 0.26388889
## 13   1264.46901 0.18055556
## 10   1135.73976 0.13888889
## 11   1082.72208 0.15277778
## 16   1081.98783 0.22222222
## 22   1027.41668 0.30555556
## 23    980.79809 0.31944444
## 25    910.48728 0.34722222
## 12    816.53250 0.16666667
## 35    778.78799 0.48611111
## 20    745.87478 0.27777778
## 21    733.89045 0.29166667
## 26    688.64887 0.36111111
## 34    492.15126 0.47222222
## 29    469.66728 0.40277778
## 15    467.48768 0.20833333
## 14    396.35255 0.19444444
## 31    340.26193 0.43055556
## 27    270.47965 0.37500000
## 17    251.69544 0.23611111
## 30    244.71847 0.41666667
## 28    206.40677 0.38888889
## 33    161.00277 0.45833333
## 24    140.17528 0.33333333
## 32     67.42963 0.44444444
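As a quick cross-check on the periodogram reading, the forecast package's findfrequency() estimates the dominant period directly from the spectral density (a minimal sketch, not part of the original workflow; it should agree with the value of 4 derived above):

beer_detected_period <- findfrequency(beer_data$OzBeer)  # estimates the dominant period from the spectral density
beer_detected_period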

Stationarize the Series

The interaction between trend and seasonality is typically classified as either additive or multiplicative. In an additive time series, the components add together to make the time series: if we have an increasing trend, we still see roughly the same size peaks and troughs throughout the series. In a multiplicative time series, the amplitude of the seasonal activity increases with the trend; everything becomes more exaggerated, and the series is then modelled as multiplicative.

Note that the peak height of the seasonal component in the graph is almost constant, and hence our assumption of an additive model is validated. Additive: x_t = Trend + Seasonal + Random.
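As an optional sanity check (a sketch, not part of the original workflow): if the series were multiplicative, decomposing the log-transformed series, which turns multiplication into addition, would stabilise the seasonal amplitude noticeably compared to the raw series.

plot(decompose(log(beer_timeseries), type = "additive"))  # seasonal amplitude of the log series; compare with the raw decomposition below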

decompose_beer_data <- decompose(beer_timeseries,type = "additive")
seasonal_beer <- as.ts(decompose_beer_data$seasonal)
trend_beer <- as.ts(decompose_beer_data$trend)
random_beer <- as.ts(decompose_beer_data$random)
plot.ts(seasonal_beer, main = "Seasonal Component")

plot.ts(trend_beer, main = "Trend Component")

plot.ts(random_beer, main = "Random Component")

plot.ts(beer_timeseries,main = "Beer sales time series data between 2000 and 2017")

#### Interpreting the decomposed time series data

The plot shows the observed series, the smoothed trend line, the seasonal pattern and the random part of the series. Note that the seasonal pattern is a regularly repeating pattern. The numerical output is shown below. The elements of $figure are the seasonal effects for the four quarters. Note that the seasonal effect values are repeated each year (row) in the $seasonal object at the bottom of this output. The largest seasonal factor is for Q4 and the smallest is for Q2, indicating that there is a peak in demand in Q4 and a trough in Q2 each year.

plot (decompose(beer_timeseries, type="additive"))

decompose_beer_data
## $x
##       Qtr1  Qtr2  Qtr3  Qtr4
## 2000 284.4 212.8 226.9 308.4
## 2001 262.0 227.9 236.1 320.4
## 2002 271.9 232.8 237.0 313.4
## 2003 261.4 226.8 249.9 314.3
## 2004 286.1 226.5 260.4 311.4
## 2005 294.7 232.6 257.2 339.2
## 2006 279.1 249.8 269.8 345.7
## 2007 293.8 254.7 277.5 363.4
## 2008 313.4 272.8 300.1 369.5
## 2009 330.8 287.8 305.9 386.1
## 2010 335.2 288.0 308.3 402.3
## 2011 352.8 316.1 324.9 404.8
## 2012 393.0 318.9 327.0 442.3
## 2013 383.1 331.6 361.4 445.9
## 2014 386.6 357.2 373.6 466.2
## 2015 409.6 369.8 378.6 487.0
## 2016 419.2 376.7 392.8 506.1
## 2017 458.4 387.4 426.9 525.0
## 
## $seasonal
##            Qtr1       Qtr2       Qtr3       Qtr4
## 2000   7.896324 -40.678676 -24.650735  57.433088
## 2001   7.896324 -40.678676 -24.650735  57.433088
## 2002   7.896324 -40.678676 -24.650735  57.433088
## 2003   7.896324 -40.678676 -24.650735  57.433088
## 2004   7.896324 -40.678676 -24.650735  57.433088
## 2005   7.896324 -40.678676 -24.650735  57.433088
## 2006   7.896324 -40.678676 -24.650735  57.433088
## 2007   7.896324 -40.678676 -24.650735  57.433088
## 2008   7.896324 -40.678676 -24.650735  57.433088
## 2009   7.896324 -40.678676 -24.650735  57.433088
## 2010   7.896324 -40.678676 -24.650735  57.433088
## 2011   7.896324 -40.678676 -24.650735  57.433088
## 2012   7.896324 -40.678676 -24.650735  57.433088
## 2013   7.896324 -40.678676 -24.650735  57.433088
## 2014   7.896324 -40.678676 -24.650735  57.433088
## 2015   7.896324 -40.678676 -24.650735  57.433088
## 2016   7.896324 -40.678676 -24.650735  57.433088
## 2017   7.896324 -40.678676 -24.650735  57.433088
## 
## $trend
##          Qtr1     Qtr2     Qtr3     Qtr4
## 2000       NA       NA 255.3250 254.4125
## 2001 257.4500 260.1000 262.8375 264.6875
## 2002 265.4125 264.6500 262.4625 260.4000
## 2003 261.2625 262.9875 266.1875 269.2375
## 2004 270.5125 271.4625 272.1750 274.0125
## 2005 274.3750 277.4500 278.9750 279.1750
## 2006 282.9000 285.2875 287.9375 290.3875
## 2007 291.9625 295.1375 299.8000 304.5125
## 2008 309.6000 313.1875 316.1250 320.1750
## 2009 322.7750 325.5750 328.2000 328.7750
## 2010 329.1000 331.4250 335.6500 341.3625
## 2011 346.9500 349.3375 354.6750 360.0500
## 2012 360.6625 365.6125 369.0625 369.4125
## 2013 375.3000 380.0500 380.9375 384.5750
## 2014 389.3000 393.3625 398.7750 403.2250
## 2015 405.4250 408.6500 412.4500 414.5125
## 2016 417.1500 421.3125 428.6000 434.8375
## 2017 440.4375 447.0625       NA       NA
## 
## $random
##              Qtr1         Qtr2         Qtr3         Qtr4
## 2000           NA           NA  -3.77426471  -3.44558824
## 2001  -3.34632353   8.47867647  -2.08676471  -1.72058824
## 2002  -1.40882353   8.82867647  -0.81176471  -4.43308824
## 2003  -7.75882353   4.49117647   8.36323529 -12.37058824
## 2004   7.69117647  -4.28382353  12.87573529 -20.04558824
## 2005  12.42867647  -4.17132353   2.87573529   2.59191176
## 2006 -11.69632353   5.19117647   6.51323529  -2.12058824
## 2007  -6.05882353   0.24117647   2.35073529   1.45441176
## 2008  -4.09632353   0.29117647   8.62573529  -8.10808824
## 2009   0.12867647   2.90367647   2.35073529  -0.10808824
## 2010  -1.79632353  -2.74632353  -2.69926471   3.50441176
## 2011  -2.04632353   7.44117647  -5.12426471 -12.68308824
## 2012  24.44117647  -6.03382353 -17.41176471  15.45441176
## 2013  -0.09632353  -7.77132353   5.11323529   3.89191176
## 2014 -10.59632353   4.51617647  -0.52426471   5.54191176
## 2015  -3.72132353   1.82867647  -9.19926471  15.05441176
## 2016  -5.84632353  -3.93382353 -11.14926471  13.82941176
## 2017  10.06617647 -18.98382353           NA           NA
## 
## $figure
## [1]   7.896324 -40.678676 -24.650735  57.433088
## 
## $type
## [1] "additive"
## 
## attr(,"class")
## [1] "decomposed.ts"

Seasonal adjustment

We can seasonally adjust the time series by estimating the seasonal component, and subtracting the estimated seasonal component from the original time series. We can see that the seasonal variation has been removed from the seasonally adjusted time series. The seasonally adjusted time series now just contains the trend component and an irregular component. Data has become much smoother after removing the seasonality.

beer_timeseries_seasonallyadjusted <- beer_timeseries - seasonal_beer
plot.ts(beer_timeseries_seasonallyadjusted, main = "Seasonal Adjusted Plot of Beer sales")

### Forecasting with the Holt-Winters method

The additive Holt-Winters method can be applied to forecast future sales, since the magnitude of the seasonal swings stays almost constant as the level of the time series increases. The Holt-Winters seasonal method comprises the forecast equation and three smoothing equations: one for the level, one for the trend, and one for the seasonal component, with smoothing parameters alpha, beta and gamma. Here we plot using the standard prediction method. The residual standard deviation (sigma) is 9.738; this is the error term that the model could not fit. The initial seasonal states are 6.3335, -37.4121, -23.9983 and 55.077 for Q1, Q2, Q3 and Q4 respectively (they are printed most-recent-first in the output), consistent with the Q4 peak and Q2 trough seen in the decomposition.
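For reference, the standard additive Holt-Winters equations (with seasonal period m = 4 here, and k the integer part of (h-1)/m) are:

- Level: l_t = alpha*(y_t - s_(t-m)) + (1 - alpha)*(l_(t-1) + b_(t-1))
- Trend: b_t = beta*(l_t - l_(t-1)) + (1 - beta)*b_(t-1)
- Seasonal: s_t = gamma*(y_t - l_(t-1) - b_(t-1)) + (1 - gamma)*s_(t-m)
- Forecast: yhat_(t+h) = l_t + h*b_t + s_(t+h-m(k+1))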

The output tells us that the estimated value of the alpha parameter is about 0.0395, which is very close to zero. This means that the estimate of the level places little weight on the most recent observations and is based largely on the longer history of the series.
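To make the role of alpha concrete, here is a sketch (not part of the original analysis) that refits the same additive model with a deliberately large, arbitrary alpha of 0.8, so the level tracks recent observations much more closely:

beer_hw_fast <- hw(beer_timeseries, seasonal = "additive", h = 8, alpha = 0.8)  # alpha fixed at 0.8 (illustrative value): level reacts strongly to recent data
accuracy(beer_hw_fast)  # compare the training-set errors with the fitted alpha = 0.0395 model below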

The plot of the Holt-Winters prediction has shaded and blue portions: the dark gray band is the 80% confidence interval, the light gray band is the 95% confidence interval, and the blue line shows the point forecasts.

To test the accuracy, we can calculate the SSE for the in-sample forecast errors (see the sketch after the accuracy output below).

beer_timeseries_hw <- hw(beer_timeseries,seasonal="additive", h=8)
beer_timeseries_hw$model
## Holt-Winters' additive method 
## 
## Call:
##  hw(y = beer_timeseries, h = 8, seasonal = "additive") 
## 
##   Smoothing parameters:
##     alpha = 0.0395 
##     beta  = 0.0395 
##     gamma = 0.1854 
## 
##   Initial states:
##     l = 258.2137 
##     b = 0.0239 
##     s=55.077 -23.9983 -37.4121 6.3335
## 
##   sigma:  9.738
## 
##      AIC     AICc      BIC 
## 653.6687 656.5719 674.1587
plot(beer_timeseries_hw$model)

plot(beer_timeseries_hw)

accuracy(beer_timeseries_hw)
##                    ME     RMSE      MAE       MPE     MAPE      MASE
## Training set 1.776628 9.737973 8.174546 0.4602816 2.523016 0.6036806
##                    ACF1
## Training set -0.1821053
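As mentioned above, the SSE of the in-sample forecast errors can be computed directly from the residuals (a minimal sketch; by definition it equals n times the squared RMSE reported by accuracy()):

sse_hw <- sum(beer_timeseries_hw$residuals^2)  # sum of squared one-step in-sample forecast errors
sse_hw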

Analysis of Residuals

It is quite clear that the mean of the residuals is approximately zero and that the residuals fluctuate around this zero mean.

To test the validity of this model we should also examine the correlations between the forecast errors. If correlation exists in the error terms, it is likely that the exponential smoothing forecasts could be improved upon by another technique. All the ACF spikes are well within the blue dashed significance bounds; a formal Ljung-Box check is sketched after the ACF plot below.

plot(beer_timeseries_hw$residuals)
abline(0, 0) 

acf(beer_timeseries_hw$residuals)
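As a formal complement to the visual ACF check (a sketch, not part of the original output; lag = 8 is an arbitrary choice of two seasonal cycles), the Ljung-Box test can be applied to the Holt-Winters residuals; a large p-value indicates no significant remaining autocorrelation:

Box.test(beer_timeseries_hw$residuals, lag = 8, type = "Ljung-Box")  # H0: no autocorrelation in the forecast errors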

The predict method is used to predict values for the next two years (eight quarters). Note that the hw method already produces the forecast, so it is not strictly necessary to call the predict or forecast method for future value prediction.

beer_timeseries_predict <- predict(beer_timeseries_hw, n.ahead = 8, prediction.interval = TRUE)
beer_timeseries_predict$model
## Holt-Winters' additive method 
## 
## Call:
##  hw(y = beer_timeseries, h = 8, seasonal = "additive") 
## 
##   Smoothing parameters:
##     alpha = 0.0395 
##     beta  = 0.0395 
##     gamma = 0.1854 
## 
##   Initial states:
##     l = 258.2137 
##     b = 0.0239 
##     s=55.077 -23.9983 -37.4121 6.3335
## 
##   sigma:  9.738
## 
##      AIC     AICc      BIC 
## 653.6687 656.5719 674.1587
plot(beer_timeseries_predict$model)

Below, the forecast method is used to produce the same predictions. Again, the hw method already does the forecasting, so calling predict or forecast is not strictly necessary.

beer_timeseries_predict_fc <- forecast(beer_timeseries_hw,h=8)
beer_timeseries_predict_fc$model
## Holt-Winters' additive method 
## 
## Call:
##  hw(y = beer_timeseries, h = 8, seasonal = "additive") 
## 
##   Smoothing parameters:
##     alpha = 0.0395 
##     beta  = 0.0395 
##     gamma = 0.1854 
## 
##   Initial states:
##     l = 258.2137 
##     b = 0.0239 
##     s=55.077 -23.9983 -37.4121 6.3335
## 
##   sigma:  9.738
## 
##      AIC     AICc      BIC 
## 653.6687 656.5719 674.1587

We will compare the predictions generated by the predict and forecast methods.

par(mfrow=c(2,2))
plot.ts(beer_timeseries_predict, main = "Forecasted sales by predict method", col = "red")
plot.ts(beer_timeseries_predict_fc, main = "Forecasted sales by forecast method", col = "green")

Using ARIMA Model for prediction

Test for stationarity of the time series

The KPSS test figures out whether a time series is stationary around a mean or linear trend, or is non-stationary due to a unit root. A stationary time series is one where statistical properties, like the mean and variance, are constant over time. The null hypothesis for the test is that the data is stationary and the alternative hypothesis is that it is not. The derived p-value is 0.01, which is less than 0.05 (at the 95% confidence level), so the null hypothesis is rejected: the series is non-stationary. A major disadvantage of the KPSS test is that it suffers from a high type-1 error rate, hence it is recommended to always validate the result with adf.test.

ADF test: the null hypothesis is that the series has a unit root (i.e. is non-stationary) and the alternative is that it is stationary. Here the p-value is 0.9633 > 0.05, so we fail to reject the null hypothesis: the series is non-stationary, which is consistent with the KPSS result.

kpss.test(beer_timeseries)
## Warning in kpss.test(beer_timeseries): p-value smaller than printed p-value
## 
##  KPSS Test for Level Stationarity
## 
## data:  beer_timeseries
## KPSS Level = 3.0456, Truncation lag parameter = 1, p-value = 0.01
adf.test(beer_timeseries)
## 
##  Augmented Dickey-Fuller Test
## 
## data:  beer_timeseries
## Dickey-Fuller = -0.73068, Lag order = 4, p-value = 0.9633
## alternative hypothesis: stationary

We need to estimate the number of seasonal differences required to make the time series stationary, and then apply the differencing. Here the lag is set to the frequency so that corresponding seasonal values are differenced, with a difference order of 1.

beer_data_nsdiff <- nsdiffs(beer_timeseries)  # estimate the number of seasonal differences required to make the series stationary
beer_data_differ <- diff(beer_timeseries, lag = frequency(beer_timeseries),differences = beer_data_nsdiff)
kpss.test(beer_data_differ)  ## check whether the differenced series is stationary: the p-value is < 0.05, so it is still non-stationary
## Warning in kpss.test(beer_data_differ): p-value smaller than printed p-
## value
## 
##  KPSS Test for Level Stationarity
## 
## data:  beer_data_differ
## KPSS Level = 1.3005, Truncation lag parameter = 1, p-value = 0.01
plot.ts(beer_data_differ, col="blue", main = "After seasonal differencing")
abline(reg=lm(beer_data_differ~time(beer_data_differ)),col="red")

Next we estimate the number of further differences needed to make the time series stationary; this value is 1. After this differencing, the KPSS test returns a p-value of 0.1 (i.e. greater than the printed value), which is > 0.05, so we fail to reject the null hypothesis of stationarity. The fitted trend line is now flat, with almost no trend remaining, meaning the trend has been eliminated by the differencing.

beer_data_differ_nd <- ndiffs(beer_data_differ)
beer_data_differ_final <- diff(beer_data_differ, lag = frequency(beer_data_differ),differences = beer_data_differ_nd)
kpss.test(beer_data_differ_final)
## Warning in kpss.test(beer_data_differ_final): p-value greater than printed
## p-value
## 
##  KPSS Test for Level Stationarity
## 
## data:  beer_data_differ_final
## KPSS Level = 0.038513, Truncation lag parameter = 1, p-value = 0.1
plot.ts(beer_data_differ_final, col="blue", main = "After Time Series Stationary")
abline(reg=lm(beer_data_differ_final~time(beer_data_differ_final)),col="red")

### ARIMA model coefficient calculation

The ARIMA model has an autoregressive coefficient (p), which is obtained from the PACF, and an MA process coefficient (q), which is obtained from the ACF.

ACF chart: if the autocorrelation crosses the dashed blue line, that specific lag is significantly related to the current series. We take q as 1, because after lag 1 the autocorrelation falls below the dashed line. As expected, the first lag has the highest correlation.

PACF chart: this is used to find the value of p for the AR(p) part. Here the last significant lag is 2, and hence p is taken as 2.

So our candidate parameters are p = 2 and q = 1. We fit candidate models and compare their AIC values (see the manual fit sketched after the PACF plot below).

beer_acf <- acf(beer_data_differ_final)  ## lag 1 has the highest correlation; as the lag increases, the correlation reduces

plot(beer_acf, main = "ACF Plot")

beer_pacf <- pacf(beer_data_differ_final)  ## partial autocorrelation of X(t) with its lags

plot(beer_pacf, main = "PACF Plot")
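Before handing the search over to auto.arima, the ACF/PACF-suggested candidate can be fitted manually and its AIC inspected (a sketch mirroring what auto.arima automates below; the seasonal (0,1,1)[4] part reflects the one seasonal difference taken earlier):

beerfit_manual <- arima(beer_timeseries, c(2, 1, 1), seasonal = list(order = c(0, 1, 1), period = 4))  # candidate from the ACF/PACF reading
beerfit_manual$aic  # compare against the AIC values traced by auto.arima below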

### ARIMA Modelling with selected coefficient

We will use auto.arima, which iteratively searches for the optimal values of p, d and q. The best model is selected based on AIC; here the optimal model is ARIMA(2,1,1)(0,1,1)[4].

beerfit_m <- auto.arima(beer_timeseries,p=0, d=1, max.d =2, max.p=2, max.q= 2, max.P=2, max.Q=2, max.D=2, start.p =0, start.q=0, start.P=0, start.Q=0, stepwise=TRUE, trace=TRUE)
## 
##  ARIMA(0,1,0)(0,1,0)[4]                    : 577.6158
##  ARIMA(0,1,0)(0,1,0)[4]                    : 577.6158
##  ARIMA(1,1,0)(1,1,0)[4]                    : 541.6877
##  ARIMA(0,1,1)(0,1,1)[4]                    : 518.3907
##  ARIMA(0,1,1)(1,1,1)[4]                    : 520.609
##  ARIMA(0,1,1)(0,1,0)[4]                    : 528.8917
##  ARIMA(0,1,1)(0,1,2)[4]                    : 520.5883
##  ARIMA(0,1,1)(1,1,2)[4]                    : Inf
##  ARIMA(1,1,1)(0,1,1)[4]                    : 515.2757
##  ARIMA(1,1,0)(0,1,1)[4]                    : 532.8352
##  ARIMA(1,1,2)(0,1,1)[4]                    : 516.1387
##  ARIMA(0,1,0)(0,1,1)[4]                    : 556.8553
##  ARIMA(2,1,2)(0,1,1)[4]                    : 517.5581
##  ARIMA(1,1,1)(1,1,1)[4]                    : 517.6134
##  ARIMA(1,1,1)(0,1,0)[4]                    : 526.3492
##  ARIMA(1,1,1)(0,1,2)[4]                    : 517.6132
##  ARIMA(1,1,1)(1,1,2)[4]                    : Inf
##  ARIMA(2,1,1)(0,1,1)[4]                    : 515.1419
##  ARIMA(2,1,0)(0,1,1)[4]                    : 517.9378
##  ARIMA(2,1,1)(1,1,1)[4]                    : 517.5521
##  ARIMA(2,1,1)(0,1,0)[4]                    : 528.3562
##  ARIMA(2,1,1)(0,1,2)[4]                    : 517.5497
##  ARIMA(2,1,1)(1,1,2)[4]                    : Inf
## 
##  Best model: ARIMA(2,1,1)(0,1,1)[4]
# 
# beerfit_m_1 <- arima(beer_data_differ_final,c(0,1,1),seasonal = list(order=c(0,1,1),period=12))
# 
# beerfit_m_2 <- arima(beer_data_differ_final,c(1,0,1),seasonal = list(order=c(1,0,1),period=12))
# 
# beerfit_m_3 <- arima(beer_data_differ_final,c(2,0,1),seasonal = list(order=c(2,0,1),period=12))
# 
# beerfit_m_4 <- arima(beer_data_differ_final,c(1,2,1),seasonal = list(order=c(1,2,1),period=12))
# 
# 
# fit = arima(log(beer_timeseries), c(1, 0, 1),seasonal = list(order = c(1, 0, 1), period = 12))
# fit <- auto.arima(WWWusage)

Final Model and Residual Analysis

We will now use ARIMA(2,1,1)(0,1,1)[4] to generate the final model, and plot and test its residuals. The residual plot shows that while the mean is roughly constant, the variance is not obviously constant, but it is close enough.

The ACF plot of the residuals shows that all the spikes are well below the blue dashed significance bounds. The Ljung-Box test (which checks for significant evidence of non-zero autocorrelations) gives a p-value of 0.6144, which is > 0.05, so we fail to reject the null hypothesis that there is no autocorrelation in the forecast errors.

Further, the residuals are normally distributed, thus we can probably conclude that this model is an adequate representation of the data.

beerfit_m_final <- arima(beer_timeseries, c(2,1,1), seasonal = list(order = c(0,1,1), period = 4))  # period = 4 matches the quarterly seasonality selected by auto.arima
# beerfit_m_final$model
beerfit_m_final_forcast <- predict(beerfit_m_final, n.ahead = 8)  # predict.Arima returns point forecasts ($pred) and standard errors ($se)

plot(residuals(beerfit_m_final))
abline(0, 0) 

acf(beerfit_m_final$residuals)  ## ACF check of the residuals

Box.test(beerfit_m_final$residuals, lag = 12, type='Ljung-Box')
## 
##  Box-Ljung test
## 
## data:  beerfit_m_final$residuals
## X-squared = 10.017, df = 12, p-value = 0.6144
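To round out the business question (a sketch; this plot is not part of the original output), the final model can be passed to forecast() to produce the two-year prediction with 80%/95% intervals:

beerfit_m_final_fc <- forecast(beerfit_m_final, h = 8)  # 8 quarters = 2 years ahead
plot(beerfit_m_final_fc, main = "Two-year beer sales forecast: ARIMA(2,1,1)(0,1,1)[4]")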