Forecasting

Supply Chain HW #1: The data set GSSales.csv includes monthly retail sales of supermarkets in the US from January 1992 through January 2015, expressed in millions of US dollars.

Source: https://research.stlouisfed.org/


GSS <- read.csv("GSSales.csv", header=TRUE, sep=",")
dim(GSS)
## [1] 277   1
head(GSS)
##   RSGCSN
## 1  27306
## 2  26223
## 3  27235
## 4  27588
## 5  28883
## 6  28039
#converting GSS to a time series object
#start/end = time of first/last observation
#frequency = the number of observations per unit of time
GSS <- ts(GSS, start=1992, frequency=12)[ , 1]
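#quick sanity check (optional): a monthly series, Jan 1992 through Jan 2015
tsp(GSS)      #start, end, frequency -- expect 1992, 2015, 12
length(GSS)   #expect 277 observations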

#Visualize the data
#strong increasing trend w/ seasonality
ts.plot(GSS, col="red", main = "Supermarket Sales in the US", ylab = "Monthly Retail Sales (Millions)", xlim= c(1990,2015))

[Figure: Supermarket Sales in the US -- monthly retail sales with a strong upward trend and seasonality]



Splitting Into Training and Test Sets

Use the window(…) function to separate the GSS data set into a training set (y.TR) including all the data up to January 2013 and a testing data set (y.TE) including the data from February 2013 until the end of the time series (January 2015).

#window is a generic function which extracts the subset of the object x observed between the times start and end.
y.TR <- window(GSS, start=1992, end=c(2013,1))
y.TE <- window(GSS, start=c(2013,2), end=c(2015,1))
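A quick check (optional) that the split matches the assignment's boundaries:

#training set: Jan 1992 - Jan 2013; test set: Feb 2013 - Jan 2015
c(start(y.TR), end(y.TR))                     #1992 1 2013 1
c(start(y.TE), end(y.TE))                     #2013 2 2015 1
length(y.TR) + length(y.TE) == length(GSS)    #TRUE: 253 + 24 = 277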



Problem Set

1. (5 pts.) Use the ets(…) function to fit a Holt-Winters exponential smoothing model to the training subset of the supermarket sales data. Report the model details including the optimal (MLE) value of each of the constants and smoothing parameters required by the model, the AIC, AICc and BIC values, as well as the in-sample fitting indicators. Then plot the original time series and overlay, using a different color, the fitted model.

If you have a time series that can be described using an additive model with increasing or decreasing trend and seasonality, you can use Holt-Winters exponential smoothing to make short-term forecasts.
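To make the mechanics concrete, here is a minimal hand-rolled sketch of the additive Holt-Winters recursions in error-correction form, the kind of updating that ets() below estimates by maximum likelihood (the function name and crude initialization are illustrative, not part of the forecast package):

hw.additive <- function(y, alpha, beta, gamma, m = 12) {
  #level l, trend b, and seasonal terms s are nudged by each one-step error e
  l <- mean(y[1:m]); b <- 0; s <- y[1:m] - l   #crude initial states
  fitted <- numeric(length(y))
  for (t in seq_along(y)) {
    i <- ((t - 1) %% m) + 1      #slot in the seasonal cycle
    fitted[t] <- l + b + s[i]    #one-step-ahead forecast
    e <- y[t] - fitted[t]        #one-step forecast error
    l <- l + b + alpha * e       #update level
    b <- b + beta * e            #update trend
    s[i] <- s[i] + gamma * e     #update this month's seasonal term
  }
  fitted
}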

#install.packages("forecast")
library(forecast)

fit.HW <- ets(y.TR, model="AAM", damped=FALSE, restrict=FALSE)
summary(fit.HW)
## ETS(A,A,M) 
## 
## Call:
##  ets(y = y.TR, model = "AAM", damped = FALSE, restrict = FALSE) 
## 
##   Smoothing parameters:
##     alpha = 0.5048 
##     beta  = 0.0879 
##     gamma = 1e-04 
## 
##   Initial states:
##     l = 27862.6911 
##     b = 33.0245 
##     s=1.0651 0.9976 0.9964 0.9775 1.017 1.0315
##            1.0006 1.0308 0.976 0.9999 0.9208 0.9869
## 
##   sigma:  594.1216
## 
##      AIC     AICc      BIC 
## 4663.812 4666.117 4720.346 
## 
## Training set error measures:
##                    ME     RMSE      MAE         MPE     MAPE      MASE
## Training set 2.760137 594.1216 461.6641 -0.00959773 1.304683 0.4426381
##                    ACF1
## Training set -0.4297397
plot(y.TR, main="Holt-Winters Forecast", ylab = "Supermarket Sales (training set)", cex.lab = 1.5, cex.main = 1.5, cex.axis = 1.5)
lines(fit.HW$fitted, col="orange")

[Figure: training data with the Holt-Winters fitted values overlaid in orange]

2. (5 pts.) Now use your model in Question (1) to prepare a two-year ahead (i.e., h=24) forecast for the sales of supermarkets in the US. Plot the forecast and overlay (with points of a different color) the actual sales (the testing data set y.TE). Calculate the RMSE of the two-year ahead forecast and compare it to the RMSE reported for the model on the training data set. Is this an apples-to-apples comparison? Explain.

#?forecast
forecast.HW <- forecast(fit.HW, h=24)
summary(forecast.HW)
## 
## Forecast method: ETS(A,A,M)
## 
## Model Information:
## ETS(A,A,M) 
## 
## Call:
##  ets(y = y.TR, model = "AAM", damped = FALSE, restrict = FALSE) 
## 
##   Smoothing parameters:
##     alpha = 0.5048 
##     beta  = 0.0879 
##     gamma = 1e-04 
## 
##   Initial states:
##     l = 27862.6911 
##     b = 33.0245 
##     s=1.0651 0.9976 0.9964 0.9775 1.017 1.0315
##            1.0006 1.0308 0.976 0.9999 0.9208 0.9869
## 
##   sigma:  594.1216
## 
##      AIC     AICc      BIC 
## 4663.812 4666.117 4720.346 
## 
## Error measures:
##                    ME     RMSE      MAE         MPE     MAPE      MASE
## Training set 2.760137 594.1216 461.6641 -0.00959773 1.304683 0.4426381
##                    ACF1
## Training set -0.4297397
## 
## Forecasts:
##          Point Forecast    Lo 80    Hi 80    Lo 95    Hi 95
## Feb 2013       43958.65 43182.76 44720.63 42798.29 45105.32
## Mar 2013       47831.65 46930.32 48711.81 46464.30 49217.28
## Apr 2013       46778.75 45715.68 47831.47 45198.52 48392.00
## May 2013       49500.53 48258.41 50727.48 47636.63 51399.84
## Jun 2013       48144.81 46754.58 49522.26 46016.23 50243.06
## Jul 2013       49727.35 48146.92 51317.57 47281.03 52163.11
## Aug 2013       49120.23 47337.43 50901.94 46458.75 51851.99
## Sep 2013       47304.29 45402.99 49263.63 44330.11 50292.75
## Oct 2013       48313.02 46181.67 50498.33 45015.16 51644.87
## Nov 2013       48464.46 46130.62 50840.75 44862.81 52141.24
## Dec 2013       51843.10 49133.35 54659.56 47654.64 56142.65
## Jan 2014       48129.21 45358.58 50987.65 43878.92 52492.28
## Feb 2014       44989.09 42175.34 47890.48 40740.56 49375.93
## Mar 2014       48950.69 45712.42 52361.86 43846.44 54112.19
## Apr 2014       47871.03 44390.21 51404.87 42500.44 53306.32
## May 2014       50654.11 46657.70 54640.70 44554.80 56762.57
## Jun 2014       49264.63 45175.95 53402.68 42955.89 55632.94
## Jul 2014       50881.74 46393.41 55431.74 43911.67 57838.73
## Aug 2014       50258.33 45541.96 55097.60 42925.62 57498.68
## Sep 2014       48398.20 43558.17 53282.54 40912.94 55793.11
## Oct 2014       49428.10 44223.76 54695.34 41303.17 57477.35
## Nov 2014       49580.90 44127.54 55178.23 41079.50 58211.88
## Dec 2014       53035.08 46825.29 59243.81 43493.29 62565.30
## Jan 2015       49233.68 43205.14 55344.45 39854.58 58643.76
plot(forecast.HW, main = "2 Year Forecast Using Holt-Winters Model", ylab = "Supermarket Sales", xlab = "Years", xlim = c(1990, 2015), cex.lab = 1.5, cex.main = 1.5, cex.axis = 1.5)
lines(y.TE, col="red")

[Figure: 2-year Holt-Winters forecast with prediction intervals; actual test-set sales overlaid in red]

#accuracy() below comes from the forecast package (the Metrics package is not needed)
#input = forecast object, actual values
accuracy(forecast.HW, y.TE)
##                      ME     RMSE      MAE         MPE     MAPE      MASE
## Training set   2.760137 594.1216 461.6641 -0.00959773 1.304683 0.4426381
## Test set     238.598058 763.5727 619.2516  0.47372220 1.251063 0.5937310
##                    ACF1 Theil's U
## Training set -0.4297397        NA
## Test set     -0.5065135 0.3023494


This is not an “apples-to-apples” comparison because you are effectively comparing the in-sample training fit of the Holt-Winters model to its ability to predict the future. Thus, you can expect the RMSE reported for the in-sample fit to be lower than that of the out-of-sample forecast. As expected, the in-sample RMSE is 594.1216 and the forecast RMSE is 763.5727.
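As a sanity check, the forecast RMSE reported by accuracy() can be reproduced by hand (assuming y.TE covers exactly the 24 forecast months):

#root mean squared difference between point forecasts and held-out actuals
sqrt(mean((forecast.HW$mean - y.TE)^2))   #~763.57, matching accuracy() above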



3. (15 pts.) Use the ets(y.TR, …) function to optimize the selection of exponential smoothing model for the GSS time series over the training set. Report the model details including the type of model, the optimal (MLE) value of each of the constants and smoothing parameters required by the model, the AIC, AICc and BIC values, as well as the training set fitting indicators.

fit.optimal <- ets(y.TR, model = "ZZZ", restrict = FALSE)
summary(fit.optimal)
## ETS(A,Ad,A) 
## 
## Call:
##  ets(y = y.TR, model = "ZZZ", restrict = FALSE) 
## 
##   Smoothing parameters:
##     alpha = 0.1434 
##     beta  = 0.1434 
##     gamma = 1e-04 
##     phi   = 0.8253 
## 
##   Initial states:
##     l = 28114.5515 
##     b = 23.1944 
##     s=2417.962 -175.2942 -165.3562 -830.5298 581.6487 1134.615
##            38.2832 1171.974 -771.9525 68.0054 -2881.328 -588.0279
## 
##   sigma:  520.4662
## 
##      AIC     AICc      BIC 
## 4598.838 4601.443 4658.906 
## 
## Training set error measures:
##                    ME     RMSE      MAE       MPE     MAPE      MASE
## Training set 95.66111 520.4662 411.2271 0.2456933 1.156651 0.3942797
##                    ACF1
## Training set -0.2821256

We can see a reduction in the training set fitting indicators (RMSE) here relative to the previous model. The selected model, ETS(A,Ad,A), corresponds to “A” additive errors, “Ad” an additive damped trend, and “A” an additive seasonal effect, as fitted by the ets() exponential smoothing state space framework in the forecast package.
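To see what the damping does, here is a hedged sketch of the h-step-ahead trend contribution implied by the fitted model (fit.optimal$par and fit.optimal$states are standard components of an ets fit):

#with damping, the h-step trend contribution is b*(phi + phi^2 + ... + phi^h),
#which levels off as h grows instead of increasing linearly
phi <- fit.optimal$par["phi"]
b.final <- tail(fit.optimal$states, 1)[, "b"]   #final trend state
round(b.final * cumsum(phi ^ (1:24)))           #flattening trend contribution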




a. Examine the training-set value of the mean error (ME) and mean absolute percentage error (MAPE) and mean absolute scaled errors (MASE) for both models. Which model is a better fit to the training data? Explain.

According to the MAPE and MASE metrics, the second model, ETS(A,Ad,A), performs marginally better. While the ME (mean error) of the first model, ETS(A,A,M), is far smaller (ME: 2.7601) than that of the second model (ME: 95.6611), the MASE and MAPE metrics (which are more robust than ME, whose positive and negative errors can cancel) tell us that the second model fits the training data better.
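As a side note, MASE can be reproduced (approximately) by scaling the in-sample MAE against the MAE of a seasonal naive forecast, which is how the forecast package defines it for seasonal data:

#MASE sketch: absolute errors scaled by the in-sample MAE of the seasonal
#naive method (lag-12 differences of the training data)
mae.snaive <- mean(abs(diff(y.TR, lag = 12)))
mean(abs(residuals(fit.optimal))) / mae.snaive   #should be close to 0.394 above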


b. According to the AICc criterion, which of these two models would you expect to perform better in a realistic forecasting situation?

According to the corrected AIC (AICc) metric, which estimates the information lost by the model while penalizing for the number of parameters relative to the sample size, the second model (ETS(A,Ad,A) -> AICc = 4601.443) should perform better than the first model (ETS(A,A,M) -> AICc = 4666.117) in a realistic forecasting situation.
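For reference, AICc is a simple small-sample correction of AIC; a quick check against the values reported above (here k = 17: four smoothing parameters plus thirteen initial states, and n = 253 training observations):

#AICc = AIC + 2k(k+1)/(n-k-1)
aicc <- function(aic, k, n) aic + 2 * k * (k + 1) / (n - k - 1)
aicc(4598.838, k = 17, n = 253)   #~4601.44, matching the reported AICc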


c. Now compute the two-year ahead forecast (i.e., h=24) for this model and compute the RMSE for this model with respect to the testing data set. Compare the testing-set RMSE of this model with the testing-set RMSE of the model you fitted in Question (1). Which one is better? Is this an apples-to-apples comparison? Comment on this comparison in relation to your answer in part (b) above.

forecast.optimal <- forecast(fit.optimal, h=24)
summary(forecast.optimal)
## 
## Forecast method: ETS(A,Ad,A)
## 
## Model Information:
## ETS(A,Ad,A) 
## 
## Call:
##  ets(y = y.TR, model = "ZZZ", restrict = FALSE) 
## 
##   Smoothing parameters:
##     alpha = 0.1434 
##     beta  = 0.1434 
##     gamma = 1e-04 
##     phi   = 0.8253 
## 
##   Initial states:
##     l = 28114.5515 
##     b = 23.1944 
##     s=2417.962 -175.2942 -165.3562 -830.5298 581.6487 1134.615
##            38.2832 1171.974 -771.9525 68.0054 -2881.328 -588.0279
## 
##   sigma:  520.4662
## 
##      AIC     AICc      BIC 
## 4598.838 4601.443 4658.906 
## 
## Error measures:
##                    ME     RMSE      MAE       MPE     MAPE      MASE
## Training set 95.66111 520.4662 411.2271 0.2456933 1.156651 0.3942797
##                    ACF1
## Training set -0.2821256
## 
## Forecasts:
##          Point Forecast    Lo 80    Hi 80    Lo 95    Hi 95
## Feb 2013       44807.06 44140.05 45474.06 43786.96 45827.15
## Mar 2013       47826.45 47117.66 48535.24 46742.45 48910.45
## Apr 2013       47044.58 46277.41 47811.75 45871.30 48217.86
## May 2013       49036.26 48197.97 49874.55 47754.21 50318.31
## Jun 2013       47942.09 47023.94 48860.24 46537.90 49346.28
## Jul 2013       49071.02 48067.62 50074.43 47536.45 50605.60
## Aug 2013       48544.97 47453.43 49636.51 46875.61 50214.33
## Sep 2013       47154.90 45974.14 48335.66 45349.08 48960.72
## Oct 2013       47838.41 46568.56 49108.27 45896.34 49780.49
## Nov 2013       47843.61 46485.61 49201.61 45766.73 49920.49
## Dec 2013       50449.27 49004.61 51893.93 48239.85 52658.69
## Jan 2014       47453.69 45924.14 48983.24 45114.44 49792.94
## Feb 2014       45168.80 43556.38 46781.23 42702.81 47634.79
## Mar 2014       48124.99 46431.78 49818.20 45535.44 50714.54
## Apr 2014       47290.96 45519.06 49062.85 44581.08 50000.84
## May 2014       49239.59 47391.11 51088.08 46412.58 52066.61
## Jun 2014       48109.89 46186.87 50032.92 45168.88 51050.91
## Jul 2014       49209.51 47213.92 51205.10 46157.51 52261.50
## Aug 2014       48659.26 46593.01 50725.51 45499.20 51819.32
## Sep 2014       47249.22 45114.13 49384.31 43983.88 50514.56
## Oct 2014       47916.25 45714.06 50118.45 44548.29 51284.22
## Nov 2014       47907.85 45640.20 50175.50 44439.78 51375.92
## Dec 2014       50502.29 48170.76 52833.82 46936.52 54068.05
## Jan 2015       47497.44 45103.50 49891.38 43836.23 51158.66
#input = forecasted values, actual values
accuracy(forecast.optimal, y.TE)
##                      ME      RMSE       MAE       MPE     MAPE      MASE
## Training set   95.66111  520.4662  411.2271 0.2456933 1.156651 0.3942797
## Test set     1062.48478 1500.7733 1248.6795 2.1005723 2.509882 1.1972190
##                     ACF1 Theil's U
## Training set -0.28212557        NA
## Test set      0.09174116 0.5842095
#Run comparison via test RMSE of model 1, ETS(A,A,M)
accuracy(forecast.HW, y.TE)
##                      ME     RMSE      MAE         MPE     MAPE      MASE
## Training set   2.760137 594.1216 461.6641 -0.00959773 1.304683 0.4426381
## Test set     238.598058 763.5727 619.2516  0.47372220 1.251063 0.5937310
##                    ACF1 Theil's U
## Training set -0.4297397        NA
## Test set     -0.5065135 0.3023494

The test set RMSE of model 1, ETS(A,A,M), is 763.5727; the test set RMSE of model 2, ETS(A,Ad,A), is 1500.7733. A lower RMSE is better, so model 1 outperformed model 2. However, this is not an apples-to-apples comparison of modeling assumptions: the two models posit fundamentally different processes for generating future data. In particular, model 1 assumes seasonal effects are multiplicative while model 2 assumes they are additive. The AICc told us that model 2 should perform better based on in-sample fit, but because model 2's additive seasonality cannot capture seasonal variation whose magnitude grows with the level of the series, it performed poorly relative to model 1 on out-of-sample predictions.
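A tiny numeric illustration of the difference, with values loosely echoing the fitted December effects above (hypothetical, for intuition only):

#the same December effect applied at a low and a high level of the series
level <- c(30000, 50000)
level + 2418    #additive: a +2418 swing regardless of the level
level * 1.065   #multiplicative: the swing grows from +1950 to +3250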



4. (15 pts.) Obtain the value of lambda for the Box-Cox transformation of the GSS time series. Then repeat the exercise in question (3 (a)-(c)) for the Box-Cox transformed time series and report the best model, the values of the model parameters, the information criteria, and the in-sample fit, and plot the fitted model.

#get the value of lambda for the Box-Cox transformation
#(BoxCox() transforms the series directly; below we instead let ets() apply
#the transformation internally via its lambda argument)
transformed_data <- BoxCox(GSS, BoxCox.lambda(GSS))
#head(transformed_data)
#head(GSS)
BoxCox.lambda(GSS)
## [1] 0.4384686
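#definition check (optional sketch): for positive data and lambda != 0,
#BoxCox() computes w = (y^lambda - 1) / lambda
lambda <- BoxCox.lambda(GSS)
all.equal(as.numeric(BoxCox(GSS, lambda)),
          as.numeric((GSS^lambda - 1) / lambda))   #should return TRUE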
#fit optimal ets model to newly transformed data
fit.trans.optimal <- ets(y.TR, model = "ZZZ", restrict = FALSE, lambda = BoxCox.lambda(GSS))
summary(fit.trans.optimal)
## ETS(A,A,A) 
## 
## Call:
##  ets(y = y.TR, model = "ZZZ", lambda = BoxCox.lambda(GSS), restrict = FALSE) 
## 
##   Box-Cox transformation: lambda= 0.4385 
## 
##   Smoothing parameters:
##     alpha = 0.2505 
##     beta  = 0.0031 
##     gamma = 1e-04 
## 
##   Initial states:
##     l = 200.7463 
##     b = 0.1799 
##     s=6.6516 -0.3983 -0.4166 -2.2496 1.7098 3.1442
##            0.1431 3.1963 -2.1013 0.0918 -8.1424 -1.6286
## 
##   sigma:  1.4271
## 
##      AIC     AICc      BIC 
## 1611.915 1614.220 1668.449 
## 
## Training set error measures:
##                    ME     RMSE      MAE        MPE     MAPE      MASE
## Training set 35.78683 519.7675 397.9248 0.06757621 1.117885 0.3815255
##                    ACF1
## Training set -0.2518541

Optimal model returned: ETS(A,A,A) = additive error, additive trend, additive seasonal effect.


a. Examine the training-set value of the mean error (ME) and mean absolute percentage error (MAPE) and mean absolute scaled errors (MASE) for both models (i.e., the one in Question (1) and the one you obtained in Question (4)). Which model is a better fit to the training data? Explain.

Similar to the results obtained in question 3, model 3 (ETS(A,A,A)) marginally outperforms model 1 (ETS(A,A,M)) according to the MASE and MAPE metrics. Although model 1 has a substantially lower ME (2.7601) than model 3 (35.7868), mean error lets large positive and negative errors cancel, whereas MASE scales the absolute errors against a naive seasonal benchmark. We can therefore conclude that model 3 provides a better fit to the training data than model 1.



b. Is it valid to compare the AICc criterion values to select which of these two models is more likely to perform better in a realistic forecasting situation? Explain.

No, it is not valid to compare the AICc values of model 1 and model 3 to determine which is more likely to perform better in a realistic forecasting situation, because model 1 is fit to the original time series while model 3 is fit to the Box-Cox transformed series. AICc is only comparable across models fit to the same data on the same scale. In this case, the Box-Cox transformation was applied in an attempt to remedy the non-constant variance (heteroskedasticity) in the series.


c. As in Question (3) now compute the two-year ahead forecast (i.e., h=24) for this model and compute the RMSE for this model with respect to the testing data set. Compare the testing-set RMSE of this model with the testing-set RMSE of the model you fitted in Question (1). Which one is better? Is this an apples-to-apples comparison?

forecast.transformed.optimal <- forecast(fit.trans.optimal, h=24)
summary(forecast.transformed.optimal)
## 
## Forecast method: ETS(A,A,A)
## 
## Model Information:
## ETS(A,A,A) 
## 
## Call:
##  ets(y = y.TR, model = "ZZZ", lambda = BoxCox.lambda(GSS), restrict = FALSE) 
## 
##   Box-Cox transformation: lambda= 0.4385 
## 
##   Smoothing parameters:
##     alpha = 0.2505 
##     beta  = 0.0031 
##     gamma = 1e-04 
## 
##   Initial states:
##     l = 200.7463 
##     b = 0.1799 
##     s=6.6516 -0.3983 -0.4166 -2.2496 1.7098 3.1442
##            0.1431 3.1963 -2.1013 0.0918 -8.1424 -1.6286
## 
##   sigma:  1.4271
## 
##      AIC     AICc      BIC 
## 1611.915 1614.220 1668.449 
## 
## Error measures:
##                    ME     RMSE      MAE        MPE     MAPE      MASE
## Training set 35.78683 519.7675 397.9248 0.06757621 1.117885 0.3815255
##                    ACF1
## Training set -0.2518541
## 
## Forecasts:
##          Point Forecast    Lo 80    Hi 80    Lo 95    Hi 95
## Feb 2013       44455.72 43714.22 45204.23 43324.53 45603.31
## Mar 2013       47986.24 47187.15 48792.88 46767.18 49222.94
## Apr 2013       47162.94 46347.08 47986.81 45918.42 48426.18
## May 2013       49530.38 48666.64 50402.67 48212.86 50867.89
## Jun 2013       48324.31 47447.94 49209.70 46987.67 49682.05
## Jul 2013       49722.32 48806.94 50647.26 48326.23 51140.77
## Aug 2013       49208.81 48274.23 50153.47 47783.56 50657.63
## Sep 2013       47622.63 46681.16 48574.67 46187.05 49082.94
## Oct 2013       48508.08 47532.62 49494.68 47020.74 50021.47
## Nov 2013       48621.79 47620.93 49634.35 47095.83 50175.12
## Dec 2013       51804.18 50741.95 52878.77 50184.64 53452.65
## Jan 2014       48307.49 47262.42 49365.42 46714.38 49930.67
## Feb 2014       45674.40 44638.75 46723.41 44095.91 47284.14
## Mar 2014       49258.04 48153.22 50376.95 47574.05 50974.99
## Apr 2014       48422.52 47304.50 49555.22 46718.59 50160.79
## May 2014       50824.88 49651.44 52013.72 49036.49 52649.32
## Jun 2014       49601.11 48419.62 50798.62 47800.64 51439.04
## Jul 2014       51019.61 49794.70 52261.25 49153.03 52925.34
## Aug 2014       50498.61 49256.46 51758.15 48605.94 52431.97
## Sep 2014       48889.05 47645.55 50150.56 46994.56 50825.67
## Oct 2014       49787.58 48507.13 51086.80 47836.87 51782.19
## Nov 2014       49902.97 48596.69 51228.73 47913.06 51938.46
## Dec 2014       53131.52 51753.14 54530.28 51031.70 55279.01
## Jan 2015       49584.04 48234.33 50954.70 47528.30 51688.79
#input = forecasted values, actual values
accuracy(forecast.transformed.optimal, y.TE)
##                     ME     RMSE      MAE         MPE     MAPE      MASE
## Training set  35.78683 519.7675 397.9248  0.06757621 1.117885 0.3815255
## Test set     -19.21735 730.1551 611.0610 -0.06214161 1.245236 0.5858780
##                    ACF1 Theil's U
## Training set -0.2518541        NA
## Test set     -0.5890390 0.2898989
#Run comparison via test RMSE of model 1, ETS(A,A,M)
accuracy(forecast.HW, y.TE)
##                      ME     RMSE      MAE         MPE     MAPE      MASE
## Training set   2.760137 594.1216 461.6641 -0.00959773 1.304683 0.4426381
## Test set     238.598058 763.5727 619.2516  0.47372220 1.251063 0.5937310
##                    ACF1 Theil's U
## Training set -0.4297397        NA
## Test set     -0.5065135 0.3023494


Based on the test set RMSE of each model, this model (730.1551) outperforms the model used in question 1 (763.5727). This comparison is closer to apples-to-apples than it may first appear: because forecast() back-transforms the Box-Cox point forecasts to the original units before accuracy() is computed, both test RMSEs are measured on the same scale against the same test set. What cannot be compared directly are the in-sample, likelihood-based criteria (see part (b)), since the two models are fit to different representations of the data.
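A quick check, computing the test RMSE by hand on the original scale (the point forecasts returned by forecast() are already back-transformed):

#by-hand test RMSE in original units (assumes y.TE spans the 24 forecast months)
sqrt(mean((forecast.transformed.optimal$mean - y.TE)^2))   #~730.16, as reported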



5. (10 pts.) Now optimize the fit of the Holt-Winters model to the Box-Cox transformed training set data, and compare this model with the one you obtained in Question (4).

fit.transHW.optimal <- ets(y.TR, model = "AAM", restrict = FALSE, lambda = BoxCox.lambda(GSS))
summary(fit.transHW.optimal)
## ETS(A,Ad,M) 
## 
## Call:
##  ets(y = y.TR, model = "AAM", lambda = BoxCox.lambda(GSS), restrict = FALSE) 
## 
##   Box-Cox transformation: lambda= 0.4385 
## 
##   Smoothing parameters:
##     alpha = 0.219 
##     beta  = 0.0325 
##     gamma = 0.126 
##     phi   = 0.9723 
## 
##   Initial states:
##     l = 200.7501 
##     b = 0.0727 
##     s=1.033 0.9969 0.9993 0.991 1.0082 1.0166
##            1.0006 1.0124 0.9902 0.9981 0.9625 0.9911
## 
##   sigma:  1.5251
## 
##      AIC     AICc      BIC 
## 1647.501 1650.105 1707.569 
## 
## Training set error measures:
##                    ME     RMSE      MAE       MPE    MAPE      MASE
## Training set 56.27485 560.3428 424.9774 0.1487627 1.18191 0.4074632
##                    ACF1
## Training set -0.2114613




a. Examine the training-set value of the mean error (ME) and mean absolute percentage error (MAPE) and mean absolute scaled errors (MASE) for both models (i.e., the one in Question (4) and the one you obtained in Question (5)). Which model is a better fit to the training data? Explain.

Based on ME (35.79 vs. 56.27), MAPE (1.118 vs. 1.182), and MASE (0.382 vs. 0.407), the model from question 4 provides a better fit to the training data than the model above (question 5).


b. Compute the two-year ahead forecast (i.e., h=24) for this model and compute the RMSE for this model with respect to the testing data set. Compare the testing-set RMSE of this model with the testing-set RMSE of the model you fitted in Question (4). Which one is better?

forecast.transHW.optimal <- forecast(fit.transHW.optimal, h=24)
summary(forecast.transHW.optimal)
## 
## Forecast method: ETS(A,Ad,M)
## 
## Model Information:
## ETS(A,Ad,M) 
## 
## Call:
##  ets(y = y.TR, model = "AAM", lambda = BoxCox.lambda(GSS), restrict = FALSE) 
## 
##   Box-Cox transformation: lambda= 0.4385 
## 
##   Smoothing parameters:
##     alpha = 0.219 
##     beta  = 0.0325 
##     gamma = 0.126 
##     phi   = 0.9723 
## 
##   Initial states:
##     l = 200.7501 
##     b = 0.0727 
##     s=1.033 0.9969 0.9993 0.991 1.0082 1.0166
##            1.0006 1.0124 0.9902 0.9981 0.9625 0.9911
## 
##   sigma:  1.5251
## 
##      AIC     AICc      BIC 
## 1647.501 1650.105 1707.569 
## 
## Error measures:
##                    ME     RMSE      MAE       MPE    MAPE      MASE
## Training set 56.27485 560.3428 424.9774 0.1487627 1.18191 0.4074632
##                    ACF1
## Training set -0.2114613
## 
## Forecasts:
##          Point Forecast    Lo 80    Hi 80    Lo 95    Hi 95
## Feb 2013       44112.62 43322.09 44917.14 42915.68 45373.87
## Mar 2013       47821.26 46969.81 48661.60 46548.83 49129.71
## Apr 2013       46798.73 45946.54 47681.54 45475.57 48136.22
## May 2013       49318.48 48377.48 50262.32 47880.66 50778.43
## Jun 2013       47843.45 46896.87 48803.60 46412.59 49339.81
## Jul 2013       49231.06 48210.36 50275.49 47665.06 50920.64
## Aug 2013       48749.75 47639.79 49805.12 47084.47 50392.55
## Sep 2013       46936.45 45875.65 48071.60 45327.96 48612.14
## Oct 2013       47934.31 46783.66 49105.02 46202.93 49715.77
## Nov 2013       48066.33 46857.39 49297.82 46183.40 50018.49
## Dec 2013       51159.81 49778.73 52549.65 49106.89 53321.11
## Jan 2014       47880.73 46542.44 49255.78 45838.57 49998.89
## Feb 2014       44694.84 43302.39 46109.75 42604.44 46803.71
## Mar 2014       48434.73 46895.54 50028.99 46087.21 50927.42
## Apr 2014       47382.04 45780.04 49011.60 44985.03 49905.85
## May 2014       49915.93 48189.28 51688.32 47276.57 52612.31
## Jun 2014       48406.57 46585.62 50226.37 45700.89 51245.58
## Jul 2014       49794.20 47845.25 51759.49 46829.19 52797.98
## Aug 2014       49291.61 47294.71 51325.20 46339.00 52359.63
## Sep 2014       47443.36 45449.73 49474.10 44454.98 50561.10
## Oct 2014       48437.43 46320.54 50641.49 45276.95 51747.73
## Nov 2014       48556.61 46391.09 50834.41 45249.09 51931.25
## Dec 2014       51667.06 49257.37 54124.06 48006.28 55388.47
## Jan 2015       48341.98 46012.82 50774.07 44816.14 52103.42
#input = forecasted values, actual values
accuracy(forecast.transHW.optimal, y.TE)
##                     ME      RMSE      MAE       MPE     MAPE      MASE
## Training set  56.27485  560.3428 424.9774 0.1487627 1.181910 0.4074632
## Test set     798.69478 1125.6688 942.3610 1.5947219 1.897227 0.9035244
##                    ACF1 Theil's U
## Training set -0.2114613        NA
## Test set     -0.1373766 0.4415551
#Run comparison via test RMSE of model 3, forecast.transformed.optimal
accuracy(forecast.transformed.optimal, y.TE)
##                     ME     RMSE      MAE         MPE     MAPE      MASE
## Training set  35.78683 519.7675 397.9248  0.06757621 1.117885 0.3815255
## Test set     -19.21735 730.1551 611.0610 -0.06214161 1.245236 0.5858780
##                    ACF1 Theil's U
## Training set -0.2518541        NA
## Test set     -0.5890390 0.2898989


The model used in question 4, ETS(A,A,A), outperforms the above model used in question 5, ETS(A,Ad,M): the question 5 model has a test RMSE of 1125.6688, while the question 4 model has a test RMSE of 730.1551.



6. (5 pts) Compare the models obtained in Questions 1, 3, 4 and 5. Which model has the better in-sample fit? Which model is more promising for out-of-sample predictions? Justify your answer.

The model obtained in question 4 has both the best in-sample fit (based on the MASE and MAPE metrics) and the best out-of-sample performance, as evidenced by its test set RMSE of 730.1551.
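For reference, a small recap assembled from the accuracy() outputs reported above (all test RMSEs are in the original units):

#side-by-side recap of the four fitted models (numbers copied from above)
recap <- data.frame(
  model      = c("Q1 ETS(A,A,M)", "Q3 ETS(A,Ad,A)",
                 "Q4 ETS(A,A,A) + Box-Cox", "Q5 ETS(A,Ad,M) + Box-Cox"),
  train.RMSE = c(594.12, 520.47, 519.77, 560.34),
  test.RMSE  = c(763.57, 1500.77, 730.16, 1125.67))
recap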


7. (5 pts) Plot the two-year ahead forecasts using the model you obtained in Question 4, and overlay the test set data. Compare this plot to the equivalent plot for the model you obtained in Question 3. Comment on the differences.

#plot for model obtained in question 4 using BoxCox Transformation ETS(A,A,A)
plot(forecast.transformed.optimal, main = "2 Yr. Forecast Using BoxCox Transformation", ylab = "Supermarket Sales", xlab = "Years", xlim = c(1990, 2015), cex.lab = 1.5, cex.main = 1.5, cex.axis = 1.5)
lines(y.TE, col="green")

[Figure: 2-year forecast from the Box-Cox ETS(A,A,A) model; test data overlaid in green]

#plot for model obtained in question 3 without transformation ETS(A,Ad,A)
plot(forecast.optimal, main = "2 Yr. Forecast Without Transformation", ylab = "Supermarket Sales", xlab = "Years", xlim = c(1990, 2015), cex.lab = 1.5, cex.main = 1.5, cex.axis = 1.5)
lines(y.TE, col="yellow")

[Figure: 2-year forecast from the untransformed ETS(A,Ad,A) model; test data overlaid in yellow]
As the plots show, the model that uses the Box-Cox transformation provides a much better fit to the observed test data, whereas the model without the transformation consistently underpredicts monthly sales. It's worth noting that these models are essentially the same aside from the transformation: ETS(A,Ad,A) without it, ETS(A,A,A) with it.


8. (5 pts.) Compare the forecast obtained by the Box-Cox transformed model (Question 5) with those of the optimized Box-Cox transformed model (Question 4). Which forecasts would you consider more reliable? Why?

I’d consider the forecasts from the model obtained in question 4 more reliable, given its lower test RMSE. The primary difference between the two models is that the question 4 model assumes an additive seasonal effect, whereas the question 5 model assumes a multiplicative one. When comparing these two structures without the Box-Cox transformation, we found that the multiplicative (AAM) model outperformed the additive (AAA) model when forecasting two years ahead. However, the purpose of the Box-Cox transformation is to stabilize the variance, which otherwise grows with the level of the series. Since the magnitude of the seasonal pattern no longer depends on the level of the data after the transformation, it's unsurprising that the additive model (question 4) outperforms the multiplicative model (question 5).


9. (5 pts.) Print the mean, 80% and 95% monthly prediction intervals for 1 through 24 months ahead starting in February 2015 using a Box-Cox transformed model and the optimized model you obtained in Question 4. To answer this question, calibrate the model's parameters using the complete data set.

full.fit.optimal.transformed <- ets(GSS, model = "ZZZ", restrict = FALSE, lambda = BoxCox.lambda(GSS))


#level = c(80, 95) requests 80% and 95% prediction intervals (the default)
forecast.full.fit.optimal.transformed <- forecast(full.fit.optimal.transformed, h=24, level = c(80,95))

#mean and 80% and 95% monthly prediction intervals for 1 through 24 months ahead 
summary(forecast.full.fit.optimal.transformed)
## 
## Forecast method: ETS(A,A,A)
## 
## Model Information:
## ETS(A,A,A) 
## 
## Call:
##  ets(y = GSS, model = "ZZZ", lambda = BoxCox.lambda(GSS), restrict = FALSE) 
## 
##   Box-Cox transformation: lambda= 0.4385 
## 
##   Smoothing parameters:
##     alpha = 0.2354 
##     beta  = 0.0028 
##     gamma = 1e-04 
## 
##   Initial states:
##     l = 200.3035 
##     b = 0.178 
##     s=6.3424 -0.3692 -0.4883 -2.4532 1.8117 3.1709
##            0.2225 3.3396 -2.2256 0.2263 -8.1814 -1.3958
## 
##   sigma:  1.466
## 
##      AIC     AICc      BIC 
## 1801.773 1803.865 1859.757 
## 
## Error measures:
##                    ME     RMSE      MAE       MPE     MAPE      MASE
## Training set 39.85137 545.2557 420.0808 0.0759403 1.141433 0.3957194
##                    ACF1
## Training set -0.2930036
## 
## Forecasts:
##          Point Forecast    Lo 80    Hi 80    Lo 95    Hi 95
## Feb 2015       47116.77 46329.81 47911.17 45916.23 48334.73
## Mar 2015       50841.52 49996.65 51694.35 49552.62 52149.03
## Apr 2015       49881.37 49022.45 50748.67 48571.16 51211.19
## May 2015       52446.09 51538.92 53362.17 51062.28 53850.72
## Jun 2015       51175.02 50257.07 52102.32 49774.91 52596.98
## Jul 2015       52597.98 51642.19 53563.62 51140.21 54078.79
## Aug 2015       52104.54 51130.47 53088.94 50619.01 53614.24
## Sep 2015       50335.66 49357.53 51324.57 48844.11 51852.44
## Oct 2015       51310.21 50298.40 52333.34 49767.36 52879.55
## Nov 2015       51474.82 50438.12 52523.37 49894.12 53083.25
## Dec 2015       54607.60 53511.96 55715.72 52937.01 56307.39
## Jan 2016       51245.84 50165.79 52338.81 49599.28 52922.65
## Feb 2016       48409.50 47341.39 49491.01 46781.38 50068.96
## Mar 2016       52190.35 51052.78 53342.02 50456.28 53957.39
## Apr 2016       51215.91 50067.46 52379.01 49465.42 53000.67
## May 2016       53818.52 52613.92 55038.46 51982.44 55690.47
## Jun 2016       52528.78 51317.31 53756.16 50682.41 54412.34
## Jul 2016       53972.63 52718.85 55242.98 52061.83 55922.19
## Aug 2016       53471.97 52201.26 54759.87 51535.52 55448.62
## Sep 2016       51676.97 50407.44 52964.26 49742.57 53652.92
## Oct 2016       52665.96 51359.43 53990.95 50675.24 54699.85
## Nov 2016       52833.00 51500.72 54184.41 50803.18 54907.58
## Dec 2016       56011.33 54610.16 57432.47 53876.48 58192.88
## Jan 2017       52600.64 51224.88 53996.91 50504.88 54744.37
#just for fun: plot() dispatches to the forecast method, which shades the
#80% and 95% prediction intervals by default
plot(forecast.full.fit.optimal.transformed, main = "2 Yr. Forecast Post-Feb. 2015", ylab = "Supermarket Sales", xlab = "Years", xlim = c(1990, 2020), cex.lab = 1.5, cex.main = 1.5, cex.axis = 1.5)
lines(forecast.full.fit.optimal.transformed$mean, col = "red")

[Figure: 24-month forecast from February 2015 using the full-data Box-Cox ETS(A,A,A) model]