Homework 5 Predictive analytics

Problem 7.1

Consider the pigs series - the number of pigs slaughtered in Victoria each month.

a. Use the ses() function in R to find the optimal values of alpha and l0, and generate forecasts for the next four months.

#>         Jan    Feb    Mar    Apr    May    Jun
#> 1980  76378  71947  33873  96428 105084  95741

#> 
#> Forecast method: Simple exponential smoothing
#> 
#> Model Information:
#> Simple exponential smoothing 
#> 
#> Call:
#>  ses(y = pigs, h = 5) 
#> 
#>   Smoothing parameters:
#>     alpha = 0.2971 
#> 
#>   Initial states:
#>     l = 77260.0561 
#> 
#>   sigma:  10308.58
#> 
#>      AIC     AICc      BIC 
#> 4462.955 4463.086 4472.665 
#> 
#> Error measures:
#>                    ME    RMSE      MAE       MPE     MAPE      MASE       ACF1
#> Training set 385.8721 10253.6 7961.383 -0.922652 9.274016 0.7966249 0.01282239
#> 
#> Forecasts:
#>          Point Forecast    Lo 80    Hi 80    Lo 95    Hi 95
#> Sep 1995       98816.41 85605.43 112027.4 78611.97 119020.8
#> Oct 1995       98816.41 85034.52 112598.3 77738.83 119894.0
#> Nov 1995       98816.41 84486.34 113146.5 76900.46 120732.4
#> Dec 1995       98816.41 83958.37 113674.4 76092.99 121539.8
#> Jan 1996       98816.41 83448.52 114184.3 75313.25 122319.6
#>                  ME    RMSE     MAE   MPE MAPE MASE ACF1
#> Training set 385.87 10253.6 7961.38 -0.92 9.27  0.8 0.01

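ses() selects alpha = 0.2971 and an initial level l0 = 77260.06. The call used h = 5, so the output lists five months; the first four rows (Sep to Dec 1995) are the forecasts the question asks for. Every point forecast equals 98816.41 because SES forecasts are flat.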
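To make the roles of alpha and l0 concrete, here is a minimal sketch of the SES recursion in Python (used only as a calculator; it is not the forecast package's implementation). The function names are my own. It uses just the six pigs values printed above, so the result will not match 98816.41, which comes from the full series.

```python
def ses_levels(y, alpha, l0):
    """Return the smoothed level after each observation in y."""
    levels = []
    level = l0
    for obs in y:
        # New level = weighted average of the latest observation and the old level.
        level = alpha * obs + (1 - alpha) * level
        levels.append(level)
    return levels


def ses_forecast(y, alpha, l0, h):
    """SES point forecasts are flat: every horizon repeats the final level."""
    final_level = ses_levels(y, alpha, l0)[-1]
    return [final_level] * h


# First six months of pigs as printed above; alpha and l0 from the ses() output.
pigs_head = [76378, 71947, 33873, 96428, 105084, 95741]
alpha, l0 = 0.2971, 77260.0561

forecasts = ses_forecast(pigs_head, alpha, l0, h=4)
print(forecasts)
```

All four forecasts are identical, which is why every row of the ses() forecast table shows the same point forecast.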

Compute a 95% prediction interval for the first forecast using ŷ ± 1.96s where s is the standard deviation of the residuals. Compare your interval with the interval produced by R.

The first point forecast is 98816.41. To compute the required prediction interval, we take the standard deviation of the model's residuals.

#> [1] "Standard deviation of the residual for the given point is:  10273.693294987"
#> [1] "The lower bound of 95 confidence interval is:  78679.9711418255"
#> [1] "The upper bound of 95 confidence interval is:  118952.848858174"

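Comparing the two results, the manual interval (78679.97 to 118952.85) is slightly narrower than the one from ses() (78611.97 to 119020.8): R's lower bound is lower and its upper bound is higher, by about 68 on each side. The difference arises because R builds its interval from the model's sigma (10308.58), which is slightly larger than the sample standard deviation of the residuals (10273.69).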
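The comparison can be checked with a few lines of arithmetic. The sketch below uses Python as a calculator. The assumption that R's interval is the point forecast ± 1.959964 × sigma is mine; it is included because it reproduces the printed bounds to within rounding.

```python
point = 98816.41             # first point forecast from ses()
sd_resid = 10273.693294987   # standard deviation of the residuals (printed above)
sigma = 10308.58             # sigma reported in the ses() model output


def interval(center, spread, z=1.96):
    """Symmetric interval: center - z * spread, center + z * spread."""
    return center - z * spread, center + z * spread


# Manual interval, as asked in the exercise
manual = interval(point, sd_resid)

# Assumed R calculation: sigma and the normal 97.5% quantile to six decimals
model_based = interval(point, sigma, z=1.959964)

print(manual)
print(model_based)
```

The second interval is wider on both sides, matching the ses() output.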

Problem 7.5

Data set books contains the daily sales of paperback and hardcover books at the same store. The task is to forecast the next four days’ sales for paperback and hardcover books.

Plot the series and discuss the main features of the data.

The plot shows that both the paperback and hardcover series fluctuate considerably from day to day around an increasing trend. There is no evidence of seasonality.

Use the ses() function to forecast each series, and plot the forecasts.

I used ses() with h = 4 to get point forecasts for both series, paperback (books[, 1]) and hardcover (books[, 2]).

#> 
#> Forecast method: Simple exponential smoothing
#> 
#> Model Information:
#> Simple exponential smoothing 
#> 
#> Call:
#>  ses(y = books[, 1], h = 4) 
#> 
#>   Smoothing parameters:
#>     alpha = 0.1685 
#> 
#>   Initial states:
#>     l = 170.8271 
#> 
#>   sigma:  34.8183
#> 
#>      AIC     AICc      BIC 
#> 318.9747 319.8978 323.1783 
#> 
#> Error measures:
#>                    ME     RMSE     MAE       MPE     MAPE      MASE       ACF1
#> Training set 7.175981 33.63769 27.8431 0.4736071 15.57784 0.7021303 -0.2117522
#> 
#> Forecasts:
#>    Point Forecast    Lo 80    Hi 80    Lo 95    Hi 95
#> 31       207.1097 162.4882 251.7311 138.8670 275.3523
#> 32       207.1097 161.8589 252.3604 137.9046 276.3147
#> 33       207.1097 161.2382 252.9811 136.9554 277.2639
#> 34       207.1097 160.6259 253.5935 136.0188 278.2005

#> 
#> Forecast method: Simple exponential smoothing
#> 
#> Model Information:
#> Simple exponential smoothing 
#> 
#> Call:
#>  ses(y = books[, 2], h = 4) 
#> 
#>   Smoothing parameters:
#>     alpha = 0.3283 
#> 
#>   Initial states:
#>     l = 149.2861 
#> 
#>   sigma:  33.0517
#> 
#>      AIC     AICc      BIC 
#> 315.8506 316.7737 320.0542 
#> 
#> Error measures:
#>                    ME     RMSE      MAE      MPE     MAPE      MASE       ACF1
#> Training set 9.166735 31.93101 26.77319 2.636189 13.39487 0.7987887 -0.1417763
#> 
#> Forecasts:
#>    Point Forecast    Lo 80    Hi 80    Lo 95    Hi 95
#> 31       239.5601 197.2026 281.9176 174.7799 304.3403
#> 32       239.5601 194.9788 284.1414 171.3788 307.7414
#> 33       239.5601 192.8607 286.2595 168.1396 310.9806
#> 34       239.5601 190.8347 288.2855 165.0410 314.0792

Compute the RMSE values for the training data in each case.

#> [1] "The RMSE for the hardcover training data is: 31.93"

#> [1] "The RMSE for the paperback training data is: 33.64"
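For reference, RMSE is the square root of the mean squared residual. The Python sketch below shows the calculation on toy residuals (hypothetical, not the books data) and also relates RMSE to the sigma values in the ses() output. The relation sigma = RMSE × sqrt(n / (n - k)) with n = 30 observations and k = 2 estimated parameters is my assumption; it is shown because it reproduces the printed sigma values.

```python
import math


def rmse(residuals):
    """Root mean squared error: square root of the mean squared residual."""
    return math.sqrt(sum(e * e for e in residuals) / len(residuals))


def sigma_from_rmse(rmse_value, n, k):
    """Residual standard deviation with a degrees-of-freedom correction.

    Assumption: sigma^2 = SSE / (n - k), so sigma = RMSE * sqrt(n / (n - k)).
    """
    return rmse_value * math.sqrt(n / (n - k))


# Toy residuals (hypothetical): RMSE = sqrt((9 + 16) / 2)
print(rmse([3.0, -4.0]))

# n = 30 daily observations; k = 2 (alpha and the initial level) is an assumption
print(sigma_from_rmse(33.63769, n=30, k=2))  # paperback, compare with sigma 34.8183
print(sigma_from_rmse(31.93101, n=30, k=2))  # hardcover, compare with sigma 33.0517
```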

Problem 7.6

We will continue with the daily sales of paperback and hardcover books in data set books.

Now apply Holt’s linear method to the paperback and hardback series and compute four-day forecasts in each case.

For the paperback

#>    Point Forecast    Lo 80    Hi 80    Lo 95    Hi 95
#> 31       209.4668 166.6035 252.3301 143.9130 275.0205
#> 32       210.7177 167.8544 253.5811 145.1640 276.2715
#> 33       211.9687 169.1054 254.8320 146.4149 277.5225
#> 34       213.2197 170.3564 256.0830 147.6659 278.7735

For the hardcover

#>    Point Forecast    Lo 80    Hi 80    Lo 95    Hi 95
#> 31       250.1739 212.7390 287.6087 192.9222 307.4256
#> 32       253.4765 216.0416 290.9113 196.2248 310.7282
#> 33       256.7791 219.3442 294.2140 199.5274 314.0308
#> 34       260.0817 222.6468 297.5166 202.8300 317.3334

Compare the RMSE measures of Holt’s method for the two series to those of simple exponential smoothing in the previous question. (Remember that Holt’s method is using one more parameter than SES.) Discuss the merits of the two forecasting methods for these data sets.

#>                 ME  RMSE   MAE   MPE  MAPE MASE  ACF1
#> Training set -3.72 31.14 26.18 -5.51 15.58 0.66 -0.18

#>                 ME  RMSE   MAE   MPE  MAPE MASE  ACF1
#> Training set -0.14 27.19 23.16 -2.11 12.16 0.69 -0.03

For the paperback series, Holt's RMSE is 31.14, an improvement of 33.64 - 31.14 = 2.50 over SES.

For the hardcover series, Holt's RMSE is 27.19, an improvement of 31.93 - 27.19 = 4.74 over SES.

So in terms of training-set accuracy, Holt's method is better than simple exponential smoothing, although some of the gain is expected because Holt's method has an extra parameter to fit. Holt's method models the trend of a time series, while SES has no trend component. Both series in the books data set exhibit an upward trend, so Holt's method is more appropriate.

Compare the forecasts for the two series using both methods. Which do you think is best?

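SES forecasts a constant value (207.11 for paperback, 239.56 for hardcover) and ignores the trend, while Holt's method extends the upward trend over the four days (209.47 to 213.22 for paperback, 250.17 to 260.08 for hardcover). Since both series are trending upward, I think Holt's forecasts are the better of the two.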
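The difference between the two forecast functions can be sketched in a few lines of Python. The function names are my own, and the paperback level and slope are backed out of the printed Holt forecasts rather than taken from the fitted model, so they are approximate.

```python
def ses_path(level, h):
    """SES: the forecast is the last level at every horizon."""
    return [level for _ in range(1, h + 1)]


def holt_path(level, slope, h):
    """Holt's linear method: the forecast at step i is level + i * slope."""
    return [level + i * slope for i in range(1, h + 1)]


# Paperback values backed out of the printed forecasts (approximate):
# slope = (forecast for day 34 - forecast for day 31) / 3
slope = (213.2197 - 209.4668) / 3
level = 209.4668 - slope

print(ses_path(207.1097, 4))
print([round(f, 4) for f in holt_path(level, slope, 4)])
```

The SES path stays at one value while the Holt path rises by about 1.25 per day, matching the forecast tables.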

Calculate a 95% prediction interval for the first forecast for each series, using the RMSE values and assuming normal errors. Compare your intervals with those produced using ses and holt.

	Paperback (calculated)	Paperback (R holt)	Paperback (R ses)	Hardcover (calculated)	Hardcover (R holt)	Hardcover (R ses)
Point Forecast	209.4668	209.4668	207.1097	250.1739	250.1739	239.5601
Lower 95%	148.4324	143.9130	138.8670	196.8815	192.9222	174.7799
Upper 95%	270.5012	275.0205	275.3523	303.4663	307.4256	304.3403
Interval Range	122.0688	131.1075	136.4853	106.5848	114.5034	129.5604

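I created a comparison table of the 95% prediction intervals for the two types of book. The "calculated" columns use Holt's point forecast ± 1.96 × RMSE. These intervals are slightly narrower than those produced by holt() and ses(), because R's intervals are based on sigma, which is a little larger than the training RMSE (for example 34.82 versus 33.64 for the paperback SES model).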
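The calculated columns can be reproduced with the Python sketch below. The second half rests on my assumption that R scales the residual spread by sqrt(n / (n - k)), with n = 30 observations and k = 4 estimated parameters for holt(), and uses z = 1.959964. Under that assumption the holt() bounds are recovered to within rounding.

```python
import math

Z95 = 1.96  # normal multiplier used in the hand calculation


def rmse_interval(point, rmse, z=Z95):
    """95% interval assuming normal errors with standard deviation equal to rmse."""
    half = z * rmse
    return point - half, point + half


# Holt point forecasts and training RMSEs from the output above
paperback = rmse_interval(209.4668, 31.14)
hardcover = rmse_interval(250.1739, 27.19)
print(paperback, paperback[1] - paperback[0])
print(hardcover, hardcover[1] - hardcover[0])

# Assumed degrees-of-freedom adjustment (n = 30, k = 4) to mimic holt()
adj = math.sqrt(30 / 26)
paperback_adj = rmse_interval(209.4668, 31.14 * adj, z=1.959964)
hardcover_adj = rmse_interval(250.1739, 27.19 * adj, z=1.959964)
print(paperback_adj)
print(hardcover_adj)
```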

Problem 7.7

For this exercise use data set eggs, the price of a dozen eggs in the United States from 1900–1993. Experiment with the various options in the holt() function to see how much the forecasts change with damped trend, or with a Box-Cox transformation. Try to develop an intuition of what each argument is doing to the forecasts.

[Hint: use h=100 when calling holt() so you can clearly see the differences between the various options when plotting the forecasts.]

Which model gives the best RMSE?

Below, I experiment with the default holt() and three of its options. damped=TRUE uses a damped trend. exponential=TRUE uses an exponential trend. lambda="auto" applies a Box-Cox transformation to the data, and I also set biasadj=TRUE to get the mean forecast (instead of the median).

The default forecast is a straight line and eventually becomes negative. The damped trend flattens the forecast very quickly into a horizontal line. The exponential trend forecast is very close to the Box-Cox transformed forecast, and both show a much gentler decline than the damped trend method.

I also tried two combinations of the options. Combining the damped and exponential options produces a line similar to the damped one, so the damping effect seems to outweigh the exponential effect. Combining damping with the Box-Cox transformation produces an increasing forecast, which clearly does not make sense for this series.
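To build intuition for why the default forecast can go negative while the damped one flattens out, here is a small Python sketch of the two forecast functions. The level, slope, and damping parameter phi are hypothetical values chosen for illustration; they are not fitted to the eggs data.

```python
def linear_trend_forecast(level, slope, h):
    """Holt's linear trend: level + h * slope."""
    return level + h * slope


def damped_trend_forecast(level, slope, phi, h):
    """Damped trend: level + (phi + phi^2 + ... + phi^h) * slope."""
    damp_sum = sum(phi ** i for i in range(1, h + 1))
    return level + damp_sum * slope


# Hypothetical values for illustration only (not fitted to the eggs data)
level, slope, phi = 70.0, -1.0, 0.8

for h in (1, 10, 100):
    print(h, linear_trend_forecast(level, slope, h),
          round(damped_trend_forecast(level, slope, phi, h), 4))

# Long-run limit of the damped forecast: level + phi / (1 - phi) * slope
print(level + phi / (1 - phi) * slope)
```

With a negative slope the linear forecast keeps falling and crosses zero, while the damped forecast converges to a constant. The closer phi is to 1, the slower the flattening.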

The accuracy measures for each model are collected in the table below:

	ME	RMSE	MAE	MPE	MAPE	MASE	ACF1
Default	0.0449909	26.58219	19.18491	-1.142201	9.653791	0.9463626	0.0134820
Damped	-2.8914955	26.54019	19.27950	-2.907633	10.018944	0.9510287	-0.0031954
Exponential	0.4918791	26.49795	19.29399	-1.263235	9.766049	0.9517436	0.0103908
Box-Cox	-0.2015298	26.38689	18.99362	-1.630430	9.713172	0.9369265	0.0383996
Damped & Exponential	-0.9089678	26.59113	19.54973	-2.125756	10.023283	0.9643590	0.0137612
Damped & Box-Cox	-1.8062134	26.58589	19.55896	-2.584250	10.117605	0.9648141	0.0053221

As the table shows, holt() with the Box-Cox transformation has the lowest RMSE, 26.39, although all six models are within about 0.2 of each other.
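As a quick check on the ranking, the RMSE column can be compared programmatically. This Python sketch simply copies the RMSE values from the table; the dictionary name is my own.

```python
# Training RMSE for each holt() variant, copied from the table above
rmse_by_model = {
    "Default": 26.58219,
    "Damped": 26.54019,
    "Exponential": 26.49795,
    "Box-Cox": 26.38689,
    "Damped & Exponential": 26.59113,
    "Damped & Box-Cox": 26.58589,
}

best = min(rmse_by_model, key=rmse_by_model.get)
spread = max(rmse_by_model.values()) - min(rmse_by_model.values())

print(best, round(rmse_by_model[best], 2))
print(round(spread, 3))
```

The spread between the best and worst model is small, so training RMSE alone is weak evidence for preferring one option over another.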

Problem 7.8

Recall your retail time series data (from Exercise 3 in Section 2.10).

Why is multiplicative seasonality necessary for this series?

The data show that the size of the seasonal swings increases as retail sales increase. Multiplicative seasonality can capture this in the model, while additive seasonality cannot.
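The distinction can be illustrated with a toy calculation in Python. The levels, the seasonal offset of 20, and the seasonal index of 1.2 are hypothetical numbers, not estimates from the retail series.

```python
def additive_forecast(level, seasonal_offset):
    """Additive seasonality: the seasonal effect is a fixed amount."""
    return level + seasonal_offset


def multiplicative_forecast(level, seasonal_index):
    """Multiplicative seasonality: the seasonal effect scales with the level."""
    return level * seasonal_index


# Hypothetical peak month: +20 (additive) or x1.2 (multiplicative)
for level in (100.0, 300.0):
    add_peak = additive_forecast(level, 20.0) - level
    mult_peak = multiplicative_forecast(level, 1.2) - level
    print(level, add_peak, round(mult_peak, 2))
```

When the level triples, the additive peak stays at 20 while the multiplicative peak triples as well, which is the pattern described for the retail data.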

Apply Holt-Winters’ multiplicative method to the data. Experiment with making the trend damped.

From the plots, the seasonal variation increases over time, so multiplicative seasonality is more suitable. The forecasts increase more slowly with the damped option than without it.

Compare the RMSE of the one-step forecasts from the two methods. Which do you prefer?

#> [1] 14.72762

#> [1] 14.94306

The two RMSE values are almost the same. I therefore prefer the damped model, because it prevents the sales forecast from increasing without limit.

Check that the residuals from the best method look like white noise.

#> 
#>  Ljung-Box test
#> 
#> data:  Residuals from Damped Holt-Winters' multiplicative method
#> Q* = 42.932, df = 7, p-value = 3.437e-07
#> 
#> Model df: 17.   Total lags used: 24

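The residuals do not look like white noise: the Ljung-Box p-value (3.437e-07) is far below 0.05, so the hypothesis of no autocorrelation is rejected. The model leaves some autocorrelation in the residuals, which is consistent with the occasional spikes.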
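The p-value in the test output can be recomputed from Q* and df. The Python sketch below uses a closed-form chi-square tail probability that holds only for odd degrees of freedom; the function name is my own, and this is a hand calculation rather than what R runs internally.

```python
import math


def chi2_sf_odd(x, df):
    """Upper-tail probability of a chi-square variable with odd df.

    Closed form: 2*Q(c) + 2*phi(c) * sum_r c^(2r-1) / (1*3*...*(2r-1)),
    where c = sqrt(x), Q is the standard normal upper tail and phi its density.
    """
    if df % 2 != 1 or df < 1:
        raise ValueError("this sketch only handles odd df >= 1")
    c = math.sqrt(x)
    tail = math.erfc(c / math.sqrt(2))           # equals 2 * Q(c)
    density = math.exp(-x / 2) / math.sqrt(2 * math.pi)
    total, term = 0.0, c                         # first term of the sum: c / 1
    for r in range(1, (df - 1) // 2 + 1):
        total += term
        term *= x / (2 * r + 1)                  # next term: multiply by c^2 / (2r + 1)
    return tail + 2 * density * total


# Q* and df from the Ljung-Box output above
p_value = chi2_sf_odd(42.932, 7)
print(p_value)
print("reject white noise at 5%" if p_value < 0.05 else "no evidence against white noise")
```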

Now find the test set RMSE, while training the model to the end of 2010. Can you beat the seasonal naïve approach from Exercise 8 in Section 3.7?

I compare the seasonal naive method with Holt-Winters' multiplicative method, fitted with and without a damped trend.

Holt-Winters’ method with damped option:

#>                      ME       RMSE      MAE        MPE      MAPE      MASE
#> Training set  0.4556121   8.681456  6.24903  0.2040939  3.151257 0.3916228
#> Test set     94.7346169 111.911266 94.73462 24.2839784 24.283978 5.9369594
#>                     ACF1 Theil's U
#> Training set -0.01331859        NA
#> Test set      0.60960299   1.90013

Holt-Winters’ method without damped option:

#>                       ME      RMSE       MAE          MPE      MAPE      MASE
#> Training set  0.03021223  9.107356  6.553533  0.001995484  3.293399 0.4107058
#> Test set     78.34068365 94.806617 78.340684 19.945024968 19.945025 4.9095618
#>                    ACF1 Theil's U
#> Training set 0.02752875        NA
#> Test set     0.52802701  1.613903

1. With the damped option, Holt-Winters' method (test RMSE 111.91) could not beat the seasonal naive approach.
2. Without the damped option, Holt-Winters' method was more accurate (test RMSE 94.81), but it still could not beat the seasonal naive approach.
3. Damping made the forecasts worse here because actual sales over the forecast horizon were increasing exponentially rather than levelling off.
4. This case shows that a forecasting method is only accurate when its underlying assumptions match the data.

Problem 7.9

For the same retail data, try an STL decomposition applied to the Box-Cox transformed series, followed by ETS on the seasonally adjusted data. How does that compare with your best previous forecasts on the test set?

The training set is first Box-Cox transformed, and then decomposed using STL.

#> [1] "Best lambda for Box-Cox Transformation is found to be: 0.197968156308491"

I fit the seasonally adjusted data using ETS, and let ETS automatically search for best fit.

#> ETS(M,A,N) 
#> 
#> Call:
#>  ets(y = train.bc.seadj) 
#> 
#>   Smoothing parameters:
#>     alpha = 0.6333 
#>     beta  = 1e-04 
#> 
#>   Initial states:
#>     l = 6.567 
#>     b = 0.0134 
#> 
#>   sigma:  0.0129
#> 
#>      AIC     AICc      BIC 
#> 543.5141 543.6911 562.7319 
#> 
#> Training set error measures:
#>                        ME      RMSE       MAE         MPE      MAPE      MASE
#> Training set -0.003878286 0.1172707 0.0899321 -0.03866332 0.9882063 0.3832231
#>                    ACF1
#> Training set 0.01864534

ETS selected an ETS(M,A,N) model: multiplicative errors, additive trend, and no seasonal component.

I then use this model to forecast over the test period. The forecasts are back-transformed using InvBoxCox().

#>           Jan      Feb      Mar      Apr      May      Jun      Jul      Aug
#> 2011 280.4265 281.6456 282.8690 284.0966 285.3286 286.5648 287.8053 289.0500
#> 2012 295.3390 296.6098 297.8850 299.1647 300.4487 301.7371 303.0300 304.3273
#> 2013 310.8809 312.2051 313.5338 314.8670 316.2048 317.5472 318.8941 320.2456
#>           Sep      Oct      Nov      Dec
#> 2011 290.2992 291.5526 292.8104 294.0725
#> 2012 305.6291 306.9353 308.2460 309.5612
#> 2013 321.6016 322.9623 324.3276 325.6975
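For reference, the Box-Cox transformation and its inverse can be sketched as follows in Python. The function names are my own and this is the textbook formula without any bias adjustment; it is not the forecast package's code. The lambda is the value reported earlier in this problem.

```python
import math


def box_cox(y, lam):
    """Box-Cox transform: log(y) if lam == 0, else (y**lam - 1) / lam."""
    if lam == 0:
        return math.log(y)
    return (y ** lam - 1) / lam


def inv_box_cox(w, lam):
    """Inverse Box-Cox: exp(w) if lam == 0, else (lam * w + 1) ** (1 / lam)."""
    if lam == 0:
        return math.exp(w)
    return (lam * w + 1) ** (1 / lam)


lam = 0.197968156308491  # lambda reported for the training set

# Round trip on the first back-transformed forecast (Jan 2011)
w = box_cox(280.4265, lam)
print(w, inv_box_cox(w, lam))
```

The round trip returns the original value, which is the property the back-transformation relies on.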

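Since there is no seasonal component, the forecast is a smooth trend line. The RMSE is found to be:

#> [1] 302.9204

This is much worse than the test-set RMSE of the Holt-Winters' forecasts in the previous problem (94.81 without damping, 111.91 with damping), so this approach does not improve on the best previous forecasts. The forecasts shown here are for the seasonally adjusted series and carry no seasonal pattern, which may explain part of the gap.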