Chapter 7 Exponential smoothing
7.1 Consider the pigs series — the number of pigs slaughtered in Victoria each month.
a. Use the ses() function in R to find the optimal values of α and ℓ0, and generate forecasts for the next four months.
We can find the optimal values of α and ℓ0 from the summary of fitting parameters of the ses() function:
##
## Forecast method: Simple exponential smoothing
##
## Model Information:
## Simple exponential smoothing
##
## Call:
## ses(y = pigs, h = 4)
##
## Smoothing parameters:
## alpha = 0.2971
##
## Initial states:
## l = 77260.0561
##
## sigma: 10308.58
##
## AIC AICc BIC
## 4462.955 4463.086 4472.665
##
## Error measures:
## ME RMSE MAE MPE MAPE MASE
## Training set 385.8721 10253.6 7961.383 -0.922652 9.274016 0.7966249
## ACF1
## Training set 0.01282239
##
## Forecasts:
## Point Forecast Lo 80 Hi 80 Lo 95 Hi 95
## Sep 1995 98816.41 85605.43 112027.4 78611.97 119020.8
## Oct 1995 98816.41 85034.52 112598.3 77738.83 119894.0
## Nov 1995 98816.41 84486.34 113146.5 76900.46 120732.4
## Dec 1995 98816.41 83958.37 113674.4 76092.99 121539.8
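The ses() function estimated a smoothing parameter α of 0.30 and an initial level ℓ0 of 77260.06 (both rounded to two decimals). The point forecast for each of the next four months (September to December 1995) is 98816.41, because simple exponential smoothing produces flat forecasts.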
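To make the roles of α and ℓ0 concrete, here is a minimal base-R sketch of the simple exponential smoothing recursion. It is not the code behind the output above: `ses_by_hand` and the toy series are hypothetical, and α and ℓ0 are supplied by hand rather than estimated as ses() does.

```r
# Simple exponential smoothing written out by hand (base R only).
# ses() estimates alpha and l0; here they are supplied for illustration.
ses_by_hand <- function(y, alpha, l0) {
  level <- l0
  fitted <- numeric(length(y))
  for (t in seq_along(y)) {
    fitted[t] <- level                           # one-step forecast of y[t]
    level <- alpha * y[t] + (1 - alpha) * level  # update the level
  }
  list(fitted = fitted, level = level)
}

# Hypothetical toy series, NOT the pigs data
y <- c(10, 12, 11, 13, 12)
fit <- ses_by_hand(y, alpha = 0.3, l0 = 10)

# Every forecast horizon gets the last level, so the forecasts are flat
forecasts <- rep(fit$level, 4)
print(forecasts)
```

The flat forecast vector mirrors the four identical point forecasts in the ses() output.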
b. Compute a 95% prediction interval for the first forecast using ŷ ± 1.96s, where s is the standard deviation of the residuals. Compare your interval with the interval produced by R.
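We calculated a 95% prediction interval of 78679.97 to 118952.8. This is close to the interval produced by the ses() function, 78611.97 to 119020.8. The small difference arises because ses() uses its own estimate of the residual standard deviation (sigma: 10308.58 in the output above), which is slightly larger than the sample standard deviation of the residuals.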
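As a check on the comparison, the interval reported by ses() can be rebuilt from two numbers in the printed output: the first point forecast and sigma. The sketch below uses only base R. The variable names are ours, and the values are copied from the rounded output, so agreement is only expected to within about 0.1.

```r
# Values copied from the ses() output above (rounded as printed)
point_forecast <- 98816.41   # Sep 1995 point forecast
sigma <- 10308.58            # "sigma" line of the output

z <- qnorm(0.975)            # exact normal quantile, approximately 1.96
lower <- point_forecast - z * sigma
upper <- point_forecast + z * sigma

print(round(c(lower = lower, upper = upper), 2))
```

Replacing sigma with the sample standard deviation of the residuals gives the slightly narrower hand-computed interval.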
7.5 Data set books contains the daily sales of paperback and hardcover books at the same store. The task is to forecast the next four days’ sales for paperback and hardcover books.
a. Plot the series and discuss the main features of the data.
Both time series show a positive trend over the whole period. Although the data are daily book sales, we do not see any weekly seasonality. The variance of the two series appears to be similar.
b. Use the ses() function to forecast each series, and plot the forecasts.
The plot of the simple exponential smoothing forecasts shows prediction intervals whose width is consistent with the variability of the number of books sold. As expected, the flat forecasts do not account for the increasing trend observed in the historical values.
c. Compute the RMSE values for the training data in each case.
We can either calculate the RMSE from the residuals or extract the values directly from the forecast model:
RMSE hardcover from Residuals: 31.93
RMSE hardcover from Forecast: 31.93
RMSE paperback from Residuals: 33.64
RMSE paperback from Forecast: 33.64
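The "from residuals" route is just the square root of the mean squared residual. The sketch below shows it in base R on hypothetical residuals (not the books data) and also shows that the sample standard deviation from sd() is a slightly different quantity. With a fitted object from the forecast package, accuracy() reports the same training-set RMSE.

```r
# RMSE: square root of the mean squared residual
rmse <- function(e) sqrt(mean(e^2, na.rm = TRUE))

# Hypothetical residuals, NOT taken from the books series
e <- c(2, -2, 2, -2)

rmse_value <- rmse(e)   # divides by n
sd_value <- sd(e)       # subtracts the mean and divides by n - 1

print(c(rmse = rmse_value, sd = sd_value))
```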
7.6 We will continue with the daily sales of paperback and hardcover books in data set books.
a. Apply Holt’s linear method to the paperback and hardback series and compute four-day forecasts in each case.
The first four days of the paperback forecast are:
209.47, 210.72, 211.97, 213.22
The first four days of the hardcover forecast are:
250.17, 253.48, 256.78, 260.08
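Both sets of forecasts rise by a nearly constant amount per day (about 1.25 for paperback and about 3.30 for hardcover), which is what Holt's linear method produces: the last level plus h times the last trend. A minimal base-R sketch of the recursion follows. `holt_by_hand`, its parameter values, and the toy series are hypothetical, and the parameters are fixed rather than estimated as holt() does.

```r
# Holt's linear trend method written out by hand (base R only).
holt_by_hand <- function(y, alpha, beta, l0, b0, h = 4) {
  level <- l0
  trend <- b0
  for (t in seq_along(y)) {
    prev_level <- level
    level <- alpha * y[t] + (1 - alpha) * (prev_level + trend)
    trend <- beta * (level - prev_level) + (1 - beta) * trend
  }
  # h-step forecast: last level plus h times the last trend
  level + seq_len(h) * trend
}

# Hypothetical, perfectly linear toy series: 12, 14, ..., 30
y <- 10 + 2 * (1:10)
fc <- holt_by_hand(y, alpha = 0.5, beta = 0.1, l0 = 10, b0 = 2)
print(fc)
```

On this toy series the forecasts simply continue the straight line, with a constant step equal to the final trend estimate.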
b. Compare the RMSE measures of Holt’s method for the two series to those of simple exponential smoothing in the previous question. (Remember that Holt’s method is using one more parameter than SES.) Discuss the merits of the two forecasting methods for these data sets.
Let's compare the RMSE values from both series using S.E.S. and Holt's method:
RMSE hardcover from Forecast using S.E.S.: 31.93
RMSE hardcover from Forecast using Holt's: 27.19
RMSE paperback from Forecast using S.E.S.: 33.64
RMSE paperback from Forecast using Holt's: 31.14
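RMSE measures the typical size of the residuals. Holt's method gives a lower RMSE than S.E.S. for both series, which indicates a closer fit to the training data: it captures the increasing trend in the books series, which S.E.S. cannot. Because Holt's method uses one more parameter, some reduction in training RMSE is expected, so the lower value alone does not guarantee better forecasts.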
c. Compare the forecasts for the two series using both methods. Which do you think is best?
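Which one is the best method? Neither seems to be optimal. S.E.S. follows the level of the series but ignores the trend, while Holt's method captures the trend. Neither method models seasonality. Given the upward trend in both series, Holt's forecasts look more plausible, but a method that handles both trend and any seasonal pattern is what we would ideally need.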
d. Calculate a 95% prediction interval for the first forecast for each series, using the RMSE values and assuming normal errors. Compare your intervals with those produced using ses and holt.
95% prediction interval for:
Hardcover S.E.S: Upper = 304.34; Lower = 174.78
Hardcover Holt: Upper = 307.43; Lower = 192.92
Paperback S.E.S: Upper = 275.35; Lower = 138.87
Paperback Holt: Upper = 275.02; Lower = 143.91
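For comparison, the RMSE-based interval asked for in the question can be computed directly as ŷ ± 1.96 × RMSE. The sketch below does this for the two Holt forecasts, using the first-day point forecasts and RMSE values reported earlier in this document. The data frame layout is ours. The resulting intervals come out somewhat narrower than the ones listed above; one possible reason, which is an assumption and not something the output shows, is that the listed intervals use a residual standard deviation adjusted for the number of estimated parameters.

```r
z <- 1.96

# First-day Holt point forecasts and training RMSEs from earlier sections
holt_rmse <- data.frame(
  series = c("hardcover", "paperback"),
  point  = c(250.17, 209.47),
  rmse   = c(27.19, 31.14)
)

holt_rmse$lower <- holt_rmse$point - z * holt_rmse$rmse
holt_rmse$upper <- holt_rmse$point + z * holt_rmse$rmse

print(holt_rmse)
```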
7.7 For this exercise use data set eggs, the price of a dozen eggs in the United States from 1900–1993. Experiment with the various options in the holt() function to see how much the forecasts change with damped trend, or with a Box-Cox transformation. Try to develop an intuition of what each argument is doing to the forecasts.
[Hint: use h=100 when calling holt() so you can clearly see the differences between the various options when plotting the forecasts.]
Which model gives the best RMSE?
##
## RMSE Holt's free to choose coefficients: 1.03
##
## RMSE Holt's damped=FALSE, alpha=0.5, beta=0.5, h=100: 1.39
##
## RMSE Holt's damped=TRUE, alpha=0.5, beta=0.5, h=100: 1.22
##
## RMSE Holt's damped=FALSE, alpha=0.5, beta=0.3, h=100: 1.3
##
## RMSE Holt's damped=TRUE, alpha=0.5, beta=0.3, h=100: 1.16
##
## RMSE Holt's damped=FALSE, alpha=0.7, beta=0.5, h=100: 1.27
##
## RMSE Holt's damped=TRUE, alpha=0.7, beta=0.5, h=100: 1.17
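The model that is free to choose its own coefficients gives the lowest RMSE overall (1.03). Among the models with fixed alpha and beta, the damped version has a lower RMSE than its non-damped counterpart in every pair. This is somewhat surprising, because the damped forecasts do not appear to follow the overall trend of the historical series.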
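To build intuition for what damping does over a long horizon such as h=100, the sketch below compares the forecast path of a linear trend with a damped one, using the standard damped-trend forecast equation ŷ(t+h) = ℓ + (φ + φ² + ... + φ^h) b. The function name and the level, trend, and φ values are hypothetical, chosen only for illustration; nothing here is fitted to the eggs data.

```r
# Forecast path from a final level and trend, with damping parameter phi.
# phi = 1 gives the ordinary (non-damped) Holt forecast.
damped_forecast <- function(level, trend, phi, h) {
  horizons <- seq_len(h)
  level + cumsum(phi^horizons) * trend
}

# Hypothetical final states, NOT estimated from the eggs series
linear <- damped_forecast(level = 100, trend = 2, phi = 1,   h = 100)
damped <- damped_forecast(level = 100, trend = 2, phi = 0.9, h = 100)

# The linear path keeps growing; the damped path levels off near
# level + trend * phi / (1 - phi)
print(c(linear_h100 = linear[100], damped_h100 = damped[100]))
```

The damped path flattens because each further step adds a geometrically shrinking share of the trend.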
7.8 Recall your retail time series data (from Exercise 3 in Section 2.10).
a. Why is multiplicative seasonality necessary for this series?
We need multiplicative seasonality because the size of the seasonal fluctuations is not constant: it grows over time as the level of the series increases.
b. Apply Holt-Winters’ multiplicative method to the data. Experiment with making the trend damped.
In the forecast plots, the non-damped Holt-Winters' multiplicative method is the one that continues the increasing trend of the data series. Using the damped option almost entirely removes the trend from the forecasts.
c. Compare the RMSE of the one-step forecasts from the two methods. Which do you prefer?
We calculated the cross-validated one-step-ahead RMSE of the Holt-Winters' multiplicative method for two cases:
- Not damped: RMSE of 25.63
- Damped: RMSE of 25.8
For this data, the damped and non-damped forecasts give very similar RMSE values. The non-damped model has a slightly lower RMSE, but on this metric alone it is not possible to say that it is a better model than the damped one.
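The idea behind a cross-validated one-step RMSE is a rolling origin: fit on the first t observations, forecast observation t+1, record the error, then move the origin forward. The sketch below shows the mechanics in base R with a naive forecaster standing in for Holt-Winters. `one_step_cv_rmse`, `naive_fn`, and the toy series are hypothetical. The values above may well have come from tsCV() in the forecast package, but that is an assumption.

```r
# Rolling-origin, one-step-ahead cross-validation (base R only).
one_step_cv_rmse <- function(y, forecast_fn, min_train = 3) {
  n <- length(y)
  errors <- rep(NA_real_, n)
  for (t in min_train:(n - 1)) {
    # forecast y[t + 1] using only observations 1..t
    errors[t + 1] <- y[t + 1] - forecast_fn(y[1:t])
  }
  sqrt(mean(errors^2, na.rm = TRUE))
}

# Stand-in forecaster: repeat the last observed value
naive_fn <- function(train) tail(train, 1)

# Hypothetical toy series, NOT the retail data
y <- c(5, 7, 6, 8, 9, 8, 10)
cv_rmse <- one_step_cv_rmse(y, naive_fn)
print(cv_rmse)
```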
d. Check that the residuals from the best method look like white noise.
The residual plots of the two models look similar. Both show hints of remaining structure throughout the time range, including what appears to be seasonality from the start of the series until just before 1990, so the residuals do not look entirely like white noise.
e. Now find the test set RMSE, while training the model to the end of 2010. Can you beat the seasonal naïve approach from Exercise 8 in Section 3.7?
We calculated a test set RMSE of 12.28 for the seasonal naïve forecast and 7.48 for Holt-Winters.
A simpler model is easier to explain than a more complicated one. In this case, however, the Holt-Winters forecast is much more accurate on the RMSE metric, so it beats the seasonal naïve approach.
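The seasonal naïve benchmark is simple enough to write out: each forecast repeats the value from the same season one cycle earlier, and the test set RMSE is computed on held-out observations. The base-R sketch below uses a hypothetical quarterly toy series (period 4) to stay short; the retail data are monthly, so the period there would be 12. The function name and the numbers are ours.

```r
# Seasonal naive forecast: repeat the last observed seasonal cycle
snaive_forecast <- function(train, h, period) {
  last_cycle <- tail(train, period)
  rep(last_cycle, length.out = h)
}

rmse <- function(e) sqrt(mean(e^2))

# Hypothetical toy data: two training "years" of 4 seasons, one test "year"
train <- c(1, 2, 3, 4, 2, 3, 4, 5)
test  <- c(3, 4, 5, 6)

fc <- snaive_forecast(train, h = length(test), period = 4)
test_rmse <- rmse(test - fc)
print(test_rmse)
```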
7.9 For the same retail data, try an STL decomposition applied to the Box-Cox transformed series, followed by ETS on the seasonally adjusted data. How does that compare with your best previous forecasts on the test set?
Let's compare the RMSE values from three methods:
- ETS: 0
- Seasonal Naïve: 12.28
- Holt-Winters: 7.48
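The RMSE from the ETS forecast is so low that it appears as zero when rounded to two decimals, a much lower value than either the Seasonal Naïve or the Holt-Winters forecast. A value this small should be treated with caution: if it was computed on the Box-Cox transformed scale rather than on the original scale of the test set, it is not directly comparable with the other two.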
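An RMSE is only comparable across methods when all of them are measured on the same scale. The sketch below illustrates this with a hand-written Box-Cox transform in base R: the same forecasts give a tiny RMSE on the transformed scale and a much larger one after back-transforming. The helper functions, λ = 0, and the numbers are hypothetical; this is not a reconstruction of how the ETS figure above was obtained.

```r
# Box-Cox transform and its inverse (lambda = 0 is the log transform)
box_cox <- function(y, lambda) {
  if (lambda == 0) log(y) else (y^lambda - 1) / lambda
}
inv_box_cox <- function(w, lambda) {
  if (lambda == 0) exp(w) else (lambda * w + 1)^(1 / lambda)
}
rmse <- function(e) sqrt(mean(e^2))

lambda <- 0
actual <- c(100, 120, 150)                              # hypothetical test values
forecast_transformed <- box_cox(c(110, 115, 160), lambda)  # forecasts on the transformed scale

# RMSE on the transformed scale: looks tiny
rmse_transformed <- rmse(box_cox(actual, lambda) - forecast_transformed)

# RMSE after back-transforming: comparable with other methods
rmse_original <- rmse(actual - inv_box_cox(forecast_transformed, lambda))

print(c(transformed = rmse_transformed, original = rmse_original))
```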
Accuracy measures for a forecast model: https://pkg.robjhyndman.com/forecast/reference/accuracy.html
R - Confused on Residual Terminology: https://stats.stackexchange.com/questions/110999/r-confused-on-residual-terminology