Exercise 7.1

Consider the pigs series - the number of pigs slaughtered in Victoria each month.

summary(pigs)
##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
##   33873   79080   91662   90640  101493  120184
str(pigs)
##  Time-Series [1:188] from 1980 to 1996: 76378 71947 33873 96428 105084 ...
pigs 
##         Jan    Feb    Mar    Apr    May    Jun    Jul    Aug    Sep    Oct
## 1980  76378  71947  33873  96428 105084  95741 110647 100331  94133 103055
## 1981  76889  81291  91643  96228 102736 100264 103491  97027  95240  91680
## 1982  76892  85773  95210  93771  98202  97906 100306  94089 102680  77919
## 1983  81225  88357 106175  91922 104114 109959  97880 105386  96479  97580
## 1984  90974  98981 107188  94177 115097 113696 114532 120110  93607 110925
## 1985 103069 103351 111331 106161 111590  99447 101987  85333  86970 100561
## 1986  82719  79498  74846  73819  77029  78446  86978  75878  69571  75722
## 1987  63292  59380  78332  72381  55971  69750  85472  70133  79125  85805
## 1988  69069  79556  88174  66698  72258  73445  76131  86082  75443  73969
## 1989  66269  73776  80034  70694  81823  75640  75540  82229  75345  77034
## 1990  75982  78074  77588  84100  97966  89051  93503  84747  74531  91900
## 1991  81022  78265  77271  85043  95418  79568 103283  95770  91297 101244
## 1992  93866  95171 100183 103926 102643 108387  97077  90901  90336  88732
## 1993  73292  78943  94399  92937  90130  91055 106062 103560 104075 101783
## 1994  82413  83534 109011  96499 102430 103002  91815  99067 110067 101599
## 1995  88905  89936 106723  84307 114896 106749  87892 100506              
##         Nov    Dec
## 1980  90595 101457
## 1981 101259 109564
## 1982  93561 117062
## 1983 109490 110191
## 1984 103312 120184
## 1985  89543  89265
## 1986  64182  77357
## 1987  81778  86852
## 1988  78139  78646
## 1989  78589  79769
## 1990  81635  89797
## 1991 114525 101139
## 1992  83759  99267
## 1993  93791 102313
## 1994  97646 104930
## 1995

This data set contains 188 monthly observations, from January 1980 to August 1995. The minimum and maximum numbers of pigs slaughtered in a month are 33873 and 120184 respectively.

(a)

Use the ses() function in R to find the optimal values of α and ℓ0, and generate forecasts for the next four months.

ses_pigs <- ses(pigs, h = 4)  # simple exponential smoothing with a four-month horizon
summary(ses_pigs)
## 
## Forecast method: Simple exponential smoothing
## 
## Model Information:
## Simple exponential smoothing 
## 
## Call:
##  ses(y = pigs, h = 4) 
## 
##   Smoothing parameters:
##     alpha = 0.2971 
## 
##   Initial states:
##     l = 77260.0561 
## 
##   sigma:  10308.58
## 
##      AIC     AICc      BIC 
## 4462.955 4463.086 4472.665 
## 
## Error measures:
##                    ME    RMSE      MAE       MPE     MAPE      MASE       ACF1
## Training set 385.8721 10253.6 7961.383 -0.922652 9.274016 0.7966249 0.01282239
## 
## Forecasts:
##          Point Forecast    Lo 80    Hi 80    Lo 95    Hi 95
## Sep 1995       98816.41 85605.43 112027.4 78611.97 119020.8
## Oct 1995       98816.41 85034.52 112598.3 77738.83 119894.0
## Nov 1995       98816.41 84486.34 113146.5 76900.46 120732.4
## Dec 1995       98816.41 83958.37 113674.4 76092.99 121539.8

The optimal value of α is 0.2971 and the optimal initial level ℓ0 is 77260.0561.
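As a quick sanity check, here is a minimal sketch of the SES recursion using the fitted parameters above (α and ℓ0 hard-coded from the summary output); the final level should reproduce the flat point forecast.

# SES recursion: l[t] = alpha * y[t] + (1 - alpha) * l[t-1]; the forecast for
# every future horizon is the final level. Parameter values from the output above.
alpha <- 0.2971
level <- 77260.0561
for (y in pigs) {
  level <- alpha * y + (1 - alpha) * level
}
level  # approximately 98816, matching the point forecast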

autoplot(ses_pigs) +
  autolayer(fitted(ses_pigs), series = "Fitted") +
  labs(title = "Forecasting - # of Pigs Slaughtered through Simple Exponential Smoothing")

The shaded region shows the 80% and 95% prediction intervals for the number of pigs slaughtered over the forecast horizon.

(b)

Compute a 95% prediction interval for the first forecast using ŷ ± 1.96s, where s is the standard deviation of the residuals, and compare it with the interval produced by R. (The small α found above also explains why the fitted line in the plot is smoother than the original data.)

# Using R
print(paste0("The 95% prediction interval using R for is ", ses_pigs$upper[1,2], " and ",  ses_pigs$lower[1,2]))
## [1] "The 95% prediction interval using R for is 119020.843765435 and 78611.9684577467"

Now we will calculate the 95% prediction interval using the formula:

# Using formula
s <- sd(residuals(ses_pigs))
print(paste0("The 95% confidence interval using formula is ",ses_pigs$mean[1] + 1.96*s, " and ", ses_pigs$mean[1] - 1.96*s))
## [1] "The 95% confidence interval using formula is 118952.844969765 and 78679.9672534162"

There is a slight difference between the two intervals: the interval from R is a little wider than the one from the formula, because R estimates the standard deviation from the fitted model rather than from the sample standard deviation of the residuals, as explained in Hyndman's book.
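To see where the difference comes from, we can compare the two estimates of s directly (a sketch; ses() stores its fitted ETS model in the $model component):

# The manual interval uses the sample SD of the residuals, while R uses the
# sigma estimated by the underlying ETS(A,N,N) model (10308.58 in the summary).
sd(residuals(ses_pigs))      # estimate used in the formula above
sqrt(ses_pigs$model$sigma2)  # estimate used by R's prediction interval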

Exercise 7.5

Data set books contains the daily sales of paperback and hardcover books at the same store. The task is to forecast the next four days’ sales for paperback and hardcover books.

(a)

Plot the series and discuss the main features of the data.

books
## Time Series:
## Start = 1 
## End = 30 
## Frequency = 1 
##    Paperback Hardcover
##  1       199       139
##  2       172       128
##  3       111       172
##  4       209       139
##  5       161       191
##  6       119       168
##  7       195       170
##  8       195       145
##  9       131       184
## 10       183       135
## 11       143       218
## 12       141       198
## 13       168       230
## 14       201       222
## 15       155       206
## 16       243       240
## 17       225       189
## 18       167       222
## 19       237       158
## 20       202       178
## 21       186       217
## 22       176       261
## 23       232       238
## 24       195       240
## 25       190       214
## 26       182       200
## 27       222       201
## 28       217       283
## 29       188       220
## 30       247       259
summary(books)
##    Paperback       Hardcover    
##  Min.   :111.0   Min.   :128.0  
##  1st Qu.:167.2   1st Qu.:170.5  
##  Median :189.0   Median :200.5  
##  Mean   :186.4   Mean   :198.8  
##  3rd Qu.:207.2   3rd Qu.:222.0  
##  Max.   :247.0   Max.   :283.0
autoplot(books)

The books data set contains 30 days of sales for paperback and hardcover books. Paperback sales range from 111 to 247 per day with a mean of 186.4; hardcover sales range from 128 to 283 with a mean of 198.8. The mean, median, and maximum daily sales for hardcover are all higher than for paperback, as the plot also shows. Both series trend upward over the 30 days, with large day-to-day fluctuations.

(b)

Use the ses() function to forecast each series, and plot the forecasts.

# Isolating the data for paperback and hardcover
paperback <- ses(books[,'Paperback'], h=4) # h = 4 sets the forecast horizon to four days
hardcover <- ses(books[,'Hardcover'], h=4)

# Calculating the forecast using ses function
summary(paperback)
## 
## Forecast method: Simple exponential smoothing
## 
## Model Information:
## Simple exponential smoothing 
## 
## Call:
##  ses(y = books[, "Paperback"], h = 4) 
## 
##   Smoothing parameters:
##     alpha = 0.1685 
## 
##   Initial states:
##     l = 170.8271 
## 
##   sigma:  34.8183
## 
##      AIC     AICc      BIC 
## 318.9747 319.8978 323.1783 
## 
## Error measures:
##                    ME     RMSE     MAE       MPE     MAPE      MASE       ACF1
## Training set 7.175981 33.63769 27.8431 0.4736071 15.57784 0.7021303 -0.2117522
## 
## Forecasts:
##    Point Forecast    Lo 80    Hi 80    Lo 95    Hi 95
## 31       207.1097 162.4882 251.7311 138.8670 275.3523
## 32       207.1097 161.8589 252.3604 137.9046 276.3147
## 33       207.1097 161.2382 252.9811 136.9554 277.2639
## 34       207.1097 160.6259 253.5935 136.0188 278.2005
summary(hardcover)
## 
## Forecast method: Simple exponential smoothing
## 
## Model Information:
## Simple exponential smoothing 
## 
## Call:
##  ses(y = books[, "Hardcover"], h = 4) 
## 
##   Smoothing parameters:
##     alpha = 0.3283 
## 
##   Initial states:
##     l = 149.2861 
## 
##   sigma:  33.0517
## 
##      AIC     AICc      BIC 
## 315.8506 316.7737 320.0542 
## 
## Error measures:
##                    ME     RMSE      MAE      MPE     MAPE      MASE       ACF1
## Training set 9.166735 31.93101 26.77319 2.636189 13.39487 0.7987887 -0.1417763
## 
## Forecasts:
##    Point Forecast    Lo 80    Hi 80    Lo 95    Hi 95
## 31       239.5601 197.2026 281.9176 174.7799 304.3403
## 32       239.5601 194.9788 284.1414 171.3788 307.7414
## 33       239.5601 192.8607 286.2595 168.1396 310.9806
## 34       239.5601 190.8347 288.2855 165.0410 314.0792
# Creating plot
autoplot(books) + 
  autolayer(paperback, series="Paperback", PI=FALSE)+
  autolayer(hardcover, series="Hardcover", PI=FALSE)

α for Paperback (0.1685) is smaller than α for Hardcover (0.3283), so the fitted line for the Paperback series is smoother, while the Hardcover fit reacts more strongly to recent observations. The initial level ℓ0 is smaller for Hardcover (149.29 vs 170.83), reflecting its lower sales at the start of the series.
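Rather than reading the parameters off the printed summaries, they can also be pulled from the fitted models directly (a sketch; the ets model stored in $model keeps its parameters in a named vector $par):

# Fitted smoothing parameter and initial level for each series
paperback$model$par[c("alpha", "l")]
hardcover$model$par[c("alpha", "l")]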

(c)

Compute the RMSE values for the training data in each case.

Treating the full books data set as the training set, below I'll compute the RMSE for the Paperback and Hardcover series.

# RMSE 
print(paste0("RMSE for paperback series is ", sqrt(mean(paperback$residuals^2)), " and the RMSE for Hardcover series is ", sqrt(mean(hardcover$residuals^2))))
## [1] "RMSE for paperback series is 33.637686782912 and the RMSE for Hardcover series is 31.9310149844547"
# Double checking
accuracy(paperback)
##                    ME     RMSE     MAE       MPE     MAPE      MASE       ACF1
## Training set 7.175981 33.63769 27.8431 0.4736071 15.57784 0.7021303 -0.2117522
accuracy(hardcover)
##                    ME     RMSE      MAE      MPE     MAPE      MASE       ACF1
## Training set 9.166735 31.93101 26.77319 2.636189 13.39487 0.7987887 -0.1417763

As verified with the accuracy() function on each series, the RMSE is 33.638 for Paperback and 31.931 for Hardcover, so the SES model fits the Hardcover series slightly better. The larger α for Hardcover also explains why its fitted values track the data more closely in the plot above.

Exercise 7.6

We will continue with the daily sales of paperback and hardcover books in dataset books.

(a)

Apply Holt’s linear method to the paperback and hardback series and compute four-day forecasts in each case.

# paperback
paperback_holt <- holt(books[, 'Paperback'],h=4)
forecast(paperback_holt)
##    Point Forecast    Lo 80    Hi 80    Lo 95    Hi 95
## 31       209.4668 166.6035 252.3301 143.9130 275.0205
## 32       210.7177 167.8544 253.5811 145.1640 276.2715
## 33       211.9687 169.1054 254.8320 146.4149 277.5225
## 34       213.2197 170.3564 256.0830 147.6659 278.7735
# hardcover
hardcover_holt <- holt(books[, 'Hardcover'], h=4)
forecast(hardcover_holt)
##    Point Forecast    Lo 80    Hi 80    Lo 95    Hi 95
## 31       250.1739 212.7390 287.6087 192.9222 307.4256
## 32       253.4765 216.0416 290.9113 196.2248 310.7282
## 33       256.7791 219.3442 294.2140 199.5274 314.0308
## 34       260.0817 222.6468 297.5166 202.8300 317.3334
# Plotting both series
autoplot(books) +
  autolayer(paperback_holt, series="Paperback", PI=FALSE)+
  autolayer(hardcover_holt, series="Hardcover", PI=FALSE)

(b)

Compare the RMSE measures of Holt’s method for the two series to those of simple exponential smoothing in the previous question. (Remember that Holt’s method is using one more parameter than SES.) Discuss the merits of the two forecasting methods for these data sets.

# Calculating RMSE for both series
accuracy(paperback_holt)
##                     ME     RMSE      MAE       MPE     MAPE      MASE
## Training set -3.717178 31.13692 26.18083 -5.508526 15.58354 0.6602122
##                    ACF1
## Training set -0.1750792
accuracy(hardcover_holt)
##                      ME     RMSE      MAE       MPE    MAPE      MASE
## Training set -0.1357882 27.19358 23.15557 -2.114792 12.1626 0.6908555
##                     ACF1
## Training set -0.03245186

Holt's method reduces the training RMSE for both series: from 33.638 to 31.137 for Paperback, and from 31.931 to 27.194 for Hardcover. Some improvement is expected simply because Holt's method uses one more parameter than SES, but the size of the gain, especially for the trending Hardcover series, suggests the linear trend is genuinely useful here.
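Since adding a parameter almost always reduces training RMSE, a fairer comparison uses an information criterion that penalises the extra parameter. A sketch comparing the AICc values stored on the fitted models:

# AICc penalises Holt's extra trend parameters; lower is better
c(SES = paperback$model$aicc, Holt = paperback_holt$model$aicc)
c(SES = hardcover$model$aicc, Holt = hardcover_holt$model$aicc)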

(c)

Compare the forecasts for the two series using both methods. Which do you think is better?

# Paperback
autoplot(books[,1])+
  autolayer(paperback, series="Simple Exponential Smoothing", PI=FALSE)+
  autolayer(paperback_holt, series='Holt Method', PI=FALSE) + labs(title="Paperback Sales: Simple Exponential Smoothing vs Holt's Method")

# Hardcover
autoplot(books[,2])+
  autolayer(hardcover, series="Simple Exponential Smoothing", PI=FALSE)+
  autolayer(hardcover_holt, series='Holt Method', PI=FALSE) + labs(title="Hardcover Sales: Simple Exponential Smoothing vs Holt's Method")

Based on the RMSE values and the plots above, Holt's method is the better choice for these data: SES produces flat forecasts, while Holt's method extrapolates the upward trend visible in both series.

Exercise 7.7

For this exercise use data set eggs, the price of a dozen eggs in the United States from 1900–1993. Experiment with the various options in the holt() function to see how much the forecasts change with a damped trend, or with a Box-Cox transformation. Try to develop an intuition of what each argument is doing to the forecasts.

Here, I'm going to fit several holt() variants and plot them together to compare their forecasts.

main <- holt(eggs, h=100)                                          # Holt's linear trend
damped <- holt(eggs, damped=TRUE, h=100)                           # damped trend
exponential <- holt(eggs, exponential = TRUE, h=100)               # exponential trend
lambda <- holt(eggs, lambda='auto', damped=TRUE, h=100)            # Box-Cox transformed, damped
damped_exp <- holt(eggs, exponential = TRUE, damped = TRUE, h=100) # exponential and damped
damped_boxcox <- holt(eggs, lambda='auto', damped=TRUE, biasadj = TRUE, h=100) # Box-Cox transformed, damped, bias-adjusted

# Creating plot
autoplot(eggs)+
  autolayer(main, series='Basic', PI=FALSE)+
  autolayer(damped, series='damped', PI=FALSE)+
  autolayer(exponential, series='exponential', PI=FALSE)+
  autolayer(lambda, series='lambda', PI=FALSE)+
  autolayer(damped_exp, series='damped and exponential', PI=FALSE)+
  autolayer(damped_boxcox, series='damped and boxcox transformed', PI=FALSE)+
  labs(title="Forecast - Different Methods")

The graph shows how widely the forecasts diverge: depending on the options, the 100-year forecasts range from negative prices to almost 70. The basic model is clearly inadequate, since a price cannot fall below zero. Next I'll check the RMSE values for these methods to see which has the smallest training error.
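We can confirm the negative forecasts directly (a quick sketch):

# The basic Holt model extrapolates the downward trend below zero
min(main$mean)      # the lowest forecast price is negative
sum(main$mean < 0)  # number of forecast months below zero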

Now let’s create accuracy table for all models to see which model has less errors.

accuracy_df <- rbind(accuracy(main), accuracy(damped), accuracy(exponential), accuracy(lambda), accuracy(damped_exp), accuracy(damped_boxcox))
rownames(accuracy_df) <- c('Basic', 'Damped', 'Exponential', 'Lambda', 'Damped and Exponential', 'Damped and Box-Cox Transformed')
accuracy_df %>% kable() %>% kable_styling()
                                        ME     RMSE      MAE       MPE      MAPE      MASE       ACF1
Basic                            0.0449909 26.58219 19.18491 -1.142201  9.653791 0.9463626  0.0134820
Damped                          -2.8914955 26.54019 19.27950 -2.907633 10.018944 0.9510287 -0.0031954
Exponential                      0.4918791 26.49795 19.29399 -1.263235  9.766049 0.9517436  0.0103908
Lambda                          -0.8200445 26.53321 19.45654 -2.019718  9.976131 0.9597618  0.0058524
Damped and Exponential          -0.9089678 26.59113 19.54973 -2.125756 10.023283 0.9643590  0.0137612
Damped and Box-Cox Transformed  -1.8062134 26.58589 19.55896 -2.584250 10.117605 0.9648141  0.0053221

According to these results, the exponential-trend model has the lowest training RMSE (26.498), although the differences between the methods are small.
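Training errors can flatter a model, so a time series cross-validation comparison would be more robust. A sketch for the exponential-trend model using forecast::tsCV (the same pattern works for the other variants):

# One-step-ahead cross-validated RMSE for the exponential-trend Holt model
e <- tsCV(eggs, function(y, h) holt(y, exponential = TRUE, h = h), h = 1)
sqrt(mean(e^2, na.rm = TRUE))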

Exercise 7.8

Recall your retail time series data (from Exercise 3 in Section 2.10).

retail <- readxl::read_excel("retail.xlsx", skip=1)
myts <- ts(retail[,"A3349873A"],
  frequency=12, start=c(1982,4))

(a)

Why is multiplicative seasonality necessary for this series?

autoplot(myts)

Multiplicative seasonality is necessary when the size of the seasonal variations grows with the level of the series, which is exactly what the plot above shows: the seasonal swings get larger as the series rises, so an additive seasonal model would not fit well.
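One way to check this numerically (a sketch): the within-year standard deviation grows with the level of the series, which is the signature of multiplicative seasonality.

# Within-year spread of the retail series increases over time
sd_by_year <- tapply(myts, floor(time(myts)), sd)
plot(as.numeric(names(sd_by_year)), sd_by_year, type = "l",
     xlab = "Year", ylab = "Within-year SD")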

(b)

Apply Holt-Winters’ multiplicative method to the data. Experiment with making the trend damped.

# Holt-Winters' Multiplicative method
hw_multip <- hw(myts, seasonal = 'multiplicative')
autoplot(hw_multip)

# Hold-winter's method and damped
hw_multip_damped <- hw(myts, seasonal='multiplicative', damped=TRUE)
autoplot(hw_multip_damped)

With the damped trend, the prediction intervals are narrower and the forecasts flatten out. We cannot judge which method performs better until we compare their accuracy below.
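The narrowing can be checked directly (a sketch comparing the 95% interval widths at the final forecast horizon, h = 24 by default for hw()):

# Width of the 95% prediction interval at the last horizon: damped vs non-damped
tail(hw_multip$upper[, "95%"] - hw_multip$lower[, "95%"], 1)
tail(hw_multip_damped$upper[, "95%"] - hw_multip_damped$lower[, "95%"], 1)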

(c)

Compare the RMSE of the one-step forecasts from the two methods. Which do you prefer?

# Checking for accuracy
accuracy(hw_multip)
##                     ME     RMSE      MAE        MPE     MAPE      MASE
## Training set 0.1170648 13.29378 8.991856 -0.1217735 3.918351 0.4748948
##                    ACF1
## Training set 0.08635577
accuracy(hw_multip_damped)
##                    ME     RMSE      MAE       MPE     MAPE      MASE       ACF1
## Training set 1.414869 13.30494 9.042151 0.6105987 3.959617 0.4775511 0.04077895

The RMSE values are nearly identical (13.294 vs 13.305), with the basic Holt-Winters' multiplicative method performing marginally better on the training data than the damped version.

(d)

Check that the residuals from the best method look like white noise.

checkresiduals(hw_multip)

## 
##  Ljung-Box test
## 
## data:  Residuals from Holt-Winters' multiplicative method
## Q* = 40.405, df = 8, p-value = 2.692e-06
## 
## Model df: 16.   Total lags used: 24

The histogram suggests the residuals are roughly normally distributed, but the Ljung-Box test strongly rejects the white-noise hypothesis (p ≈ 2.7e-06), so the residuals are autocorrelated and the method is not capturing all of the information in the data.

(e)

Now find the test set RMSE, while training the model to the end of 2010. Can you beat the seasonal naive approach from Exercise 8 in section 3.7?

We are going to fit the training set with three methods: seasonal naive, Holt-Winters' multiplicative, and Holt-Winters' additive with a Box-Cox transformation.

I'll start by splitting the data into training and test sets as instructed and plotting them.

# Training and test datasets
training <- window(myts, end=c(2010,12))
test <- window(myts, start=c(2011,1))

# Plot
autoplot(myts)+
  autolayer(training, series="Training")+
  autolayer(test, series="Test")

# Fitting the three models on the training set
fit_snaive <- snaive(training, h=36)
fit1_multip <- hw(training, h=36, seasonal='multiplicative', damped=FALSE)
fit2_additive <- hw(training, h=36, seasonal='additive', damped=FALSE, lambda='auto')

autoplot(test, series='Test')+
  autolayer(fit_snaive, series="Seasonal Naive",PI=FALSE)+
  autolayer(fit1_multip, series="Holt-Winters' Multiplicative", PI=FALSE)+
  autolayer(fit2_additive, series="Holt-Winters' Additive", PI=FALSE)

Now let's compare the test-set RMSE values.

library(Metrics)
## 
## Attaching package: 'Metrics'
## The following object is masked from 'package:forecast':
## 
##     accuracy
test_rmse <- c(rmse(test, fit_snaive$mean), rmse(test, fit1_multip$mean), rmse(test, fit2_additive$mean))
names(test_rmse) <- c("Seasonal Naive", "Holt Winter Multiplicative", "Holt-Winter Additive")
test_rmse %>% kable() %>% kable_styling()
Method                        Test RMSE
Seasonal Naive                100.00869
Holt-Winters Multiplicative    94.80662
Holt-Winters Additive          99.21057

Holt-Winters' multiplicative method performs best, with a test RMSE of 94.81 against 99.21 for the additive model and 100.01 for seasonal naive, so it does beat the seasonal naive approach from Exercise 8.
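The same comparison can be made with forecast::accuracy, which also reports the other error measures (a sketch; the namespace is spelled out because Metrics masks accuracy()):

# Test-set RMSE via forecast::accuracy
forecast::accuracy(fit_snaive, test)["Test set", "RMSE"]
forecast::accuracy(fit1_multip, test)["Test set", "RMSE"]
forecast::accuracy(fit2_additive, test)["Test set", "RMSE"]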

Exercise 7.9

For the same retail data, try an STL decomposition applied to the Box-Cox transformed series, followed by ETS on the seasonally adjusted data. How does that compare with your best previous forecasts on the test set?

# Box-Cox parameter estimated from the training set
l <- BoxCox.lambda(training)
# stlf(): STL decomposition of the Box-Cox transformed series, ETS on the
# seasonally adjusted component, then reseasonalised and back-transformed
fit_stl <- stlf(training, lambda=l)
# For comparison: ETS fitted to a classically seasonally adjusted series
fit_ets <- ets(seasadj(decompose(training, "multiplicative")))

# Plot
autoplot(training, series="train") +
  autolayer(forecast(fit_stl, h = 24, PI=F), series = "STL Forecast") +
  autolayer(forecast(fit_ets, h = 24, PI=F), series = "ETS Forcast") +
  autolayer(test, series = "test")
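To answer the comparison question numerically, here is a sketch of the test-set RMSE for the STL + ETS forecasts. Note that stlf() used its default horizon of 2 × frequency = 24 months, so the comparison runs over those 24 months only and is therefore only indicative against the 36-month RMSE of 94.81 for Holt-Winters' multiplicative above.

# Test RMSE for the STL + ETS forecast over its 24-month horizon
sqrt(mean((window(test, end = c(2012, 12)) - fit_stl$mean)^2))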