Forecasting Principles and Practice

Load required packages

library(knitr)
library(fma)        # also loads the forecast package (ses, holt, hw, etc.)
library(tidyverse)
library(gridExtra)
library(openxlsx)

7.1

Answer

a. Let’s use the ses() function in R to generate forecasts for the next four months.

First, let’s explore the pigs series:

str(pigs)
##  Time-Series [1:188] from 1980 to 1996: 76378 71947 33873 96428 105084 ...
findfrequency(pigs)
## [1] 3
tsdisplay(pigs)

As we see from the data description, pigs has 188 observations, giving the number of pigs slaughtered in Victoria each month; findfrequency() estimates a dominant frequency of 3, although the series itself is monthly.

Also from the time series plots, we see no particular trend or seasonality.

Before applying ses() to our time series data, let’s plot the series using autoplot():

autoplot(pigs)

As the graph above confirms, there is no clear trend or seasonality. Therefore, we will use simple exponential smoothing, which forecasts future values using a weighted average of all previous values in the series.

Now, let’s use ses() to fit a simple exponential smoothing model with h = 4 (months):

pigs_ses <- ses(pigs, h=4)
pigs_ses$model
## Simple exponential smoothing 
## 
## Call:
##  ses(y = pigs, h = 4) 
## 
##   Smoothing parameters:
##     alpha = 0.2971 
## 
##   Initial states:
##     l = 77260.0561 
## 
##   sigma:  10308.58
## 
##      AIC     AICc      BIC 
## 4462.955 4463.086 4472.665

We see that alpha = 0.2971 (the smoothing parameter), which is relatively close to 0. This means the model adapts slowly: older observations receive comparatively more weight, so the history of the series has a larger influence on the forecasts than the most recent changes.

Let’s plot the ses() model:

autoplot(pigs_ses)

The optimal values estimated by ses() are:

alpha = 0.2971, l = 77260.0561
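As a quick sanity check, we can reproduce the point forecast by hand from the SES recursion l_t = alpha * y_t + (1 - alpha) * l_{t-1}, using the rounded optimal values above (a minimal sketch; ses() does the same thing internally with unrounded parameters, so the result only matches approximately):

# Manual SES recursion: level_t = alpha * y_t + (1 - alpha) * level_{t-1}
alpha <- 0.2971          # optimal smoothing parameter reported by ses()
level <- 77260.0561      # optimal initial level l reported by ses()
for (y in as.numeric(pigs)) {
  level <- alpha * y + (1 - alpha) * level
}
level                    # final level = SES point forecast for all horizons
pigs_ses$mean[1]         # should be close to the value above (~98816)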

Let’s look at the predictions for the next 4 months, as asked in the question:

predict(pigs_ses)
##          Point Forecast    Lo 80    Hi 80    Lo 95    Hi 95
## Sep 1995       98816.41 85605.43 112027.4 78611.97 119020.8
## Oct 1995       98816.41 85034.52 112598.3 77738.83 119894.0
## Nov 1995       98816.41 84486.34 113146.5 76900.46 120732.4
## Dec 1995       98816.41 83958.37 113674.4 76092.99 121539.8

b.

  1. Calculating the prediction interval using ses():
pigs_ses.prediction.interval = list(pigs_ses$lower[1,2], pigs_ses$upper[1,2])
unlist(pigs_ses.prediction.interval)
##       95%       95% 
##  78611.97 119020.84
  2. Calculating the prediction interval using the formula given in the question:

First, let’s calculate standard deviation:

sd = sd(pigs_ses$residuals)
cat("Standard deviation of Residuals =", sd)
## Standard deviation of Residuals = 10273.69
lower = pigs_ses$mean[1] - 1.96 * sd
upper = pigs_ses$mean[1] + 1.96 * sd
calc.prediction.interval = list(lower, upper)
unlist(calc.prediction.interval)
## [1]  78679.97 118952.84

Comparing the two: the interval computed with the formula (78679.97 to 118952.84) is slightly narrower than the one produced by R (78611.97 to 119020.84); its lower limit is higher and its upper limit is lower.
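Most of the remaining gap comes from which spread estimate is used: ses() builds its interval from the model’s estimated sigma (10308.58 in the model summary) rather than from sd(residuals) (10273.69). A minimal sketch, assuming the fitted ETS model stores this variance in pigs_ses$model$sigma2:

# 95% interval using the model's own sigma estimate instead of sd(residuals)
sigma_hat <- sqrt(pigs_ses$model$sigma2)
pigs_ses$mean[1] + c(-1.96, 1.96) * sigma_hat
# compare with the interval reported by ses()
c(pigs_ses$lower[1, "95%"], pigs_ses$upper[1, "95%"])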

7.5

Answer

a.

Plot the books series using autoplot():

autoplot(books)

As we see from the graph above, both paperback and hardcover sales show an increasing trend, accompanied by regular dips that suggest some seasonality.

Let’s explore the dataset books

summary(books)
##    Paperback       Hardcover    
##  Min.   :111.0   Min.   :128.0  
##  1st Qu.:167.2   1st Qu.:170.5  
##  Median :189.0   Median :200.5  
##  Mean   :186.4   Mean   :198.8  
##  3rd Qu.:207.2   3rd Qu.:222.0  
##  Max.   :247.0   Max.   :283.0

As we see above, the minimum sales of paperback (111) are lower than those of hardcover (128), and the maximum sales of paperback (247) are also lower than those of hardcover (283), which shows that hardcover sales run slightly higher than paperback sales.

b.

Let’s split the paperback and hardcover columns of the dataset into two separate series:

df = data.frame(books)
pc <- df$Paperback
hc <- df$Hardcover
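Note that df$Paperback and df$Hardcover are plain numeric vectors, so the forecasts below are indexed 31, 32, … rather than by the original time index. An alternative (minimal sketch) that keeps the time-series attributes is to subset the ts object directly, using the column names shown in the summary above; either form works with ses():

# Subsetting the multivariate ts directly preserves the time index
pc_ts <- books[, "Paperback"]
hc_ts <- books[, "Hardcover"]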

Paperback

Now, let’s use ses() to fit a simple exponential smoothing model:

pc_ses <- ses(pc, h = 4)

Now let’s explore the model:

pc_ses$model
## Simple exponential smoothing 
## 
## Call:
##  ses(y = pc, h = 4) 
## 
##   Smoothing parameters:
##     alpha = 0.1685 
## 
##   Initial states:
##     l = 170.8271 
## 
##   sigma:  34.8183
## 
##      AIC     AICc      BIC 
## 318.9747 319.8978 323.1783

We see that alpha = 0.1685, which is close to 0, so the model adapts slowly: older observations carry relatively more weight in the forecasts.

Let’s plot the ses() model:

autoplot(pc_ses)

Let’s look at the predictions for the next 4 days, as asked in the question:

predict(pc_ses)
##    Point Forecast    Lo 80    Hi 80    Lo 95    Hi 95
## 31       207.1097 162.4882 251.7311 138.8670 275.3523
## 32       207.1097 161.8589 252.3604 137.9046 276.3147
## 33       207.1097 161.2382 252.9811 136.9554 277.2639
## 34       207.1097 160.6259 253.5935 136.0188 278.2005

The results above are the forecasted sales values for paperback for the next four days.

Hardcover

Now, let’s use ses() to fit a simple exponential smoothing model:

hc_ses <- ses(hc, h = 4)

Now let’s explore the model:

hc_ses$model
## Simple exponential smoothing 
## 
## Call:
##  ses(y = hc, h = 4) 
## 
##   Smoothing parameters:
##     alpha = 0.3283 
## 
##   Initial states:
##     l = 149.2861 
## 
##   sigma:  33.0517
## 
##      AIC     AICc      BIC 
## 315.8506 316.7737 320.0542

We see that alpha = 0.3283, which is still closer to 0 than to 1, so the model again gives relatively more weight to older observations than to the most recent ones.

Let’s plot the ses() model:

autoplot(hc_ses)

Let’s look at the predictions for the next 4 days, as asked in the question:

predict(hc_ses)
##    Point Forecast    Lo 80    Hi 80    Lo 95    Hi 95
## 31       239.5601 197.2026 281.9176 174.7799 304.3403
## 32       239.5601 194.9788 284.1414 171.3788 307.7414
## 33       239.5601 192.8607 286.2595 168.1396 310.9806
## 34       239.5601 190.8347 288.2855 165.0410 314.0792

The results above are the forecasted sales values for hardcover for the next four days.

c.

Now, let’s calculate RMSE values for each:

pc.ses.acc = accuracy(pc_ses)
pc.ses.acc
##                    ME     RMSE     MAE       MPE     MAPE      MASE       ACF1
## Training set 7.175981 33.63769 27.8431 0.4736071 15.57784 0.7021303 -0.2117522
hc.ses.acc = accuracy(hc_ses)
hc.ses.acc
##                    ME     RMSE      MAE      MPE     MAPE      MASE       ACF1
## Training set 9.166735 31.93101 26.77319 2.636189 13.39487 0.7987887 -0.1417763

Therefore, the training RMSE for paperback (33.64) is higher than that for hardcover (31.93).
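For reference, the RMSE reported by accuracy() is simply the root mean squared error of the training residuals, which we can verify directly (a quick sketch):

# RMSE = sqrt(mean(squared training residuals))
sqrt(mean(residuals(pc_ses)^2))   # should match the RMSE for paperback above
sqrt(mean(residuals(hc_ses)^2))   # should match the RMSE for hardcover above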

7.6

Answer

a.

Holt’s method makes predictions for data with a trend using two smoothing parameters, α and β, which correspond to the level and trend components respectively. The point forecasts form a straight line with (generally) non-zero slope extending onward from the last observation.
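For intuition, Holt’s recursions can be written out directly. Below is a minimal sketch of the level and trend updates (holt_by_hand is a hypothetical helper, not part of the forecast package; holt() estimates alpha, beta and the initial states by maximum likelihood rather than taking them as inputs):

# Holt's linear method:
#   level_t = alpha * y_t + (1 - alpha) * (level_{t-1} + trend_{t-1})
#   trend_t = beta * (level_t - level_{t-1}) + (1 - beta) * trend_{t-1}
#   h-step forecast = level_n + h * trend_n
holt_by_hand <- function(y, alpha, beta, l0, b0, h = 4) {
  level <- l0
  trend <- b0
  for (t in seq_along(y)) {
    prev_level <- level
    level <- alpha * y[t] + (1 - alpha) * (level + trend)
    trend <- beta * (level - prev_level) + (1 - beta) * trend
  }
  level + (1:h) * trend   # point forecasts for horizons 1..h
}

Plugging in the parameter estimates reported below approximately reproduces the holt() point forecasts.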

Let’s apply Holt’s linear method to the paperback and hardcover series:

Paperback

pc_holt = holt(pc, h=4)
pc_holt$model
## Holt's method 
## 
## Call:
##  holt(y = pc, h = 4) 
## 
##   Smoothing parameters:
##     alpha = 1e-04 
##     beta  = 1e-04 
## 
##   Initial states:
##     l = 170.699 
##     b = 1.2621 
## 
##   sigma:  33.4464
## 
##      AIC     AICc      BIC 
## 318.3396 320.8396 325.3456
#pc_holt.acc = accuracy(pc_holt)
#pc_holt.acc

We see that alpha = 1e-04 and beta = 1e-04, both essentially zero: the level and slope are barely updated as new observations arrive, so the fit is effectively a straight trend line through the data, and the full history dominates the forecasts.

Displaying the forecasted sales values of paperback using the holt() method:

predict(pc_holt)
##    Point Forecast    Lo 80    Hi 80    Lo 95    Hi 95
## 31       209.4668 166.6035 252.3301 143.9130 275.0205
## 32       210.7177 167.8544 253.5811 145.1640 276.2715
## 33       211.9687 169.1054 254.8320 146.4149 277.5225
## 34       213.2197 170.3564 256.0830 147.6659 278.7735

Hardcover

hc_holt = holt(hc, h=4)

hc_holt$model
## Holt's method 
## 
## Call:
##  holt(y = hc, h = 4) 
## 
##   Smoothing parameters:
##     alpha = 1e-04 
##     beta  = 1e-04 
## 
##   Initial states:
##     l = 147.7935 
##     b = 3.303 
## 
##   sigma:  29.2106
## 
##      AIC     AICc      BIC 
## 310.2148 312.7148 317.2208
#hc_holt.acc = accuracy(hc_holt)
#hc_holt.acc

Again, alpha = 1e-04 and beta = 1e-04 are essentially zero, so the fit is effectively a straight trend line and the full history dominates the forecasts.

Displaying the forecasted sales values of hardcover using the holt() method:

predict(hc_holt)
##    Point Forecast    Lo 80    Hi 80    Lo 95    Hi 95
## 31       250.1739 212.7390 287.6087 192.9222 307.4256
## 32       253.4765 216.0416 290.9113 196.2248 310.7282
## 33       256.7791 219.3442 294.2140 199.5274 314.0308
## 34       260.0817 222.6468 297.5166 202.8300 317.3334

b.

Computing RMSE values for paperback and hardcover:

Paperback

pc_holt.acc = accuracy(pc_holt)
pc_holt.acc
##                     ME     RMSE      MAE       MPE     MAPE      MASE
## Training set -3.717178 31.13692 26.18083 -5.508526 15.58354 0.6602122
##                    ACF1
## Training set -0.1750792

Hardcover

hc_holt.acc = accuracy(hc_holt)
hc_holt.acc
##                      ME     RMSE      MAE       MPE    MAPE      MASE
## Training set -0.1357882 27.19358 23.15557 -2.114792 12.1626 0.6908555
##                     ACF1
## Training set -0.03245186

Tabulating the RMSE values of holt() and ses() (note that the rows are filled per method, so each value lands in the correct cell):

ses.rmse = c(pc.ses.acc[2], hc.ses.acc[2])
holt.rmse = c(pc_holt.acc[2], hc_holt.acc[2])
rmse = data.frame(Paperback = c(ses.rmse[1], holt.rmse[1]),
                  Hardcover = c(ses.rmse[2], holt.rmse[2]))
rownames(rmse) = c("SES", "Holt")

# Compare RMSE for both series using both methods:
kable(rmse, caption = "RMSE for Books using SES and Holt:")
RMSE for Books using SES and Holt:

       Paperback  Hardcover
SES     33.63769   31.93101
Holt    31.13692   27.19358

As the table shows, Holt’s method gives lower RMSE values than SES for both Paperback and Hardcover.

c.

# SES: forecast plot for paperback, with the fitted SES lines for both series
f1 = autoplot(pc_ses) +
    ggtitle("SES Forecasts") +
    xlab("Days") +
    ylab("Books") +
    autolayer(fitted(pc_ses), series="SES Paperback") +
    autolayer(fitted(hc_ses), series="SES Hardcover") +
    coord_fixed(ratio=1/4)

# Holt: forecast plot for paperback, with the fitted Holt lines for both series
f2 = autoplot(pc_holt) +
    ggtitle("Holt Forecasts") +
    xlab("Days") +
    ylab("Books") +
    autolayer(fitted(pc_holt), series="Holt Paperback") +
    autolayer(fitted(hc_holt), series="Holt Hardcover") +
    coord_fixed(ratio=1/4)

# Place the two panels side by side
grid.arrange(arrangeGrob(f1, f2, nrow=1, widths=c(1,1)))

Since holt() gives lower RMSE values than ses() for both series, Holt’s method appears to perform marginally better than SES here.

d.

Let’s calculate a 95% prediction interval for the first forecast for each series

SES (Paperback)

print(paste('Prediction interval of SES (Paperback)', 'Lower:', round(pc_ses$lower[1, '95%'], 2), 'Upper:', round(pc_ses$upper[1, '95%'], 2)))
## [1] "Prediction interval of SES (Paperback) Lower: 138.87 Upper: 275.35"

SES (Hardcover)

print(paste('Prediction interval of SES (Hardcover)', 'Lower:', round(hc_ses$lower[1, '95%'], 2), 'Upper:', round(hc_ses$upper[1, '95%'], 2)))
## [1] "Prediction interval of SES (Hardcover) Lower: 174.78 Upper: 304.34"

RMSE SES (Paperback)

print(paste('Prediction interval using RMSE SES (Paperback)','Lower:', round(pc_ses$mean[1] - 1.96 * accuracy(pc_ses)[2],2), 'Upper:', round(pc_ses$mean[1] + 1.96 * accuracy(pc_ses)[2],2)))
## [1] "Prediction interval using RMSE SES (Paperback) Lower: 141.18 Upper: 273.04"

RMSE SES (Hardcover)

print(paste('Prediction interval using RMSE SES (Hardcover)', 'Lower:', round(hc_ses$mean[1] - 1.96 * accuracy(hc_ses)[2],2), 'Upper:', round(hc_ses$mean[1] + 1.96 * accuracy(hc_ses)[2],2)))
## [1] "Prediction interval using RMSE SES (Hardcover) Lower: 176.98 Upper: 302.14"

Holt (Paperback)

print(paste('Prediction interval of R Holt (Paperback)', 'Lower:', round(pc_holt$lower[1, '95%'], 2), 'Upper:', round(pc_holt$upper[1, '95%'], 2)))
## [1] "Prediction interval of R Holt (Paperback) Lower: 143.91 Upper: 275.02"

Holt (Hardcover)

print(paste('Prediction interval of R Holt (Hardcover)', 'Lower:', round(hc_holt$lower[1, '95%'], 2), 'Upper:', round(hc_holt$upper[1, '95%'], 2)))
## [1] "Prediction interval of R Holt (Hardcover) Lower: 192.92 Upper: 307.43"

RMSE Holt (Paperback)

print(paste('Prediction interval using RMSE Holt (Paperback)','Lower:', round(pc_holt$mean[1] - 1.96 * accuracy(pc_holt)[2],2), 'Upper:', round(pc_holt$mean[1] + 1.96 * accuracy(pc_holt)[2],2)))
## [1] "Prediction interval using RMSE Holt (Paperback) Lower: 148.44 Upper: 270.5"

RMSE Holt (Hardcover)

print(paste('Prediction interval using RMSE Holt (Hardcover)','Lower:', round(hc_holt$mean[1] - 1.96 * accuracy(hc_holt)[2],2), 'Upper:', round(hc_holt$mean[1] + 1.96 * accuracy(hc_holt)[2],2)))
## [1] "Prediction interval using RMSE Holt (Hardcover) Lower: 196.87 Upper: 303.47"

The prediction intervals produced by R and those computed from the training RMSE are not exactly the same for either method (SES or Holt), but they are very close. In each case the RMSE-based interval is slightly narrower, because R’s intervals use the model’s estimated sigma, which is a little larger than the training RMSE.

7.7

Answer

  1. Experiment with the default holt() method
default <- holt(eggs, h=100)
  2. Experiment with the damped holt() method
damped <- holt(eggs, h=100, damped = T)
  3. Experiment with the exponential holt() method
exponential <- holt(eggs, h=100, exponential = T)
  4. Experiment with the Box-Cox transformed holt() method
lambda <- holt(eggs, h=100, lambda = 'auto', biasadj = T)

Now, let’s plot all the four plots above:

autoplot(eggs) +
  autolayer(default, series='Default', PI=F) +
  autolayer(damped, series='Damped', PI=F) +
  autolayer(exponential, series='Exponential', PI=F) +
  autolayer(lambda, series='Box-Cox Transformed', PI=F) +
  ggtitle('Forecast of US Eggs Prices') +
  xlab('Year') +
  ylab('Price of Dozen Eggs')

From the plot above, the default Holt method produces a straight forecast line showing a negative trend in egg prices over the years. The damped method flattens the forecast into an almost horizontal line. The exponential method is close to the Box-Cox transformed prediction; both depict a slight decline in egg prices over the years.

We can conclude that US egg prices are likely to keep decreasing in the coming years, which could partly be attributed to improvements in poultry farming methods and techniques.
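To quantify which of the four variants fits the training data best, one could also compare their training RMSEs (a quick sketch; the numbers are not reproduced here):

# Training RMSE for each holt() variant
sapply(list(Default = default, Damped = damped,
            Exponential = exponential, BoxCox = lambda),
       function(fit) accuracy(fit)["Training set", "RMSE"])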

7.8

Answer

Load the retail dataset

file2 = "retail.xlsx"
retaildata <- read.xlsx(file2, sheet=1, startRow=2)
myts <- ts(retaildata[,"A3349873A"], frequency=12, start=c(1982,4))
ggseasonplot(myts)

a.

As we see from the seasonal plot, retail turnover increases from year to year, and the size of the seasonal fluctuations grows along with the level of the series. Because the seasonal variation is not constant but roughly proportional to the level, multiplicative seasonality is the appropriate choice here; the additive method is only suitable when the seasonal variations are roughly constant.
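One way to back this up numerically is to fit both seasonal variants and compare them (a minimal sketch; only the multiplicative fits are analysed in the rest of this answer):

# Fit both seasonal variants and compare their training RMSE
hw_add  <- hw(myts, seasonal = "additive")
hw_mult <- hw(myts, seasonal = "multiplicative")
accuracy(hw_add)["Training set", "RMSE"]
accuracy(hw_mult)["Training set", "RMSE"]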

b.

fit1 = hw(myts, seasonal="multiplicative", h=1)
fit2 = hw(myts, seasonal="multiplicative", damped=TRUE, h=1)

autoplot(myts) +
    autolayer(fit1, series="HW multiplicative forecasts", PI=FALSE) +
    autolayer(fit2, series="HW damped trend multiplicative forecasts", PI=FALSE) +
    xlab("Time") +
    ylab("Retail data") +
    guides(colour=guide_legend(title="Forecast"))

In the above graph we see Holt-Winters’ multiplicative forecasts and damped trend multiplicative forecasts.

c.

Now let’s compare the RMSE of the one-step forecasts from the two methods above:

Accuracy of undamped trend fit:

accuracy(fit1)
##                     ME     RMSE      MAE        MPE     MAPE      MASE
## Training set 0.1170648 13.29378 8.991856 -0.1217735 3.918351 0.4748948
##                    ACF1
## Training set 0.08635577

Accuracy of damped trend fit:

accuracy(fit2)
##                    ME     RMSE      MAE       MPE     MAPE      MASE       ACF1
## Training set 1.414869 13.30494 9.042151 0.6105987 3.959617 0.4775511 0.04077895

The undamped fit has a slightly lower RMSE. I would go with the undamped trend, since it has the lower RMSE and, as the graph above shows, it also follows the increasing trend more faithfully than the damped version.

d.

Let’s check that the residuals from the best method look like white noise.

# Plot the residuals of the chosen (undamped) model
plot(fit1$residuals)

The residuals look roughly like white noise, apart from a few spikes.
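A more formal check, not shown in the original output, is checkresiduals() from the forecast package, which plots the residuals, their ACF and a histogram, and runs a Ljung-Box test for remaining autocorrelation:

# Residual diagnostics and Ljung-Box test for the undamped multiplicative fit
# (a large p-value would support the white-noise assumption)
checkresiduals(fit1)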

e.

train <- window(myts, end=c(2010, 12))
test <- window(myts, start=c(2011,1))

autoplot(myts) +
  autolayer(train, series="Training") +
  autolayer(test, series="Test") +
  ggtitle('Train-Test Split') +
  ylab('Turnover')

Above we see the plot of the training and test sets; the series has an increasing trend over time as well as seasonality.

Now let’s fit three models on the training set: seasonal naïve, Holt-Winters’ multiplicative method, and Holt-Winters’ additive method with a Box-Cox transform.

fit_snaive <- snaive(train, h=36)
fit1_hw <- hw(train, h=36, seasonal='multiplicative', damped=F)
fit2_hw <- hw(train, h=36, seasonal='additive', damped=F, lambda='auto')

Calculating the training-set RMSEs:

fit_snaive_acc<-accuracy(fit_snaive)[2]
fit1_hw_acc<-accuracy(fit1_hw)[2]
fit2_hw_acc<-accuracy(fit2_hw)[2]

Tabulating the RMSEs:

df <- c(fit_snaive_acc, fit1_hw_acc, fit2_hw_acc)
names(df) <- c('Seasonal Naive Forecast', "Holt-Winter's Multiplicative Method", 
               "Holt-Winter's Additive Method with Box-Cox Transform")
df
##                              Seasonal Naive Forecast 
##                                            20.245762 
##                  Holt-Winter's Multiplicative Method 
##                                             9.107356 
## Holt-Winter's Additive Method with Box-Cox Transform 
##                                             8.889237

As seen above, the training RMSE of the Holt-Winters’ additive method with Box-Cox transform is lower than that of the seasonal naïve and Holt-Winters’ multiplicative models.
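Note that these are training-set RMSEs, since accuracy() was called on the fitted models alone. To compare the three methods on the held-out test period, the test window can be passed as a second argument (a sketch; the resulting numbers are not reproduced here):

# Test-set RMSE: compare the 36-month forecasts against the held-out window
accuracy(fit_snaive, test)["Test set", "RMSE"]
accuracy(fit1_hw, test)["Test set", "RMSE"]
accuracy(fit2_hw, test)["Test set", "RMSE"]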

7.9

Answer

Applying STL decomposition with a Box-Cox transformation to the retail series:

myts.stl.fit <- stlm(train, lambda="auto")
accuracy(myts.stl.fit)
##                     ME     RMSE      MAE        MPE     MAPE      MASE
## Training set -0.646515 8.147517 5.855529 -0.3001347 2.885877 0.3669624
##                    ACF1
## Training set 0.02389584

Applying an ETS model with multiplicative error (M), additive trend (A) and multiplicative seasonality (M) to the retail series:

myts.ets.fit <- ets(train, model = "MAM")
autoplot(myts.ets.fit)

accuracy(myts.ets.fit)
##                      ME     RMSE      MAE       MPE     MAPE      MASE
## Training set -0.3305082 9.010034 6.344188 -0.120261 3.114127 0.3975863
##                    ACF1
## Training set 0.05513695

The training RMSEs of both models above are lower than those of the seasonal naïve and Holt-Winters’ multiplicative fits, and the STL + Box-Cox model has the lowest training RMSE overall, so these two approaches fit the training data best.
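To see how these models compare with the earlier ones on the test set, one could forecast the test horizon from each fit and compute test-set accuracy (a sketch, using the train/test split defined above; outputs not reproduced here):

# Forecast the held-out window and compute test-set RMSE for both models
stl_fc <- forecast(myts.stl.fit, h = length(test))
ets_fc <- forecast(myts.ets.fit, h = length(test))
accuracy(stl_fc, test)["Test set", "RMSE"]
accuracy(ets_fc, test)["Test set", "RMSE"]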