library(kableExtra)
library(corrplot)
library(fpp2)
library(plotly)
library(gridExtra)
library(readxl)
Consider the pigs series — the number of pigs slaughtered in Victoria each month.
Monthly total number of pigs slaughtered in Victoria, Australia (Jan 1980 – Aug 1995).
#using library(fpp2)
#help("pigs")
head(pigs)
## Jan Feb Mar Apr May Jun
## 1980 76378 71947 33873 96428 105084 95741
summary(pigs)
## Min. 1st Qu. Median Mean 3rd Qu. Max.
## 33873 79080 91662 90640 101493 120184
#tsdisplay(pigs)
paste0("Frequency: ", frequency(pigs))
## [1] "Frequency: 12"
autoplot(pigs)
ggseasonplot(pigs)
ggsubseriesplot(pigs)
gglagplot(pigs)
Acf(pigs, lag.max=150)
While there is some evidence of cyclic behaviour, the series does not appear to have any clear seasonal or trend components.
Use the ses() function in R to find the optimal values of α and ℓ0 , and generate forecasts for the next four months.
Using the ses() (simple exponential smoothing) function to explore exponential smoothing on the pigs dataset:
ses_pigs_4 <- ses(pigs, h = 4)
summary(ses_pigs_4)
##
## Forecast method: Simple exponential smoothing
##
## Model Information:
## Simple exponential smoothing
##
## Call:
## ses(y = pigs, h = 4)
##
## Smoothing parameters:
## alpha = 0.2971
##
## Initial states:
## l = 77260.0561
##
## sigma: 10308.58
##
## AIC AICc BIC
## 4462.955 4463.086 4472.665
##
## Error measures:
## ME RMSE MAE MPE MAPE MASE ACF1
## Training set 385.8721 10253.6 7961.383 -0.922652 9.274016 0.7966249 0.01282239
##
## Forecasts:
## Point Forecast Lo 80 Hi 80 Lo 95 Hi 95
## Sep 1995 98816.41 85605.43 112027.4 78611.97 119020.8
## Oct 1995 98816.41 85034.52 112598.3 77738.83 119894.0
## Nov 1995 98816.41 84486.34 113146.5 76900.46 120732.4
## Dec 1995 98816.41 83958.37 113674.4 76092.99 121539.8
The ses() fit reports the optimal values directly: smoothing parameter \(\alpha\) = 0.2971 and initial level \(\ell_{0}\) = 77260.0561.
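As a sketch, both values can also be read directly off the fitted model object (the par vector that the underlying ets model stores):
# optimal smoothing parameter and initial level from the fitted model
ses_pigs_4$model$par["alpha"]
ses_pigs_4$model$par["l"]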
Compute a 95% prediction interval for the first forecast using \(\hat{y}\) ± 1.96s where s is the standard deviation of the residuals. Compare your interval with the interval produced by R.
ses_pigs <- ses(pigs)
autoplot(ses_pigs) +
autolayer(fitted(ses_pigs), series="Fitted")
s <- sd(ses_pigs$residuals)
ub <- round((ses_pigs$mean[1] + 1.96*s), 2)
lb <- round((ses_pigs$mean[1] - 1.96*s), 2)
Interval Bounds:
print(paste0('Lower bound CI : ', lb))
## [1] "Lower bound CI : 78679.97"
print(paste0('Upper bound CI : ', ub))
## [1] "Upper bound CI : 118952.84"
# R computes these bounds directly (cf. the Lo 95 / Hi 95 columns above):
# R_lb <- round(ses(pigs, h = 4, level = 95)$lower[1], 2)  # 78611.97
# R_ub <- round(ses(pigs, h = 4, level = 95)$upper[1], 2)  # 119020.8
The manually computed interval (78679.97, 118952.84) is slightly narrower than the interval produced by ses() (78611.97, 119020.8). The difference arises because R bases its interval on the model's estimated standard deviation (sigma = 10308.58) rather than on the sample standard deviation of the residuals.
Data set books contains the daily sales of paperback and hardcover books at the same store. The task is to forecast the next four days’ sales for paperback and hardcover books.
Daily sales of paperback and hardcover books at the same store.
#help("books")
data("books")
summary(books)
## Paperback Hardcover
## Min. :111.0 Min. :128.0
## 1st Qu.:167.2 1st Qu.:170.5
## Median :189.0 Median :200.5
## Mean :186.4 Mean :198.8
## 3rd Qu.:207.2 3rd Qu.:222.0
## Max. :247.0 Max. :283.0
head(books)
## Time Series:
## Start = 1
## End = 6
## Frequency = 1
## Paperback Hardcover
## 1 199 139
## 2 172 128
## 3 111 172
## 4 209 139
## 5 161 191
## 6 119 168
Plot the series and discuss the main features of the data
autoplot(books) + ggtitle("Daily Book Sales") + xlab("Day") + ylab("Book Sales")
books[,"Paperback"]
## Time Series:
## Start = 1
## End = 30
## Frequency = 1
## [1] 199 172 111 209 161 119 195 195 131 183 143 141 168 201 155 243 225 167 237
## [20] 202 186 176 232 195 190 182 222 217 188 247
autoplot(books[,"Paperback"]) + ggtitle("Plot of Paperback")
gglagplot(books[,"Paperback"])
Acf(books[,"Paperback"], lag.max=150)
Paperback sales do not appear to have cyclic or seasonal components, but they do seem to trend upward. The ACF spikes stay within the dashed blue significance bounds, suggesting little evidence of autocorrelation.
books[,"Hardcover"]
## Time Series:
## Start = 1
## End = 30
## Frequency = 1
## [1] 139 128 172 139 191 168 170 145 184 135 218 198 230 222 206 240 189 222 158
## [20] 178 217 261 238 240 214 200 201 283 220 259
autoplot(books[,"Hardcover"]) + ggtitle("Hardcover Plot")
gglagplot(books[,"Hardcover"])
Acf(books[,"Hardcover"], lag.max=150)
Hardcover sales do not show any seasonal or cyclic components either, but they also appear to trend upward.
Use the ses() function to forecast each series, and plot the forecasts.
ses.paperback <- ses(books[, "Paperback"])
ses.hardcover <- ses(books[, "Hardcover"])
autoplot(books[, "Paperback"], series = "Paperback") +
autolayer(ses.paperback, series = "Paperback") +
autolayer(books[, "Hardcover"], series = "Hardcover") +
autolayer(ses.hardcover, series = "Hardcover", PI = FALSE) +
ylab("Sales Amount") +
ggtitle("SES Books Sales")
Compute the RMSE values for the training data in each case
# RMSE for paperback
ses.paperback.rmse <- sqrt(mean(ses.paperback$residuals^2))
ses.paperback.rmse
## [1] 33.63769
# RMSE for hardcover
ses.hardcover.rmse <- sqrt(mean(ses.hardcover$residuals^2))
ses.hardcover.rmse
## [1] 31.93101
RMSE is lower for hardcover sales than for paperback sales; equivalently, the residual variance is greater for paperbacks than for hardcovers.
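As a cross-check, the same training RMSEs can be read from accuracy() rather than computed by hand (a sketch using the fits above):
accuracy(ses.paperback)["Training set", "RMSE"]
accuracy(ses.hardcover)["Training set", "RMSE"]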
Using the same books dataset as in the question above:
Now apply Holt’s linear method to the paperback and hardback series and compute four-day forecasts in each case.
holt.paperback <- holt(books[, "Paperback"], h = 4)
holt.hardcover <- holt(books[, "Hardcover"], h = 4)
autoplot(books[, "Paperback"], series = "Paperback") +
autolayer(holt.paperback, series = "Paperback") +
autolayer(books[, "Hardcover"], series = "Hardcover") +
autolayer(holt.hardcover, series = "Hardcover", PI = FALSE) +
ylab("Sales Amount") +
ggtitle("Holt: Books Sales")
Compare the RMSE measures of Holt’s method for the two series to those of simple exponential smoothing in the previous question. (Remember that Holt’s method is using one more parameter than SES.) Discuss the merits of the two forecasting methods for these data sets.
holt.paperback.rmse <- sqrt(mean(holt.paperback$residuals^2))
holt.paperback.rmse
## [1] 31.13692
holt.hardcover.rmse <- sqrt(mean(holt.hardcover$residuals^2))
holt.hardcover.rmse
## [1] 27.19358
Holt's RMSE values are lower than those from the ses() method. Because Holt's method adds a trend component (at the cost of one more parameter than SES), it is better suited to these trending series.
Compare the forecasts for the two series using both methods. Which do you think is best?
Holt's method appears to be the better choice for these trending series, given its lower training RMSE. Training error alone can flatter the more flexible model, though, so for a fairer comparison the methods should be evaluated on held-out data, for instance with time series cross-validation, as in the sketch below.
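A minimal sketch of that comparison using one-step time series cross-validation with tsCV(), shown here for the paperback series only:
e.ses <- tsCV(books[, "Paperback"], ses, h = 1)
e.holt <- tsCV(books[, "Paperback"], holt, h = 1)
# cross-validated one-step RMSE for each method
sqrt(mean(e.ses^2, na.rm = TRUE))
sqrt(mean(e.holt^2, na.rm = TRUE))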
Calculate a 95% prediction interval for the first forecast for each series, using the RMSE values and assuming normal errors. Compare your intervals with those produced using ses and holt.
forecast(ses.paperback, h=1)
## Point Forecast Lo 80 Hi 80 Lo 95 Hi 95
## 31 207.1097 162.4882 251.7311 138.867 275.3523
forecast(holt.paperback, h=1)
## Point Forecast Lo 80 Hi 80 Lo 95 Hi 95
## 31 209.4668 166.6035 252.3301 143.913 275.0205
The 95% prediction interval for SES on paperback: 138.87 to 275.35.
The 95% prediction interval for Holt on paperback: 143.91 to 275.02.
The 95% prediction interval is narrower for the Holt method than for SES.
forecast(ses.hardcover, h=1)
## Point Forecast Lo 80 Hi 80 Lo 95 Hi 95
## 31 239.5601 197.2026 281.9176 174.7799 304.3403
forecast(holt.hardcover, h=1)
## Point Forecast Lo 80 Hi 80 Lo 95 Hi 95
## 31 250.1739 212.739 287.6087 192.9222 307.4256
The 95% prediction interval for SES on hardcover: 174.78 to 304.34.
The 95% prediction interval for Holt on hardcover: 192.92 to 307.43.
Again, the 95% prediction interval is narrower for the Holt method than for SES.
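The manual intervals the question asks for, computed as point forecast ± 1.96 × training RMSE (a sketch assuming normal errors and reusing the RMSE objects from earlier):
# first-forecast intervals for the paperback series
ses.paperback$mean[1] + c(-1.96, 1.96) * ses.paperback.rmse
holt.paperback$mean[1] + c(-1.96, 1.96) * holt.paperback.rmse
# first-forecast intervals for the hardcover series
ses.hardcover$mean[1] + c(-1.96, 1.96) * ses.hardcover.rmse
holt.hardcover$mean[1] + c(-1.96, 1.96) * holt.hardcover.rmse
These come out slightly narrower than the intervals reported by ses() and holt(), because R bases its intervals on the model's estimated sigma rather than on the training RMSE.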
#SES method
print(paste0("SES method of interval: ", "(", ses(books[, "Hardcover"], level = 95, h = 4)$lower[1], ", ", ses(books[, "Hardcover"], level = 95, h = 4)$upper[1], ")"))
## [1] "SES method of interval: (174.779870851487, 304.340313152036)"
#Holt's method
print(paste0("Holt method of interval: ", "(", holt(books[, "Hardcover"], level = 95, h = 4)$lower[1], ", ", holt(books[, "Hardcover"], level = 95, h = 4)$upper[1], ")"))
## [1] "Holt method of interval: (192.922170771941, 307.42557365277)"
For this exercise use data set eggs, the price of a dozen eggs in the United States from 1900–1993. Experiment with the various options in the holt() function to see how much the forecasts change with damped trend, or with a Box-Cox transformation. Try to develop an intuition of what each argument is doing to the forecasts. [Hint: use h=100 when calling holt() so you can clearly see the differences between the various options when plotting the forecasts.]
data("eggs")
## Warning in data("eggs"): data set 'eggs' not found
head(eggs)
## Time Series:
## Start = 1900
## End = 1905
## Frequency = 1
## [1] 276.79 315.42 314.87 321.25 314.54 317.92
summary(eggs)
## Min. 1st Qu. Median Mean 3rd Qu. Max.
## 62.27 148.87 209.15 206.15 276.71 358.78
autoplot(eggs)
gglagplot(eggs)
Acf(eggs)
# Holt's linear method
holt.eggs <- holt(eggs, h = 100)
autoplot(holt.eggs) + autolayer(holt.eggs$fitted)
# Holt with damped trend
holt.damped.eggs <- holt(eggs, damped = TRUE, h = 100)
autoplot(holt.damped.eggs) + autolayer(holt.damped.eggs$fitted)
# Holt with Box-Cox transformation
holt.BoxCox.eggs <- holt(eggs, lambda = BoxCox.lambda(eggs), h = 100)
autoplot(holt.BoxCox.eggs) + autolayer(holt.BoxCox.eggs$fitted)
# Holt with damped trend and Box-Cox transformation
holt.BoxCox.damped.eggs <- holt(eggs, damped = TRUE, lambda = BoxCox.lambda(eggs), h = 100)
autoplot(holt.BoxCox.damped.eggs) + autolayer(holt.BoxCox.damped.eggs$fitted)
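To see the four options side by side, the forecasts can be arranged in one panel (a sketch using grid.arrange() from gridExtra, loaded at the top):
grid.arrange(
  autoplot(holt.eggs) + ggtitle("Holt"),
  autoplot(holt.damped.eggs) + ggtitle("Holt, damped"),
  autoplot(holt.BoxCox.eggs) + ggtitle("Holt, Box-Cox"),
  autoplot(holt.BoxCox.damped.eggs) + ggtitle("Holt, damped + Box-Cox"),
  nrow = 2
)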
paste0("RMSE Holt function = ", sqrt(mean(holt.eggs$residuals^2)))
## [1] "RMSE Holt function = 26.5821881569903"
paste0("RMSE Holt function with damped option = ", sqrt(mean(holt.damped.eggs$residuals^2)))
## [1] "RMSE Holt function with damped option = 26.5401852798325"
paste0("RMSE Holt function with Box-Cox transformation = ", sqrt(mean(holt.BoxCox.eggs$residuals^2)))
## [1] "RMSE Holt function with Box-Cox transformation = 1.03221710764543"
paste0("RMSE Holt function with damped option and Box-Cox transformation = ", sqrt(mean(holt.BoxCox.damped.eggs$residuals^2)))
## [1] "RMSE Holt function with damped option and Box-Cox transformation = 1.03918723939767"
Recall your retail time series data (from Exercise 3 in Section 2.10).
Why is multiplicative seasonality necessary for this series?
retaildata <- readxl::read_excel("retail.xlsx", skip = 1)
myts <- ts(retaildata[, 15], frequency = 12, start = c(1982, 4))
autoplot(myts) + ggtitle("Retail Data Time Series")
autoplot(decompose(myts, type = "multiplicative"))
Multiplicative seasonality is necessary because the seasonal fluctuations grow with the level of the series: the seasonal swings are far larger in recent years than at the start. An additive model assumes constant-size seasonal effects, which clearly does not hold here.
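As a sketch, placing the additive and multiplicative decompositions side by side makes the level-dependent seasonal swings easier to see (grid.arrange() comes from gridExtra, loaded at the top):
grid.arrange(
  autoplot(decompose(myts, type = "additive")) + ggtitle("Additive decomposition"),
  autoplot(decompose(myts, type = "multiplicative")) + ggtitle("Multiplicative decomposition"),
  ncol = 2
)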
Apply Holt-Winters’ multiplicative method to the data. Experiment with making the trend damped.
holt.fit <- hw(myts, seasonal = "multiplicative", damped = FALSE)
summary(holt.fit)
##
## Forecast method: Holt-Winters' multiplicative method
##
## Model Information:
## Holt-Winters' multiplicative method
##
## Call:
## hw(y = myts, seasonal = "multiplicative", damped = FALSE)
##
## Smoothing parameters:
## alpha = 0.4775
## beta = 1e-04
## gamma = 0.3913
##
## Initial states:
## l = 23.21
## b = 0.3815
## s = 0.8539 0.7454 0.7508 1.3732 1.0659 1.0346
## 1.0921 1.0709 1.067 1.0163 1.0322 0.8977
##
## sigma: 0.0621
##
## AIC AICc BIC
## 3271.564 3273.250 3338.591
##
## Error measures:
## ME RMSE MAE MPE MAPE MASE
## Training set -0.09292851 4.512297 3.198521 -0.7408265 4.915305 0.5801485
## ACF1
## Training set 0.08651428
##
## Forecasts:
## Point Forecast Lo 80 Hi 80 Lo 95 Hi 95
## Jan 2014 132.5906 122.04284 143.1383 116.45922 148.7219
## Feb 2014 101.4648 92.52133 110.4082 87.78695 115.1426
## Mar 2014 109.2479 98.76688 119.7289 93.21858 125.2772
## Apr 2014 118.6365 106.40320 130.8698 99.92728 137.3457
## May 2014 107.8318 95.99198 119.6715 89.72438 125.9391
## Jun 2014 105.3891 93.15637 117.6218 86.68076 124.0974
## Jul 2014 115.2966 101.23049 129.3628 93.78432 136.8090
## Aug 2014 119.9092 104.60524 135.2132 96.50380 143.3147
## Sep 2014 125.8771 109.13540 142.6189 100.27287 151.4814
## Oct 2014 138.2408 119.14392 157.3378 109.03463 167.4471
## Nov 2014 161.6416 138.51387 184.7693 126.27080 197.0123
## Dec 2014 243.6270 207.61046 279.6435 188.54447 298.7095
## Jan 2015 137.1147 114.77929 159.4500 102.95567 171.2736
## Feb 2015 104.9174 87.38951 122.4453 78.11079 131.7241
## Mar 2015 112.9553 93.62694 132.2837 83.39511 142.5155
## Apr 2015 122.6517 101.18051 144.1228 89.81438 155.4889
## May 2015 111.4714 91.52927 131.4135 80.97254 141.9703
## Jun 2015 108.9368 89.03994 128.8336 78.50720 139.3663
## Jul 2015 119.1674 96.96616 141.3686 85.21353 153.1213
## Aug 2015 123.9241 100.39379 147.4544 87.93759 159.9106
## Sep 2015 130.0806 104.92668 155.2345 91.61101 168.5502
## Oct 2015 142.8449 114.73416 170.9557 99.85323 185.8366
## Nov 2015 167.0107 133.58489 200.4366 115.89032 218.1311
## Dec 2015 251.6980 200.49641 302.8996 173.39193 330.0041
autoplot(holt.fit) + ggtitle("Holt-Winters Multiplicative Fit")
holt.fit.damped <- hw(myts, seasonal = "multiplicative", damped = TRUE)
summary(holt.fit.damped)
##
## Forecast method: Damped Holt-Winters' multiplicative method
##
## Model Information:
## Damped Holt-Winters' multiplicative method
##
## Call:
## hw(y = myts, seasonal = "multiplicative", damped = TRUE)
##
## Smoothing parameters:
## alpha = 0.508
## beta = 0.004
## gamma = 0.3974
## phi = 0.98
##
## Initial states:
## l = 22.937
## b = -0.0448
## s = 0.867 0.7254 0.7038 1.3907 1.0724 1.0083
## 1.0667 1.0868 1.0984 1.0545 0.9898 0.9363
##
## sigma: 0.0651
##
## AIC AICc BIC
## 3299.520 3301.410 3370.491
##
## Error measures:
## ME RMSE MAE MPE MAPE MASE ACF1
## Training set 0.4807647 4.561898 3.222506 0.4081738 4.91363 0.584499 0.06759754
##
## Forecasts:
## Point Forecast Lo 80 Hi 80 Lo 95 Hi 95
## Jan 2014 132.9199 121.83740 144.0025 115.97067 149.8692
## Feb 2014 101.1642 91.68679 110.6417 86.66973 115.6587
## Mar 2014 108.6758 97.47294 119.8786 91.54253 125.8090
## Apr 2014 118.2424 105.02197 131.4629 98.02347 138.4614
## May 2014 107.2524 94.38195 120.1227 87.56878 126.9359
## Jun 2014 104.7407 91.35856 118.1229 84.27447 125.2070
## Jul 2014 114.7360 99.22687 130.2450 91.01685 138.4551
## Aug 2014 119.3381 102.35889 136.3173 93.37064 145.3055
## Sep 2014 124.8242 106.20970 143.4386 96.35581 153.2925
## Oct 2014 137.0401 115.69649 158.3837 104.39786 169.6823
## Nov 2014 159.9474 134.00861 185.8861 120.27745 199.6173
## Dec 2014 239.8288 199.43737 280.2202 178.05545 301.6021
## Jan 2015 135.0787 109.94932 160.2080 96.64665 173.5107
## Feb 2015 102.7741 83.07855 122.4696 72.65237 132.8958
## Mar 2015 110.3703 88.61183 132.1289 77.09357 143.6471
## Apr 2015 120.0492 95.73342 144.3650 82.86142 157.2370
## May 2015 108.8584 86.23014 131.4866 74.25146 143.4653
## Jun 2015 106.2778 83.62922 128.9264 71.63977 140.9159
## Jul 2015 116.3862 90.98255 141.7899 77.53465 155.2378
## Aug 2015 121.0205 93.98905 148.0519 79.67947 162.3615
## Sep 2015 126.5490 97.64680 155.4513 82.34688 170.7512
## Oct 2015 138.8964 106.48491 171.3078 89.32732 188.4654
## Nov 2015 162.0713 123.45728 200.6852 103.01627 221.1262
## Dec 2015 242.9508 183.88942 302.0122 152.62422 333.2774
autoplot(holt.fit.damped) + ggtitle("Holt-Winters Multiplicative Fit with Damped Trend")
Comparing RMSE, AIC, and BIC, the two fits are very close, and if anything the non-damped model scores slightly better on all three (training RMSE 4.51 vs 4.56, AIC 3271.6 vs 3299.5), so damping offers no in-sample improvement here.
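For a direct side-by-side, the training RMSEs can be pulled out of accuracy() (a sketch using the two fits above):
c(HW = accuracy(holt.fit)["Training set", "RMSE"],
  HW.damped = accuracy(holt.fit.damped)["Training set", "RMSE"])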
Compare the RMSE of the one-step forecasts from the two methods. Which do you prefer?
error.retail <- tsCV(myts, hw, h = 1, seasonal = "multiplicative")
error.retail.damped <- tsCV(myts, hw, h = 1, seasonal = "multiplicative", damped = TRUE)
sqrt(mean(error.retail^2, na.rm = TRUE))
## [1] 4.805684
sqrt(mean(error.retail.damped^2, na.rm = TRUE))
## [1] 4.965358
The one-step cross-validated RMSE values are almost the same (4.81 vs 4.97, slightly favouring the non-damped model). The damped model may still be preferable for long-horizon forecasting, since it keeps the sales forecast from increasing without bound; see the sketch below.
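The damping effect is easiest to see at long horizons; a sketch (the 100-month horizon is an arbitrary choice for illustration):
# the damped trend levels off, the linear trend does not
grid.arrange(
  autoplot(hw(myts, seasonal = "multiplicative", damped = FALSE, h = 100)),
  autoplot(hw(myts, seasonal = "multiplicative", damped = TRUE, h = 100)),
  ncol = 1
)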
Check that the residuals from the best method look like white noise.
checkresiduals(holt.fit.damped)
##
## Ljung-Box test
##
## data: Residuals from Damped Holt-Winters' multiplicative method
## Q* = 75.911, df = 7, p-value = 9.37e-14
##
## Model df: 17. Total lags used: 24
The Ljung-Box test (p-value ≈ 9.4e-14) and the ACF plot show that the residuals are not white noise; significant autocorrelation remains.
Now find the test set RMSE, while training the model to the end of 2010. Can you beat the seasonal naïve approach from Exercise 8 in Section 3.7?
myts.train <- window(myts, end = c(2010, 12))
myts.test <- window(myts, start = 2011)
myts.train.damped <- hw(myts.train, h = 36, seasonal = "multiplicative", damped = TRUE)
autoplot(myts.train.damped)
accuracy(myts.train.damped, myts.test)
## ME RMSE MAE MPE MAPE MASE
## Training set 0.3426235 3.788172 2.760286 0.1965206 4.619317 0.5636346
## Test set 3.3249030 12.332173 9.224458 1.8191543 8.033174 1.8835814
## ACF1 Theil's U
## Training set 0.1406912 NA
## Test set 0.4340217 0.4901811
myts.train.nd <- hw(myts.train, h = 36, seasonal = "multiplicative")
autoplot(myts.train.nd)
accuracy(myts.train.nd, myts.test)
## ME RMSE MAE MPE MAPE MASE
## Training set -0.08201523 3.744601 2.729380 -0.5747145 4.609611 0.5573237
## Test set -1.26833947 10.870610 8.061112 -2.1884614 7.449230 1.6460328
## ACF1 Theil's U
## Training set 0.1427398 NA
## Test set 0.2723417 0.4721114
Neither model beats the seasonal naive approach from Exercise 8 of Section 3.7. Note that on the test set the non-damped Holt-Winters model is actually the more accurate of the two (RMSE 10.87 vs 12.33 for the damped model).
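For reference, the seasonal naive benchmark can be recomputed on the same train/test split (a sketch; sn.retail is a hypothetical name):
sn.retail <- snaive(myts.train, h = 36)
accuracy(sn.retail, myts.test)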
For the same retail data, try an STL decomposition applied to the Box-Cox transformed series, followed by ETS on the seasonally adjusted data. How does that compare with your best previous forecasts on the test set?
# Look for lambda
lambda <- BoxCox.lambda(myts.train)
# STL decomposition applied to the Box-Cox transformed series, followed by
# forecasts of the seasonally adjusted data (stlf handles both steps)
stlf_boxcox_myts.train <- stlf(myts.train, lambda = lambda)
# ETS fitted to the seasonally adjusted data
ets_myts.train <- ets(seasadj(decompose(myts.train, "multiplicative")))
Comparing the two on the test set:
# Box-Cox-STLF transformation
accuracy(stlf_boxcox_myts.train, myts.test)
## ME RMSE MAE MPE MAPE MASE
## Training set -0.029466 2.993223 2.212575 -0.107248 3.677062 0.4517952
## Test set -10.727124 16.213436 11.932970 -10.811284 11.844987 2.4366439
## ACF1 Theil's U
## Training set 0.08742471 NA
## Test set 0.25644648 0.7557907
# ETS on the seasonally adjusted data: forecast, then restore the seasonal
# component before scoring against the raw test set (a sketch; seas.idx
# repeats the last year of seasonal indices over the 36 forecast months)
ets.fc <- forecast(ets_myts.train, h = 36)
seas.idx <- rep(tail(decompose(myts.train, "multiplicative")$seasonal, 12), 3)
accuracy(ets.fc$mean * seas.idx, myts.test)
On the training data the STLF model with a Box-Cox transformation has the lowest RMSE so far (2.99), but its test-set RMSE (16.21) is worse than that of the Holt-Winters models above, so it does not improve on the best previous forecasts.
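A compact way to line up the test-set RMSEs of the candidates fitted above (a sketch reusing those model objects):
c(HW = accuracy(myts.train.nd, myts.test)["Test set", "RMSE"],
  HW.damped = accuracy(myts.train.damped, myts.test)["Test set", "RMSE"],
  STLF.BoxCox = accuracy(stlf_boxcox_myts.train, myts.test)["Test set", "RMSE"])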