DATA 624: Homework 05
Libraries
Exercise 7.1
Consider the pigs series — the number of pigs slaughtered in Victoria each month.
a)Use the ses() function in R to find the optimal values of \(\alpha\) and \(\ell_0\), and generate forecasts for the next four months.
Let’s look at the data first:
## Time-Series [1:188] from 1980 to 1996: 76378 71947 33873 96428 105084 ...
## Min. 1st Qu. Median Mean 3rd Qu. Max.
## 33873 79080 91662 90640 101493 120184
According to the R documentation (https://www.rdocumentation.org/packages/forecast/versions/8.13/topics/ses),
ses() returns forecasts and other information for exponential smoothing forecasts applied to y. Here h is the number of periods for forecasting.
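A minimal sketch of the call that produces the output below (assuming the forecast/fpp2 packages loaded in the Libraries section; the object name fc is illustrative):
fc <- ses(pigs, h = 4) # h = 4 gives forecasts for the next four months
summary(fc)            # prints the forecasts, alpha, l0, sigma, and AIC/AICc/BIC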
## Point Forecast Lo 80 Hi 80 Lo 95 Hi 95
## Sep 1995 98816.41 85605.43 112027.4 78611.97 119020.8
## Oct 1995 98816.41 85034.52 112598.3 77738.83 119894.0
## Nov 1995 98816.41 84486.34 113146.5 76900.46 120732.4
## Dec 1995 98816.41 83958.37 113674.4 76092.99 121539.8
## Simple exponential smoothing
##
## Call:
## ses(y = pigs, h = 4)
##
## Smoothing parameters:
## alpha = 0.2971
##
## Initial states:
## l = 77260.0561
##
## sigma: 10308.58
##
## AIC AICc BIC
## 4462.955 4463.086 4472.665
We can see from the output above that \(\alpha\) = 0.2971 and \(\ell_0\) = 77260.0561.
b)Compute a 95% prediction interval for the first forecast using \(\hat{y} \pm 1.96s\), where s is the standard deviation of the residuals. Compare your interval with the interval produced by R.
Lower bound using the formula given:
## [1] 78679.97
Upper bound using the formula given:
## [1] 118952.8
Let’s compute the interval using R:
## [1] 78611.97
## [1] 119020.8
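A sketch of both computations, assuming fc is the ses() fit from part (a):
s <- sd(residuals(fc)) # standard deviation of the residuals
fc$mean[1] - 1.96 * s  # lower bound from the formula
fc$mean[1] + 1.96 * s  # upper bound from the formula
fc$lower[1, "95%"]     # lower bound produced by R
fc$upper[1, "95%"]     # upper bound produced by R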
We can see that the lower bound of the 95% interval computed with the formula is slightly higher than the one computed by R, and the upper bound is slightly lower; the manual interval is slightly narrower, likely because R's interval is based on the model's estimate of sigma (10308.58 above) rather than the sample standard deviation of the residuals.
Exercise 7.5
Data set books contains the daily sales of paperback and hardcover books at the same store. The task is to forecast the next four days’ sales for paperback and hardcover books.
a)Plot the series and discuss the main features of the data.
Let’s look at the data first:
## Time-Series [1:30, 1:2] from 1 to 30: 199 172 111 209 161 119 195 195 131 183 ...
## - attr(*, "dimnames")=List of 2
## ..$ : NULL
## ..$ : chr [1:2] "Paperback" "Hardcover"
## Paperback Hardcover
## Min. :111.0 Min. :128.0
## 1st Qu.:167.2 1st Qu.:170.5
## Median :189.0 Median :200.5
## Mean :186.4 Mean :198.8
## 3rd Qu.:207.2 3rd Qu.:222.0
## Max. :247.0 Max. :283.0
## [1] 5965
## [1] 5592
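The observations below are based on a time plot of both series; a minimal sketch of such a plot:
autoplot(books) +
  ggtitle("Daily Sales of Paperback and Hardcover Books") +
  xlab("Day") + ylab("Books Sold")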
- We can see that the average (mean) number of hardcover books sold is higher than that of paperbacks
- Total hardcover books sold (5965) is more than total paperbacks sold (5592)
- Both paperback and hardcover sales have a positive upward trend
- Possibly, if we had a much longer series, we would see that this upward trend is actually part of a long cycle, but when viewed over only 30 days it appears to be a trend.
- We do not see any seasonality, as there is no pattern at a fixed or known frequency
b)Use the ses() function to forecast each series, and plot the forecasts.
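A sketch of the fits behind the output below (the object names match those used in the plotting code further down):
ses_paperback <- ses(books[, "Paperback"], h = 4) # SES fit for paperbacks
ses_hardcover <- ses(books[, "Hardcover"], h = 4) # SES fit for hardcovers
summary(ses_paperback)
summary(ses_hardcover)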
##
## Forecast method: Simple exponential smoothing
##
## Model Information:
## Simple exponential smoothing
##
## Call:
## ses(y = books[, "Paperback"], h = 4)
##
## Smoothing parameters:
## alpha = 0.1685
##
## Initial states:
## l = 170.8271
##
## sigma: 34.8183
##
## AIC AICc BIC
## 318.9747 319.8978 323.1783
##
## Error measures:
## ME RMSE MAE MPE MAPE MASE
## Training set 7.175981 33.63769 27.8431 0.4736071 15.57784 0.7021303
## ACF1
## Training set -0.2117522
##
## Forecasts:
## Point Forecast Lo 80 Hi 80 Lo 95 Hi 95
## 31 207.1097 162.4882 251.7311 138.8670 275.3523
## 32 207.1097 161.8589 252.3604 137.9046 276.3147
## 33 207.1097 161.2382 252.9811 136.9554 277.2639
## 34 207.1097 160.6259 253.5935 136.0188 278.2005
##
## Forecast method: Simple exponential smoothing
##
## Model Information:
## Simple exponential smoothing
##
## Call:
## ses(y = books[, "Hardcover"], h = 4)
##
## Smoothing parameters:
## alpha = 0.3283
##
## Initial states:
## l = 149.2861
##
## sigma: 33.0517
##
## AIC AICc BIC
## 315.8506 316.7737 320.0542
##
## Error measures:
## ME RMSE MAE MPE MAPE MASE
## Training set 9.166735 31.93101 26.77319 2.636189 13.39487 0.7987887
## ACF1
## Training set -0.1417763
##
## Forecasts:
## Point Forecast Lo 80 Hi 80 Lo 95 Hi 95
## 31 239.5601 197.2026 281.9176 174.7799 304.3403
## 32 239.5601 194.9788 284.1414 171.3788 307.7414
## 33 239.5601 192.8607 286.2595 168.1396 310.9806
## 34 239.5601 190.8347 288.2855 165.0410 314.0792
autoplot(books[, "Paperback"], series = "Paperback") +
autolayer(ses_paperback, series = "Paperback") +
autolayer(books[, "Hardcover"], series = "Hardcover") +
autolayer(ses_hardcover, series = "Hardcover", PI = FALSE) +ggtitle("Daily Sales of Books")+xlab("Day")+ylab("Books Sold")c)Compute the RMSE values for the training data in each case.
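A sketch of the accuracy computation behind the tables below (assuming the fitted objects from part b):
round(accuracy(ses_paperback), 2)
round(accuracy(ses_hardcover), 2)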
## ME RMSE MAE MPE MAPE MASE ACF1
## Training set 7.18 33.64 27.84 0.47 15.58 0.7 -0.21
## ME RMSE MAE MPE MAPE MASE ACF1
## Training set 9.17 31.93 26.77 2.64 13.39 0.8 -0.14
Paperback has an RMSE of 33.64, while hardcover has 31.93, for simple exponential smoothing.
Exercise 7.6
We will continue with the daily sales of paperback and hardcover books in data set books.
a)Apply Holt’s linear method to the paperback and hardback series and compute four-day forecasts in each case.
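A sketch of the fits behind the output below (object names match the plotting code further down):
holt_paperback <- holt(books[, "Paperback"], h = 4) # Holt's linear method for paperbacks
holt_hardcover <- holt(books[, "Hardcover"], h = 4) # Holt's linear method for hardcovers
summary(holt_paperback)
summary(holt_hardcover)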
##
## Forecast method: Holt's method
##
## Model Information:
## Holt's method
##
## Call:
## holt(y = books[, "Paperback"], h = 4)
##
## Smoothing parameters:
## alpha = 1e-04
## beta = 1e-04
##
## Initial states:
## l = 170.699
## b = 1.2621
##
## sigma: 33.4464
##
## AIC AICc BIC
## 318.3396 320.8396 325.3456
##
## Error measures:
## ME RMSE MAE MPE MAPE MASE
## Training set -3.717178 31.13692 26.18083 -5.508526 15.58354 0.6602122
## ACF1
## Training set -0.1750792
##
## Forecasts:
## Point Forecast Lo 80 Hi 80 Lo 95 Hi 95
## 31 209.4668 166.6035 252.3301 143.9130 275.0205
## 32 210.7177 167.8544 253.5811 145.1640 276.2715
## 33 211.9687 169.1054 254.8320 146.4149 277.5225
## 34 213.2197 170.3564 256.0830 147.6659 278.7735
##
## Forecast method: Holt's method
##
## Model Information:
## Holt's method
##
## Call:
## holt(y = books[, "Hardcover"], h = 4)
##
## Smoothing parameters:
## alpha = 1e-04
## beta = 1e-04
##
## Initial states:
## l = 147.7935
## b = 3.303
##
## sigma: 29.2106
##
## AIC AICc BIC
## 310.2148 312.7148 317.2208
##
## Error measures:
## ME RMSE MAE MPE MAPE MASE
## Training set -0.1357882 27.19358 23.15557 -2.114792 12.1626 0.6908555
## ACF1
## Training set -0.03245186
##
## Forecasts:
## Point Forecast Lo 80 Hi 80 Lo 95 Hi 95
## 31 250.1739 212.7390 287.6087 192.9222 307.4256
## 32 253.4765 216.0416 290.9113 196.2248 310.7282
## 33 256.7791 219.3442 294.2140 199.5274 314.0308
## 34 260.0817 222.6468 297.5166 202.8300 317.3334
autoplot(books[, "Paperback"], series = "Paperback") +
autolayer(holt_paperback, series = "Paperback") +
autolayer(books[, "Hardcover"], series = "Hardcover") +
autolayer(holt_hardcover, series = "Hardcover", PI = FALSE) +ggtitle("Daily Sales of Books")+xlab("Day")+ylab("Books Sold")b)Compare the RMSE measures of Holt’s method for the two series to those of simple exponential smoothing in the previous question. (Remember that Holt’s method is using one more parameter than SES.) Discuss the merits of the two forecasting methods for these data sets.
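A sketch of the accuracy computation behind the tables below:
round(accuracy(holt_paperback), 2)
round(accuracy(holt_hardcover), 2)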
## ME RMSE MAE MPE MAPE MASE ACF1
## Training set -3.72 31.14 26.18 -5.51 15.58 0.66 -0.18
## ME RMSE MAE MPE MAPE MASE ACF1
## Training set -0.14 27.19 23.16 -2.11 12.16 0.69 -0.03
- Paperback has an RMSE of 33.64 with simple exponential smoothing, while Holt’s method gives 31.14
- Hardcover has an RMSE of 31.93 with simple exponential smoothing, while Holt’s method gives 27.19
- So we see a lower RMSE for both series when Holt’s method is used
- Holt’s method is suitable for data with a trend; it forecast the books time series better because the series has a clear upward trend
c)Compare the forecasts for the two series using both methods. Which do you think is best?
autoplot(books[,"Paperback"]) +
autolayer(holt_paperback, series="Holt's method", PI=FALSE) +
autolayer(ses_paperback, series="SES method", PI=FALSE) +
ggtitle("Daily Sales of Paperback books") + xlab("Day") +
ylab("Books Sold") +
guides(colour=guide_legend(title="Forecast"))autoplot(books[,"Hardcover"]) +
autolayer(holt_hardcover, series="Holt's method", PI=FALSE) +
autolayer(ses_hardcover, series="SES method", PI=FALSE) +
ggtitle("Daily Sales of hardcover books") + xlab("Day") +
ylab("Books Sold") +
guides(colour=guide_legend(title="Forecast"))Holt’s method is best for book time series as the it includes trend while computing forcast while simple exponential smoothing produce a constant forcast. Also, Holt’s method produces lower RMSE for both paperback and harcover series.
d)Calculate a 95% prediction interval for the first forecast for each series, using the RMSE values and assuming normal errors. Compare your intervals with those produced using ses and holt.
Paperback
s <- round(accuracy(holt_paperback), 2)[2]
cat("RMSE:",
    holt_paperback$mean[1] - (1.96 * s), "to",
    holt_paperback$mean[1] + (1.96 * s))
## RMSE: 148.4324 to 270.5012
cat("\nSES:",
    ses(books[, "Paperback"], level = 95, h = 4)$lower[1], "to",
    ses(books[, "Paperback"], level = 95, h = 4)$upper[1])
## SES: 138.867 to 275.3523
cat("\nHolt:",
    holt(books[, "Paperback"], level = 95, h = 4)$lower[1], "to",
    holt(books[, "Paperback"], level = 95, h = 4)$upper[1])
## Holt: 143.913 to 275.0205
Hardcover
s <- round(accuracy(holt_hardcover), 2)[2]
cat("RMSE:",
    holt_hardcover$mean[1] - (1.96 * s), "to",
    holt_hardcover$mean[1] + (1.96 * s))
## RMSE: 196.8815 to 303.4663
cat("\nSES:",
    ses(books[, "Hardcover"], level = 95, h = 4)$lower[1], "to",
    ses(books[, "Hardcover"], level = 95, h = 4)$upper[1])
## SES: 174.7799 to 304.3403
cat("\nHolt:",
    holt(books[, "Hardcover"], level = 95, h = 4)$lower[1], "to",
    holt(books[, "Hardcover"], level = 95, h = 4)$upper[1])
## Holt: 192.9222 to 307.4256
For both series, the manual intervals based on the RMSE are slightly narrower than the intervals produced by ses() and holt().
Exercise 7.7
For this exercise use data set eggs, the price of a dozen eggs in the United States from 1900–1993. Experiment with the various options in the holt() function to see how much the forecasts change with damped trend, or with a Box-Cox transformation. Try to develop an intuition of what each argument is doing to the forecasts.
[Hint: use h=100 when calling holt() so you can clearly see the differences between the various options when plotting the forecasts.]
Which model gives the best RMSE?
Let’s look at the data first:
## Time-Series [1:94] from 1900 to 1993: 277 315 315 321 315 ...
## Min. 1st Qu. Median Mean 3rd Qu. Max.
## 62.27 148.87 209.15 206.15 276.71 358.78
We can see an obvious downward trend in the price of a dozen eggs.
According to the R documentation (https://www.rdocumentation.org/packages/aTSA/versions/3.1.2/topics/Holt),
holt() performs Holt’s two-parameter exponential smoothing for a linear or damped trend.
Using holt() without any options
We can see that holt() alone produces an unrealistic forecast (negative values), as the price of a dozen eggs cannot be negative.
Using holt() with damped = TRUE
The forecasted price stays above zero but does not include the downward trend that is obvious from the data.
Using holt() with a Box-Cox transformation
The forecasted price stays above zero and includes the downward trend that is obvious from the data.
Using holt() with a Box-Cox transformation and a damped trend
The forecasted price stays above zero but does not include the downward trend that is obvious from the data; also, the lower end of the prediction interval is close to zero.
Let’s plot all methods at once:
fc_holt <- holt(eggs, h = 100) # renamed to avoid shadowing forecast::holt
holt_damped <- holt(eggs, damped = TRUE, h = 100)
holt_boxcox <- holt(eggs, lambda = BoxCox.lambda(eggs), h = 100)
holt_damped_boxcox <- holt(eggs, lambda = BoxCox.lambda(eggs), h = 100, damped = TRUE)
autoplot(eggs) +
  autolayer(fc_holt, series = "Holt's method", PI = FALSE) +
  autolayer(holt_damped, series = "Damped Holt's method", PI = FALSE) +
  autolayer(holt_boxcox, series = "Box-Cox", PI = FALSE) +
  autolayer(holt_damped_boxcox, series = "Box-Cox-Damped", PI = FALSE) +
  ggtitle("Forecasts from Holt's method") + xlab("Year") +
  ylab("Eggs") +
  guides(colour = guide_legend(title = "Forecast"))
Let’s see which model gives the best RMSE:
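A sketch of how these RMSEs can be extracted (assuming the fitted objects above):
cat("RMSE of Holt's method:", accuracy(fc_holt)[1, "RMSE"], "\n")
cat("RMSE of Holt's damped method:", accuracy(holt_damped)[1, "RMSE"], "\n")
cat("RMSE of Holt's method with box-cox:", accuracy(holt_boxcox)[1, "RMSE"], "\n")
cat("RMSE of Holt's damped method with box-cox:", accuracy(holt_damped_boxcox)[1, "RMSE"], "\n")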
## RMSE of Holt's method: 26.58219
## RMSE of Holt's damped method: 26.54019
## RMSE of Holt's method with box-cox: 26.39376
## RMSE of Holt's damped method with box-cox: 26.53321
Based on the accuracy scores, Holt’s method with the Box-Cox transformation gives the best RMSE.
Exercise 7.8
Recall your retail time series data (from Exercise 3 in Section 2.10).
a)Why is multiplicative seasonality necessary for this series?
Let’s look at the retail data from Exercise 3 in Section 2.10
retaildata <- readxl::read_excel("retail.xlsx", skip=1)
myts <- ts(retaildata[,"A3349873A"], frequency=12, start=c(1982,4))
ggplotly(autoplot(myts))
hw_additive <- hw(myts, seasonal = "additive", h = 100)
hw_multiplicative <- hw(myts, seasonal = "multiplicative", h = 100)
autoplot(myts) +
  autolayer(hw_additive, series = "HW additive forecasts", PI = FALSE) +
  autolayer(hw_multiplicative, series = "HW multiplicative forecasts", PI = FALSE) +
  xlab("Time") +
  ylab("myts") +
  ggtitle("Retail Sales") +
  guides(colour = guide_legend(title = "Forecast"))
In this case, the method with multiplicative seasonality fits the data best. This was to be expected, as the time plot shows that the seasonal variation in the data increases as the level of the series increases. This is also reflected in the two sets of forecasts; the forecasts generated by the method with multiplicative seasonality display larger and increasing seasonal variation as the level of the forecasts increases, compared to the forecasts generated by the method with additive seasonality.
b)Apply Holt-Winters’ multiplicative method to the data. Experiment with making the trend damped.
hw_multiplicative <- hw(myts, seasonal = "multiplicative", h = 100)
hw_multiplicative_damped <- hw(myts, seasonal = "multiplicative", damped = TRUE, h = 100)
autoplot(myts) +
  autolayer(hw_multiplicative, series = "HW multiplicative forecasts", PI = FALSE) +
  autolayer(hw_multiplicative_damped, series = "HW multiplicative damped forecasts", PI = FALSE) +
  xlab("Time") +
  ylab("myts") +
  ggtitle("Retail Sales") +
  guides(colour = guide_legend(title = "Forecast"))
c)Compare the RMSE of the one-step forecasts from the two methods. Which do you prefer?
hw_multiplicative_one_step <- hw(myts, seasonal = "multiplicative", h = 1)
accuracy(hw_multiplicative_one_step)
## ME RMSE MAE MPE MAPE MASE
## Training set 0.1170648 13.29378 8.991856 -0.1217735 3.918351 0.4748948
## ACF1
## Training set 0.08635577
hw_multiplicative_damped_one_step <- hw(myts, seasonal = "multiplicative", damped = TRUE, h = 1)
accuracy(hw_multiplicative_damped_one_step)
## ME RMSE MAE MPE MAPE MASE
## Training set 1.414869 13.30494 9.042151 0.6105987 3.959617 0.4775511
## ACF1
## Training set 0.04077895
The Holt-Winters multiplicative method gives a slightly better RMSE, but its forecasts keep increasing when plotted for 100 steps, which might result in over-forecasting. So I think the damped Holt-Winters multiplicative method is more suitable in this case.
d)Check that the residuals from the best method look like white noise.
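A sketch of the residual check that produces the test below (assuming hw_multiplicative_damped is the damped model fitted in part b):
checkresiduals(hw_multiplicative_damped)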
##
## Ljung-Box test
##
## data: Residuals from Damped Holt-Winters' multiplicative method
## Q* = 42.932, df = 7, p-value = 3.437e-07
##
## Model df: 17. Total lags used: 24
Given the very small p-value of the Ljung-Box test, the residuals do not look like white noise.
e)Now find the test set RMSE, while training the model to the end of 2010. Can you beat the seasonal naïve approach from Exercise 8 in Section 3.7?
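A sketch of the train/test split and the seasonal naive benchmark whose accuracy is shown first below (the split point follows the exercise statement; the object name snaive_myts is illustrative):
myts_train <- window(myts, end = c(2010, 12)) # train through the end of 2010
myts_test <- window(myts, start = 2011)       # test from 2011 onward
snaive_myts <- snaive(myts_train, h = length(myts_test))
accuracy(snaive_myts, myts_test)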
## ME RMSE MAE MPE MAPE MASE
## Training set 7.772973 20.24576 15.95676 4.702754 8.109777 1.000000
## Test set 55.300000 71.44309 55.78333 14.900996 15.082019 3.495907
## ACF1 Theil's U
## Training set 0.7385090 NA
## Test set 0.5315239 1.297866
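The next table presumably comes from a damped multiplicative fit on the training data (a sketch; the object name is illustrative):
hw_damped_train <- hw(myts_train, seasonal = "multiplicative", damped = TRUE)
accuracy(hw_damped_train, myts_test)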
## ME RMSE MAE MPE MAPE MASE
## Training set 0.4556121 8.681456 6.24903 0.2040939 3.151257 0.3916228
## Test set 67.4739545 81.946499 67.47395 18.7005373 18.700537 4.2285507
## ACF1 Theil's U
## Training set -0.01331859 NA
## Test set 0.42718471 1.526275
hw_multiplicative <- hw(myts_train, seasonal = "multiplicative")
accuracy(hw_multiplicative, myts_test)
## ME RMSE MAE MPE MAPE
## Training set 0.03021223 9.107356 6.553533 0.001995484 3.293399
## Test set 55.33771444 70.116586 55.337714 15.173207794 15.173208
## MASE ACF1 Theil's U
## Training set 0.4107058 0.02752875 NA
## Test set 3.4679801 0.35287516 1.30502
Using the Holt-Winters multiplicative method with a damped trend, I could not beat the seasonal naive method, but I was able to beat it using the (non-damped) Holt-Winters multiplicative method.
Exercise 7.9
For the same retail data, try an STL decomposition applied to the Box-Cox transformed series, followed by ETS on the seasonally adjusted data. How does that compare with your best previous forecasts on the test set?
STL
stl_boxcox_myts_train <- stlf(myts_train, lambda = BoxCox.lambda(myts_train))
autoplot(myts_train, series = "train") +
  autolayer(stl_boxcox_myts_train, series = "STL")
## ME RMSE MAE MPE MAPE MASE
## Training set -0.646515 8.147517 5.855529 -0.3001347 2.885877 0.3669624
## Test set 56.462144 71.117890 56.462144 15.5079003 15.507900 3.5384474
## ACF1 Theil's U
## Training set 0.02389584 NA
## Test set 0.35938690 1.322933
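For reference, a sketch of how the table above can be reproduced, using the objects defined earlier:
accuracy(stl_boxcox_myts_train, myts_test)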
ETS
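A sketch of the fit behind the output below (per the Call shown in the output): ETS applied to the seasonally adjusted training series:
ets_myts_train <- ets(seasadj(decompose(myts_train, "multiplicative")))
summary(ets_myts_train)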
## ETS(M,A,N)
##
## Call:
## ets(y = seasadj(decompose(myts_train, "multiplicative")))
##
## Smoothing parameters:
## alpha = 0.5989
## beta = 1e-04
##
## Initial states:
## l = 65.5579
## b = 0.7729
##
## sigma: 0.0403
##
## AIC AICc BIC
## 3412.805 3412.982 3432.023
##
## Training set error measures:
## ME RMSE MAE MPE MAPE MASE
## Training set -0.2829213 8.551268 6.201346 -0.2055279 3.091856 0.3886063
## ACF1
## Training set 0.02876641
autoplot(myts_train, series = "Train") +
  autolayer(forecast(ets_myts_train, h = 100, PI = FALSE), series = "Forecast")
## ME RMSE MAE MPE MAPE MASE
## Training set -0.2829213 8.551268 6.201346 -0.2055279 3.091856 0.3886063
## Test set 54.2301189 89.745380 56.566739 13.0296897 13.908396 3.5447449
## ACF1 Theil's U
## Training set 0.02876641 NA
## Test set 0.52413835 1.475004
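For reference, a sketch of how the test-set rows above can presumably be produced (the forecast horizon is chosen to cover the test period):
accuracy(forecast(ets_myts_train, h = length(myts_test)), myts_test)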
STL performs better than ETS, with a lower RMSE on both the training and test sets, but the Holt-Winters multiplicative method performs best on the test set.