Question 7.1

Consider the pigs series — the number of pigs slaughtered in Victoria each month.

a.

Use the ses() function in R to find the optimal values of α and \(ℓ_0\), and generate forecasts for the next four months.

fc<-ses(pigs, h=4)
fc$model
## Simple exponential smoothing 
## 
## Call:
##  ses(y = pigs, h = 4) 
## 
##   Smoothing parameters:
##     alpha = 0.2971 
## 
##   Initial states:
##     l = 77260.0561 
## 
##   sigma:  10308.58
## 
##      AIC     AICc      BIC 
## 4462.955 4463.086 4472.665
#Forecast
fc
##          Point Forecast    Lo 80    Hi 80    Lo 95    Hi 95
## Sep 1995       98816.41 85605.43 112027.4 78611.97 119020.8
## Oct 1995       98816.41 85034.52 112598.3 77738.83 119894.0
## Nov 1995       98816.41 84486.34 113146.5 76900.46 120732.4
## Dec 1995       98816.41 83958.37 113674.4 76092.99 121539.8
autoplot(fc)+ autolayer(fitted(fc),series="Fitted")+xlab('Year')+
  ylab('Number of pigs slaughtered')

The optimal value of α is 0.2971 and \(ℓ_0\) = 77260.06.
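
As a sanity check, the flat SES forecast can be reproduced from the level recursion \(ℓ_t = αy_t + (1-α)ℓ_{t-1}\). A minimal sketch, assuming the fitted parameters are stored in fc$model$par under the names alpha and l:

#Run the level recursion manually from the fitted alpha and l0;
#the final level is the one-step-ahead forecast (~98816.41)
alpha <- fc$model$par["alpha"]
level <- fc$model$par["l"]
for (y in pigs) level <- alpha*y + (1-alpha)*level
unname(level)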

b.

Compute a 95% prediction interval for the first forecast using y ± 1.96s where s is the standard deviation of the residuals. Compare your interval with the interval produced by R.

s<-sd(fc$residuals)
hi<-fc$mean+1.96*s
hi
##           Sep      Oct      Nov      Dec
## 1995 118952.8 118952.8 118952.8 118952.8
lo<-fc$mean-1.96*s
lo
##           Sep      Oct      Nov      Dec
## 1995 78679.97 78679.97 78679.97 78679.97

The upper bound of my 95% prediction interval is 118952.8 and the lower bound is 78679.97. Compared with the Lo 95 and Hi 95 values produced by R, my calculated interval is slightly narrower.
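
The small gap arises because R bases its interval on the model's estimated σ (10308.58 above) rather than on the sample standard deviation of the residuals. A quick check, assuming the fitted model stores its error variance as fc$model$sigma2:

#Compare the two half-widths: sd of residuals vs the model's sigma
sigma_hat <- sqrt(fc$model$sigma2)  #matches the sigma printed by fc$model
c(manual = 1.96*s, R = 1.96*sigma_hat)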

Question 7.5

Data set books contains the daily sales of paperback and hardcover books at the same store. The task is to forecast the next four days’ sales for paperback and hardcover books.

a.

Plot the series and discuss the main features of the data.

autoplot(books)+ggtitle("Sales of paperback and hardcover books")

The graph shows the daily sales of paperback and hardcover books. Both series trend upward over the month, and hardcover sales end the month higher than paperback sales. To better observe any remaining features, we can use decomposition.

books2 <- ts(books, frequency=7)
autoplot(decompose(books2[, 1]))+ggtitle("Decomposition of additive time series for Paperback") 

autoplot(decompose(books2[, 2]))+ggtitle("Decomposition of additive time series for Hardcover") 

The upward trend is confirmed by the trend component of the decomposition. Neither series shows clear seasonality, and there are some outliers at the beginning of the month.
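
One way to double-check the claim of no seasonality is to look at the autocorrelations; with daily data, a weekly pattern would show up as a spike at lag 7. A sketch:

#ACF of each series; weekly seasonality would appear as a spike at lag 7
ggAcf(books[,1])+ggtitle("ACF: Paperback")
ggAcf(books[,2])+ggtitle("ACF: Hardcover")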

b.

Use the ses() function to forecast each series, and plot the forecasts.

bk1<-ses(books[,1], h=4)
bk1
##    Point Forecast    Lo 80    Hi 80    Lo 95    Hi 95
## 31       207.1097 162.4882 251.7311 138.8670 275.3523
## 32       207.1097 161.8589 252.3604 137.9046 276.3147
## 33       207.1097 161.2382 252.9811 136.9554 277.2639
## 34       207.1097 160.6259 253.5935 136.0188 278.2005
bk2<-ses(books[,2], h=4)
bk2
##    Point Forecast    Lo 80    Hi 80    Lo 95    Hi 95
## 31       239.5601 197.2026 281.9176 174.7799 304.3403
## 32       239.5601 194.9788 284.1414 171.3788 307.7414
## 33       239.5601 192.8607 286.2595 168.1396 310.9806
## 34       239.5601 190.8347 288.2855 165.0410 314.0792
autoplot(books[,1],series = "Paperback")+autolayer(bk1,series = "Paperback")+
  autolayer(books[,2],series = "Hardcover")+autolayer(bk2,series = "Hardcover", PI = FALSE)+ylab('Sales')

c.

Compute the RMSE values for the training data in each case.

accuracy(bk1)[2]
## [1] 33.63769
accuracy(bk2)[2]
## [1] 31.93101

The training RMSE for the paperback and hardcover series is 33.63769 and 31.93101, respectively.
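
These values can be verified directly from the residuals, since the training RMSE is just the root mean squared one-step residual. A sketch:

#Manual check of the training RMSE
sqrt(mean(residuals(bk1)^2))
sqrt(mean(residuals(bk2)^2))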

Question 7.6

We will continue with the daily sales of paperback and hardcover books in the data set books.

a.

Apply Holt’s linear method to the paperback and hardback series and compute four-day forecasts in each case.

hl1<-holt(books[,1],h=4)
hl2<-holt(books[,2],h=4)
hl1
##    Point Forecast    Lo 80    Hi 80    Lo 95    Hi 95
## 31       209.4668 166.6035 252.3301 143.9130 275.0205
## 32       210.7177 167.8544 253.5811 145.1640 276.2715
## 33       211.9687 169.1054 254.8320 146.4149 277.5225
## 34       213.2197 170.3564 256.0830 147.6659 278.7735
hl2
##    Point Forecast    Lo 80    Hi 80    Lo 95    Hi 95
## 31       250.1739 212.7390 287.6087 192.9222 307.4256
## 32       253.4765 216.0416 290.9113 196.2248 310.7282
## 33       256.7791 219.3442 294.2140 199.5274 314.0308
## 34       260.0817 222.6468 297.5166 202.8300 317.3334
autoplot(books[,1],series ="paperback")+autolayer(books[,2],series ="hardcover")+
  autolayer(hl1,series ="paperback" )+autolayer(hl2,series ="hardcover",PI=FALSE )

b.

Compare the RMSE measures of Holt’s method for the two series to those of simple exponential smoothing in the previous question. (Remember that Holt’s method is using one more parameter than SES.) Discuss the merits of the two forecasting methods for these data sets.

RMSE<-data.frame(accuracy(hl1)[2],accuracy(hl2)[2],accuracy(bk1)[2],accuracy(bk2)[2])
colnames(RMSE)<-c("Paperback SES","Hardcover SES","Paperback Holt","Hardcover holt")
RMSE

Holt’s linear method extends simple exponential smoothing to allow forecasting of data with a trend, so the forecast function is no longer flat. Simple exponential smoothing does not model trend, and it is the better choice when the data show no clear trend or seasonal pattern.
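
For reference, Holt's component form makes the extra trend term explicit (standard textbook notation):

\[
\begin{aligned}
\hat{y}_{t+h|t} &= \ell_t + h b_t \\
\ell_t &= \alpha y_t + (1-\alpha)(\ell_{t-1} + b_{t-1}) \\
b_t &= \beta^*(\ell_t - \ell_{t-1}) + (1-\beta^*) b_{t-1}
\end{aligned}
\]

SES is the special case in which the trend term \(b_t\) is dropped, which is why its forecast function is flat.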

c.

Compare the forecasts for the two series using both methods. Which do you think is best?

Holt’s method achieves a lower training RMSE than simple exponential smoothing for both series, which is expected since it uses one more parameter. As discussed in the previous question, Holt’s method adds a trend to the model and suits data with a clear trend, as these series have. However, Holt’s method projects a constant trend indefinitely into the future and tends to over-forecast; damping the trend towards a flat line might give more realistic long-run forecasts.
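
A quick way to test the damping idea on these data; a sketch, where damped = TRUE adds the damping parameter \(φ\) to the model:

#Refit Holt with a damped trend and compare training RMSE
hl1d <- holt(books[,1], damped=TRUE, h=4)
hl2d <- holt(books[,2], damped=TRUE, h=4)
c(Paperback = accuracy(hl1d)[2], Hardcover = accuracy(hl2d)[2])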

d.

Calculate a 95% prediction interval for the first forecast for each series, using the RMSE values and assuming normal errors. Compare your intervals with those produced using ses and holt.

pre_sesP<-data.frame(bk1)[1,4:5]
pre_sesH<-data.frame(bk2)[1,4:5]
pre_HoltP<-data.frame(hl1)[1,4:5]
pre_HoltH<-data.frame(hl2)[1,4:5]

SES_Paper<-c(pre_sesP[1],pre_sesP[2])
df<-data.frame(SES_Paper)
df[nrow(df) + 1,] = c(pre_sesH[1],pre_sesH[2])
df[nrow(df) + 1,] = c(pre_HoltP[1],pre_HoltP[2])
df[nrow(df) + 1,] = c(pre_HoltH[1],pre_HoltH[2])
rownames(df)<-c("SES_Paper","SES_Hardcover","Holt_Paper","Holt_Hardcover")
colnames(df)<-c("Lower Limit","Upper Limit")
print(df)
##                Lower Limit Upper Limit
## SES_Paper         138.8670    275.3523
## SES_Hardcover     174.7799    304.3403
## Holt_Paper        143.9130    275.0205
## Holt_Hardcover    192.9222    307.4256

Simple exponential smoothing gives a lower Lo 95 for both the paperback and hardcover series, but Holt’s intervals are actually narrower, which is consistent with its lower training RMSE.
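
The intervals the exercise actually asks for (point forecast ± 1.96 × RMSE, assuming normal errors) can be computed directly and set beside the table above. A sketch; the helper manual_pi is defined here and is not part of the forecast package:

#Manual 95% intervals from the training RMSE
manual_pi <- function(obj) {
  rmse <- accuracy(obj)[2]
  c("Lower Limit" = obj$mean[1]-1.96*rmse, "Upper Limit" = obj$mean[1]+1.96*rmse)
}
rbind(SES_Paper=manual_pi(bk1), SES_Hardcover=manual_pi(bk2),
      Holt_Paper=manual_pi(hl1), Holt_Hardcover=manual_pi(hl2))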

Question 7.7

For this exercise use data set eggs, the price of a dozen eggs in the United States from 1900–1993. Experiment with the various options in the holt() function to see how much the forecasts change with damped trend, or with a Box-Cox transformation. Try to develop an intuition of what each argument is doing to the forecasts.

[Hint: use h=100 when calling holt() so you can clearly see the differences between the various options when plotting the forecasts.]

Which model gives the best RMSE?

egg1<-holt(eggs,h=100)
egg2<-holt(eggs,damped=TRUE,h=100)
egg3<-holt(eggs,lambda = BoxCox.lambda(eggs),h=100)
egg4<-holt(eggs,lambda = BoxCox.lambda(eggs),damped=TRUE,h=100)

autoplot(eggs)+autolayer(egg1, series="Holt", PI=FALSE)+
  autolayer(egg2,series="Holt Dampened",PI=FALSE)+
  autolayer(egg3,series="Holt Box-Cox",PI=FALSE)+ 
  autolayer(egg4,series="Holt Box-Cos Dampened",PI=FALSE)+
  xlab("Year")+ylab("price of dozen eggs")+ggtitle("Price of dozen eggs in US, 1900–1993")

df_RMSE<-data.frame(accuracy(egg1)[2],accuracy(egg2)[2],accuracy(egg3)[2],accuracy(egg4)[2])
colnames(df_RMSE)<-c("Holt","Holt Dampened","Holt Box-Cox","Holt Box-Cos Dampened")
rownames(df_RMSE)<-"RMSE"
df_RMSE

The Holt method without damping predicts that the price decreases indefinitely, which would eventually make the price of eggs negative; that is unrealistic. The damped Holt and damped Holt Box-Cox methods are similar and predict an almost flat price over the next 100 years. The Holt Box-Cox method predicts a continuing downward trend while staying positive, which looks more plausible than the other methods. It also gives the smallest RMSE, 26.39376.
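
To see what each option is doing, it helps to print the automatically selected Box-Cox \(λ\) and the damping parameter \(φ\) estimated for the damped model. A sketch, assuming the usual forecast-package object structure:

#The selected Box-Cox lambda and the damping parameter of egg2
BoxCox.lambda(eggs)
egg2$model$par["phi"]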

Question 7.8

Recall your retail time series data (from Exercise 3 in Section 2.10).

retaildata <- readxl::read_excel("retail.xlsx", skip=1)
myts <- ts(retaildata[,"A3349873A"],
  frequency=12, start=c(1982,4))
autoplot(myts)

a.

Why is multiplicative seasonality necessary for this series?

autoplot(myts)

When we check the plot, we notice that the seasonal variation is not roughly constant through the series. The seasonal swings grow in proportion to the level of the series, and they change dramatically after the year 2000, so multiplicative seasonality is necessary.
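
A season plot gives another view of the widening seasonal swings; a sketch:

#Seasonal pattern by year; the swings widen as the level rises
ggseasonplot(myts, year.labels=FALSE)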

b.

Apply Holt-Winters’ multiplicative method to the data. Experiment with making the trend damped.

rs1<-hw(myts,seasonal = "multiplicative",h=100)
rs2<-hw(myts,seasonal = "multiplicative",damped=TRUE,h=100)
autoplot(myts)+autolayer(rs1,series="multiplicative",PI=FALSE)+
  autolayer(rs2,series="multiplicative with Damped",PI=FALSE)

c.

Compare the RMSE of the one-step forecasts from the two methods. Which do you prefer?

df_book<-data.frame(accuracy(rs1)[2],accuracy(rs2)[2])
colnames(df_book)<-c("multiplicative","multiplicative with Damped")
rownames(df_RMSE)<-"RMSE"
df_book

The RMSE values of the two methods are very similar. In this case, the multiplicative method without damping gives the slightly better RMSE. However, damping guards against the over-forecasting that an undamped trend produces, so I would still consider the damped version first when forecasting far ahead.

d.

Check that the residuals from the best method look like white noise.

checkresiduals(rs2)

## 
##  Ljung-Box test
## 
## data:  Residuals from Damped Holt-Winters' multiplicative method
## Q* = 42.932, df = 7, p-value = 3.437e-07
## 
## Model df: 17.   Total lags used: 24

Several autocorrelation spikes fall outside the blue dashed lines, and the Ljung-Box test p-value is far below 0.05, so the residuals are not white noise.

e.

Now find the test set RMSE, while training the model to the end of 2010. Can you beat the seasonal naïve approach from Exercise 8 in Section 3.7?

myts.train <- window(myts, end=c(2010,12))
myts.test <- window(myts, start=2011)


fc <- snaive(myts.train)
accuracy(fc,myts.test)
##                     ME     RMSE      MAE       MPE      MAPE     MASE
## Training set  7.772973 20.24576 15.95676  4.702754  8.109777 1.000000
## Test set     55.300000 71.44309 55.78333 14.900996 15.082019 3.495907
##                   ACF1 Theil's U
## Training set 0.7385090        NA
## Test set     0.5315239  1.297866
fc2<-hw(myts.train,seasonal = "multiplicative",damped=TRUE)
accuracy(fc2,myts.test)
##                      ME      RMSE      MAE        MPE      MAPE      MASE
## Training set  0.4556121  8.681456  6.24903  0.2040939  3.151257 0.3916228
## Test set     67.4739545 81.946499 67.47395 18.7005373 18.700537 4.2285507
##                     ACF1 Theil's U
## Training set -0.01331859        NA
## Test set      0.42718471  1.526275

The test-set RMSE of the seasonal naïve approach is 71.44309, while the Holt-Winters multiplicative method with damping gives 81.946499. This Holt-Winters method cannot beat the seasonal naïve approach on the test set.
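
For completeness, the non-damped multiplicative method can be scored on the same split; a sketch:

#Test-set RMSE of the non-damped Holt-Winters multiplicative method
fc2b <- hw(myts.train, seasonal="multiplicative", h=36)
accuracy(fc2b, myts.test)["Test set","RMSE"]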

Question 7.9

For the same retail data, try an STL decomposition applied to the Box-Cox transformed series, followed by ETS on the seasonally adjusted data. How does that compare with your best previous forecasts on the test set?

#Box-Cox transformed
lamda<-BoxCox.lambda(myts.train)
myts.train_trans<-BoxCox(myts.train,lamda)

#STL decomposition
mstl(myts.train_trans) %>% autoplot()

#Seasonally adjusted & ETS
fit.ets <- myts.train %>%
  stlm(s.window = 13, robust = TRUE, method = "ets", lambda = lamda)
fc3 <- forecast(fit.ets, h = 36, biasadj = FALSE)
autoplot(myts.train, series = "Train") +
  autolayer(fc3, series = "ETS", PI = FALSE) +
  autolayer(myts.test, series = "Test")

accuracy(fc3,myts.test)
##                      ME      RMSE       MAE        MPE      MAPE      MASE
## Training set -0.6782982  8.583559  5.918078 -0.3254076  2.913104 0.3708823
## Test set     82.1015276 98.384220 82.101528 21.0189982 21.018998 5.1452516
##                    ACF1 Theil's U
## Training set 0.02704667        NA
## Test set     0.52161725  1.679783

The test-set RMSE is 98.384220 after all of these adjustments, which is higher than both the seasonal naïve approach and the damped Holt-Winters multiplicative method. The training-set RMSE of 8.583559 is the lowest of all the models, so the model fits the training data well but generalizes poorly to the test set; it is over-fitting.
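
Putting the three test-set RMSE values side by side makes the ranking clear. A sketch; fc, fc2, and fc3 are the seasonal naïve, damped Holt-Winters, and STL+ETS forecasts fitted above:

#Side-by-side test-set RMSE of the three approaches
data.frame("Seasonal naive"=accuracy(fc,myts.test)["Test set","RMSE"],
           "HW damped"=accuracy(fc2,myts.test)["Test set","RMSE"],
           "STL+ETS"=accuracy(fc3,myts.test)["Test set","RMSE"],
           check.names=FALSE)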