Exponential Smoothing 2

##setwd ('T://00-624 HH Predictive Analytics//HMWk3 Exponential Smoothing')

getwd()

## [1] "T:/00-624 HH Predictive Analytics/HMWk5 Exponential Smoothing"

HA 7.1

Consider the pigs series - the number of pigs slaughtered in Victoria each month. 7.1 a) Use the ses() function in R to find the optimal values of \(\alpha\) and lo, and generate forecasts for the next four months.

library(fpp2)

## Registered S3 method overwritten by 'quantmod':
##   method            from
##   as.zoo.data.frame zoo

## -- Attaching packages ---------------------------------- fpp2 2.4 --

## v ggplot2   3.3.0     v fma       2.4  
## v forecast  8.12      v expsmooth 2.3

##

library(ggplot2)
fc_pigs <- ses(pigs, h=4)
fc_pigs$model

## Simple exponential smoothing 
## 
## Call:
##  ses(y = pigs, h = 4) 
## 
##   Smoothing parameters:
##     alpha = 0.2971 
## 
##   Initial states:
##     l = 77260.0561 
## 
##   sigma:  10308.58
## 
##      AIC     AICc      BIC 
## 4462.955 4463.086 4472.665

fc_pigs

##          Point Forecast    Lo 80    Hi 80    Lo 95    Hi 95
## Sep 1995       98816.41 85605.43 112027.4 78611.97 119020.8
## Oct 1995       98816.41 85034.52 112598.3 77738.83 119894.0
## Nov 1995       98816.41 84486.34 113146.5 76900.46 120732.4
## Dec 1995       98816.41 83958.37 113674.4 76092.99 121539.8

Using the SES function, setting the next 4 months of forcast, we found that the optimal value of alpha to be 0.2971, and the lo is = 77260

HA7.1 b) Compute a 95% prediction interval for the first forecast using y?1.96 s where s is the standard deviation of the residuals. Compare your interval with the interval produced by R.

yhigh = 98816.41 + 1.96*(sd(fc_pigs$residuals)) 
ylow = 98816.41 - 1.96*(sd(fc_pigs$residuals)) 

ylow

## [1] 78679.97

yhigh

## [1] 118952.8

Based on my calculation, the 95% confidence interval is between 78679.97-118,952.8. Based on the R’s calculation, the 95% confidence interval is between 78611.97-119,020.8.

HA 7.5

Data set books contains the daily sales of paperback and hardcover books at the same store. The task is to forecast the next four days’ sales for paperback and hardcover books.

7.5 a) Plot the series and discuss the main features of the data.

autoplot(books) + ylab("Number of Books Sales at the Same Store") + xlab("Day") + ggtitle("Daily Book Sales")

As days goes by from day 1 to day 30, the numbers of daily book sale seem to have an upward trend, in both paperback book sale and hardvover booksale. This pattern seem quite obvious.

It seems that during the first 10 days, there are more paperback book sold than the hard cover books. during day 10 to day 30, however, there seem to be more hard cover books sold than paperback boods. But these pattens are not very obvious, and could be due to random noise.

7.5 b) Use the ses() function to forecast each series, and plot the forecasts.

pbBooks = books[,1]
hcBooks = books[,2]
ses_paper_books <- ses(pbBooks, h=4)
ses_hc_books <- ses(hcBooks, h=4)
autoplot(ses_paper_books) +
  autolayer(fitted(ses_paper_books), series="SES Model") +
  ylab("Number of Paperback Books Sold at the Same Store") + xlab("Day") + ggtitle("Daily Paperback Book Sales")

First, the confidence interval for the prediction is noticably very large, ranging from 80 to about 170, which indicates that this model is not extremely accurate.

Second, the trend line seems stable during the month, although there is somewhat a dip in the middle of the month (from day 10-15). And it seems that people tend to buy more books towards the end of the month (day 25-30), and they tend to buy less books during the beginning of the month (day 1-10). This is somewhat in agreement with the model prior.

autoplot(ses_hc_books) +
  autolayer(fitted(ses_hc_books), series="SES Model") +
  ylab("Number of Harcover Books Sold at the Same Store") + xlab("Day") + ggtitle("Daily Hardcover Books Sales")

First, like before, we have noticed that the 95% confidence interval for the prediction on day 31 is very wide, ranging from about 175-305 books sold Second, unlike before, the daily hardcover books sale clearly shows an upward pattern, from day 1 going up to day 30, and is consistently going up almost every day.

7.5 c) Compute the RMSE values for the training data in each case.

pb_e <- tsCV(pbBooks, ses, h=4)
rmse_pb <- sqrt(mean(pb_e^2, na.rm=TRUE))
rmse_pb

## [1] 38.40918

The RMSE for the simple exponential smoothing model for paperback books is 38.4. The model’s prediction is off on average by about 38 paperback books sold each day.

hc_e <- tsCV(hcBooks, ses, h=4)
rmse_hc <- sqrt(mean(hc_e^2, na.rm=TRUE))
rmse_hc

## [1] 38.74066

The RMSE for the simple exponential smoothing model for hardcover books is 38.7. The model’s prediction is off on average by about 39 hardcover books sold each day.

HA 7.6

a)Now apply Holt’s linear method to the paperback and hardback series and compute four-day forecasts in each case.

fc_holt_pb <- holt(pbBooks, h=4)
fc_holt_hc <- holt(hcBooks, h=4)
autoplot(fc_holt_pb) +
  autolayer(fitted(fc_holt_pb), series="Holt Model", PI=FALSE) +
  autolayer(fitted(ses_paper_books), series="SES Model") +
  ylab("Number of Paperback Books Sales at the Same Store") + xlab("Day") + ggtitle("Daily Paperback Book Sales")

## Warning: Ignoring unknown parameters: PI

Here, the Holt model and SES model largely tells the same story. They are in agreement of each other throught the month, although there is little deviation in the middle of the month (day 10-20) when the SES model shows slight dip of sales, but Holt model seem to disregard that dip and still show us an upward trend regardelss of that dip in SES model. The Holt exponential model incorporates a trend component, which is a straght lne.
The Holt model for paperback book sales displays an increasing trend, which is not straght. The 95% confidence interval for the day 31 prediction is very wide, as I have noticed in above.

autoplot(fc_holt_hc) +
  autolayer(fitted(fc_holt_hc), series="Holt Model", PI=FALSE) +
  autolayer(fitted(ses_hc_books), series="SES Model") +
  ylab("Number of Hardcover Books Sales at the Same Store") + xlab("Day") + ggtitle("Daily Hardcover Book Sales")

## Warning: Ignoring unknown parameters: PI

For the hard cover book sales, both HOLT model and SES model clearly show a very obvious upward trend from beginning of the month till end of the month. The angle of the Holt Model is almost at 45 degrees, which tells us that this upward trend is indisputable.

The SES model line hover around the HOLT model line a little bit. The SES model line is above the Hlot Model line in the middle of the month (day 10-20), while it is below the Holt Model line in the beginning (1-10 day) and end (21-30 day) of the month.

b+c) Compare the RMSE measures of Holt’s method for the two series to those of simple exponential smoothing in the previous question. (Remember that Holt’s method is using one more parameter than SES.) Discuss the merits of the two forecasting methods for these data sets. Compare the forecasts for the two series using both methods. Which do you think is best?

pb_holt_e <- tsCV(pbBooks, holt, h=4)
rmse_holt_pb <- sqrt(mean(pb_holt_e^2, na.rm=TRUE))
rmse_holt_pb

## [1] 46.14786

The RMSE for paperback book sales is 46. This RMSE is higher than the RMSE for the simple exponential smoothing method, meaning that this model (.

hc_holt_e <- tsCV(hcBooks, holt, h=4)
rmse_holt_hc <- sqrt(mean(hc_holt_e^2, na.rm=TRUE))
rmse_holt_hc

## [1] 39.42814

The RMSE for hardcover book sales is 39. This is higher than the RMSE for the simple exponential smoothing method.

The root mean square errors are higher for the Holt model. The Holt model just follows the trend in the data. Even though the RMSE is higher, the 95% confidence interval for the Holt method is narrower and I believe it will do a better job in forecasting than the ses model.

HA7.6d) Calculate a 95% prediction interval for the first forecast for each series, using the RMSE values and assuming normal errors. Compare your intervals with those produced using ses and Holt.

Confidence Interval Paperback Books SES Model

ses_paper_books

##    Point Forecast    Lo 80    Hi 80    Lo 95    Hi 95
## 31       207.1097 162.4882 251.7311 138.8670 275.3523
## 32       207.1097 161.8589 252.3604 137.9046 276.3147
## 33       207.1097 161.2382 252.9811 136.9554 277.2639
## 34       207.1097 160.6259 253.5935 136.0188 278.2005

ses_paper_books$mean[1]-1.96*rmse_pb

## [1] 131.8277

ses_paper_books$mean[1]+1.96*rmse_pb

## [1] 282.3917

The 95% prediction interval I calculated using the RMSE for paperback books with the ses method is from 132-282 books sold. The ses model’s 95% interval is between 139-275 books.

Confidence Interval Paperback Books Holt Model

fc_holt_pb

##    Point Forecast    Lo 80    Hi 80    Lo 95    Hi 95
## 31       209.4668 166.6035 252.3301 143.9130 275.0205
## 32       210.7177 167.8544 253.5811 145.1640 276.2715
## 33       211.9687 169.1054 254.8320 146.4149 277.5225
## 34       213.2197 170.3564 256.0830 147.6659 278.7735

fc_holt_pb$mean[1]-1.96*rmse_holt_pb

## [1] 119.017

fc_holt_pb$mean[1]+1.96*rmse_holt_pb

## [1] 299.9166

The 95% prediction interval I calculated using the RMSE for paperback books with the Holt method is from 119-299 books sold. The Holt model’s 95% interval is between 144-275 books.

Confidence Interval Hardcover Books SES Model

ses_hc_books

##    Point Forecast    Lo 80    Hi 80    Lo 95    Hi 95
## 31       239.5601 197.2026 281.9176 174.7799 304.3403
## 32       239.5601 194.9788 284.1414 171.3788 307.7414
## 33       239.5601 192.8607 286.2595 168.1396 310.9806
## 34       239.5601 190.8347 288.2855 165.0410 314.0792

ses_hc_books$mean[1]-1.96*rmse_hc

## [1] 163.6284

ses_hc_books$mean[1]+1.96*rmse_hc

## [1] 315.4918

The 95% prediction interval I calculated using the RMSE for hardcover books with the ses method is from 164-315 books sold. The ses model’s 95% interval is between 175-304 books.

Confidence Interval Harcover Books Holt Model

fc_holt_hc

##    Point Forecast    Lo 80    Hi 80    Lo 95    Hi 95
## 31       250.1739 212.7390 287.6087 192.9222 307.4256
## 32       253.4765 216.0416 290.9113 196.2248 310.7282
## 33       256.7791 219.3442 294.2140 199.5274 314.0308
## 34       260.0817 222.6468 297.5166 202.8300 317.3334

fc_holt_hc$mean[1]-1.96*rmse_holt_hc

## [1] 172.8947

fc_holt_hc$mean[1]+1.96*rmse_holt_hc

## [1] 327.453

The 95% prediction interval I calculated using the RMSE for hardcover books with the Holt method is from 172-327 books sold. The Holt model’s 95% interval is between 193-307 books.

In each case, the 95% confidence interval using the RMSE was larger than the confidence interval predicted by the model. Confidence intervals should be calculated using the standard error, which is not equivalent to the RMSE.

HA 7.7

For this exercise use data set eggs, the price of a dozen eggs in the United States from 1900-1993. Experiment with the various options in the holt() function to see how much the forecasts change with damped trend, or with a Box-Cox transformation. Try to develop an intuition of what each argument is doing to the forecasts. [Hint: use h=100 when calling holt() so you can clearly see the differences between the various options when plotting the forecasts.] Which model gives the best RMSE?

fc_holt_eggs <- holt(eggs, h=100)
fc_holt_eggs_damped <- holt(eggs, damped=TRUE, h=100)
fc_holt_eggs_damped[["model"]]

## Damped Holt's method 
## 
## Call:
##  holt(y = eggs, h = 100, damped = TRUE) 
## 
##   Smoothing parameters:
##     alpha = 0.8462 
##     beta  = 0.004 
##     phi   = 0.8 
## 
##   Initial states:
##     l = 276.9842 
##     b = 4.9966 
## 
##   sigma:  27.2755
## 
##      AIC     AICc      BIC 
## 1055.458 1056.423 1070.718

fc_holt_eggs_boxcox <- holt(eggs, lambda=0.5, h=100)


autoplot(eggs) +
  autolayer(fc_holt_eggs, series="Holt's method", PI=FALSE) +
  autolayer(fc_holt_eggs_damped, series="Damped Holt's method phi=0.8", PI=FALSE) +
  autolayer(fc_holt_eggs_boxcox, series="Box Cox", PI=FALSE) +
  ggtitle("Forecasts from Holt's method") + xlab("Year") +
  ylab("Price of a Dozen Eggs in the United States (dollars)") +
  guides(colour=guide_legend(title="Forecast"))

eggs_holt_error <- tsCV(eggs, holt, h=100)
rmse_holt_eggs <- sqrt(mean(eggs_holt_error^2, na.rm=TRUE))
rmse_holt_eggs

## [1] 277.5396

eggs_damped_holt_error <- tsCV(eggs, holt, damped=TRUE, h=100)
rmse_holt_eggs_damped <- sqrt(mean(eggs_damped_holt_error^2, na.rm=TRUE))
rmse_holt_eggs_damped

## [1] 278.5824

eggs_boxcox_error <- tsCV(eggs, holt, lambda=0.5, h=100)
rmse_boxcox_eggs<- sqrt(mean(eggs_boxcox_error^2, na.rm=TRUE))
rmse_boxcox_eggs

## [1] 119.066

It is very clear that the historical data maintains a downward trend as year goes by from 1900 to year 2000. The price from 1900 starts from around 210 and hovers around that price for about 20 years. Then it had a decrease again from 1920-1940, went up in 1940 or so, but decreased gradually with the clear downward trend ever since.

For the three transformation methods comparison, Holt’s method predicts the deepest downward trend. It is expected that there will be a -200 price decrease from year 2000 to 2100.

The Box Cos transformation predicts a relative modest decrease, estiamting a downward trend, with about -50 decrease from year 2000 till 2100.

HOwever, the Damped Holt’s method with phi=0.8 has a flat prediction, predicting no change in price from year 2000 till 2100.

What is noteworthy is that the root mean square error is significantly lower from the Box Cox transformation.

HA 7.8

Recall your retail time series data (from Exercise 3 in Section 2.10). a) Why is multiplicative seasonality necessary for this series?

retaildata <- readxl::read_excel("retail.xlsx", skip=1)
myts <- ts(retaildata[,"A3349399C"],frequency=12, start=c(1982,4))
autoplot(myts) + ylab("Retail Clothing Sales") + ggtitle("New South Wales - Clothing Sales")

As we can see from the trend line, the seasonal variation is increasing as years go by. The variation is not stable year after year at all. Therefore, multiplicative seasonality is necessary because the seasonal variation changes.

HA7.8b) Apply Holt-Winters’ multiplicative method to the data. Experiment with making the trend damped.

retail_hw <- hw(myts,seasonal="multiplicative", h=10)
retail_hw_damped <- hw(myts,seasonal="multiplicative", damped=TRUE, h=10)
autoplot(myts) +
  autolayer(retail_hw, series="HW multiplicative forecasts", PI=FALSE) +
  autolayer(retail_hw_damped, series="Damped HW multiplicative forecasts", PI=FALSE) +
  xlab("Year") +
  ylab("Retail Clothing Sales") +
  ggtitle("New South Wales - Clothing Sales") +
  guides(colour=guide_legend(title="Forecast"))

HA7.8c) Compare the RMSE of the one-step forecasts from the two methods. Which do you prefer?

retailhw_error <- tsCV(myts, hw, seasonal="multiplicative", h=1)
rmse_retailhw <- sqrt(mean(retailhw_error^2, na.rm=TRUE))
rmse_retailhw

## [1] 14.18924

retailhw_error_damped <- tsCV(myts, hw, seasonal="multiplicative", damped=TRUE, h=1)
rmse_retailhw_damped<- sqrt(mean(retailhw_error_damped^2, na.rm=TRUE))
rmse_retailhw_damped

## [1] 14.4676

The root mean square error is slightly lower for the Holt Winter method (14.18) that is not damped, compared with the just multiplicative forcast (14.46). But the difference is not very pronouced, therefore, I would not conclude that I prefer one over another just based on RSME.

7.8 d) Check that the residuals from the best method look like white noise.

checkresiduals(retailhw_error_damped )

## Warning in modeldf.default(object): Could not find appropriate degrees of
## freedom for this model.

The residuals clearly follow a normal distribution. Its cetered at 0, and not left skewed nor right skewed. The 95% confidence interval for the residual captures the right amount of residual data points too. So, yes, I would conclude that the resisual look like white noise from this method (multiplicative method).

7.8 e) Now find the test set RMSE, while training the model to the end of 2010. Can you beat the seasonal naive approach from Exercise 8 in Section 3.7?

myts.train <- window(myts, end=c(2010,12))
myts.test <- window(myts, start=2011)
fc_naive <- snaive(myts.train)
accuracy(fc_naive,myts.test)

##                     ME     RMSE      MAE      MPE     MAPE     MASE      ACF1
## Training set  9.007207 21.13832 16.58859 4.224080 7.494415 1.000000 0.5277855
## Test set     10.362500 21.50499 18.99583 2.771495 5.493632 1.145115 0.7420700
##              Theil's U
## Training set        NA
## Test set     0.3223094

While training the odel to the end of 2010 and using the naive method, the RMSE of the test set is 21.5. Yes, the damped multiplicative method do beat the naive method, because its RMSE is about 14, much lower than the naive model

fc_hw <- hw(myts.train,seasonal="multiplicative")
accuracy(fc_hw,myts.test)

##                      ME     RMSE       MAE        MPE     MAPE      MASE
## Training set -0.4181467 12.62076  9.483714 -0.5602421 4.615519 0.5717011
## Test set     -5.6475034 14.47224 10.911929 -1.8801266 3.191901 0.6577973
##                    ACF1 Theil's U
## Training set 0.06436742        NA
## Test set     0.54835802 0.2212264

Uing the Holt’s winter model, the RMSE for the test set is 14.5, which is lower than the naive approach. So we can conclude that the Holt’s Winter model is a better predictor than the naive model.

HA 7.9

For the same retail data, try an STL decomposition applied to the Box-Cox transformed series, followed by ETS on the seasonally adjusted data. How does that compare with your best previous forecasts on the test set?

lambda_retail <- BoxCox.lambda(myts.train)
boxcox_retail <- BoxCox(myts.train,lambda_retail)
boxcox_retail_test <- BoxCox(myts.test,lambda_retail)
autoplot(BoxCox(myts.train,lambda_retail)) +  ggtitle("Box Cox Transformation of Retail Clothing Sales in New South Wales")

library(seasonal)
boxcox_ts = ts(boxcox_retail[1:138,1], start=c(1982,4), end=c(2010, 12), frequency = 12)
stl_retail <- stl(boxcox_ts, s.window="periodic", robust=TRUE)
autoplot(stl_retail)+  ggtitle("STL Decomposition of Box Cox Transformation of Retail Clothing Sales in New South Wales")

seasadj_retail <- seasadj(stl_retail)

fit <- ets(seasadj_retail, lambda=0)
fc<- forecast(fit)
fvar <- ((BoxCox(fc$upper,fit$lambda) -
    BoxCox(fc$lower,fit$lambda))/qnorm(0.975)/2)^2


accuracy(fvar,boxcox_retail_test)

##                ME     RMSE      MAE      MPE     MAPE      ACF1 Theil's U
## Test set 7.010702 7.014896 7.010702 99.98535 99.98535 0.2301583   27.1074

The RMSE is 7.0. This is lower than the RMSE of other models that was produced before, actually much lower (previously best RSME is at 14.4 for multiplicative model , and 21.5 for naive model). So I will confidently conclude that the seasonal adjusted model using box cox transformation is the best model so far, which do a good job to predict the retail clothing sales in New South Wales.