DATA624 HW5

Do exercises 8.1, 8.5, 8.6, 8.7, 8.8, 8.9 in Hyndman.

Ex 8.1

Consider the the number of pigs slaughtered in Victoria, available in the aus_livestock dataset. a. Use the ETS() function to estimate the equivalent model for simple exponential smoothing. Find the optimal values of α and ℓ0, and generate forecasts for the next four months. b. Compute a 95% prediction interval for the first forecast using ^y±1.96s where s is the standard deviation of the residuals. Compare your interval with the interval produced by R.

library(fpp3)

#Part a
fit <- aus_livestock |>
  filter(Animal=='Pigs', State=='Victoria')|>
  model(ETS(Count ~ error("A") + trend("N") + season("N")))

report(fit)

## Series: Count 
## Model: ETS(A,N,N) 
##   Smoothing parameters:
##     alpha = 0.3221247 
## 
##   Initial states:
##      l[0]
##  100646.6
## 
##   sigma^2:  87480760
## 
##      AIC     AICc      BIC 
## 13737.10 13737.14 13750.07

fc <- fit |>
   forecast(h = 4)

fc |>
  autoplot(aus_livestock) +
  labs(y="Count", title="Number of pigs slaughtered in Victoria") +
  guides(colour = "none")

#Part b
s<-sd(augment(fit)$.resid)

upper_limit_95<-fc$.mean[1]+(s*1.96)
lower_limit_95<-fc$.mean[1]-(s*1.96)

upper_limit_95

## [1] 113502.1

lower_limit_95

## [1] 76871.01

fc|>hilo()|>pull(-1)

## <hilo[4]>
## [1] [76854.79, 113518.3]95 [75927.17, 114445.9]95 [75042.22, 115330.9]95
## [4] [74194.54, 116178.6]95

\(\alpha\) = 0.3221247 while \(l_{0}\) = 100646.60. The computed interval is [76871.01, 113502.1] whilst the interval calculated by R is [76854.79, 113518.3]. They are very close.

Ex 8.5

Data set global_economy contains the annual Exports from many countries. Select one country to analyse.

Plot the Exports series and discuss the main features of the data.

china_exports <- global_economy|>
  filter(Country=='China')

autoplot(china_exports, Exports)+
  labs(y='Annual Exports', title='China Exports')

There is no seasonal information and the general trend went up from 1960 to 2007 and then came down.

Use an ETS(A,N,N) model to forecast the series, and plot the forecasts.

fit2 <- china_exports |>
  model(ETS(Exports ~ error("A") + trend("N") + season("N")))

report(fit2)

## Series: Exports 
## Model: ETS(A,N,N) 
##   Smoothing parameters:
##     alpha = 0.9998999 
## 
##   Initial states:
##      l[0]
##  4.308241
## 
##   sigma^2:  3.7396
## 
##      AIC     AICc      BIC 
## 315.9713 316.4157 322.1526

fc2 <- fit2 |>
   forecast(h = 5)

fc2 |>
  autoplot(china_exports) +
  labs(y="Exports", title="China Exports with Forecast") +
  guides(colour = "none")

Compute the RMSE values for the training data.

fit2|>accuracy()

The RMSE is 1.9.

Compare the results to those from an ETS(A,A,N) model. (Remember that the trended model is using one more parameter than the simpler model.) Discuss the merits of the two forecasting methods for this data set.

compare<-china_exports|>
  model(
    ANN=ETS(Exports ~ error("A") + trend("N") + season("N")),
    AAN=ETS(Exports ~ error("A") + trend("A") + season("N"))
  )

accuracy(compare)

It is interesting to find out that RMSE of AAN is larger than ANN’s. It means ANN model is better than AAN. The trend element doesn’t matter and make the model worse.

Compare the forecasts from both methods. Which do you think is best?

compare |>
  forecast(h=5) |> 
  autoplot(china_exports, level=NULL) +
  labs(title="Forecast Comparison",
       subtitle = "China Exports")

The ANN model looks better than the AAN model.

Calculate a 95% prediction interval for the first forecast for each model, using the RMSE values and assuming normal errors. Compare your intervals with those produced using R.

s2<-sd(augment(fit2)$.resid)

upper_limit_952<-fc2$.mean[1]+(s2*1.96)
lower_limit_952<-fc2$.mean[1]-(s2*1.96)

upper_limit_952

## [1] 23.47713

lower_limit_952

## [1] 16.03761

fc2|>hilo()|>pull(-1)

## <hilo[5]>
## [1] [15.96718, 23.54756]95 [14.39750, 25.11724]95 [13.19301, 26.32174]95
## [4] [12.17756, 27.33718]95 [11.28292, 28.23182]95

The calculated lower and upper limit is [16.03761, 23.47713] whilst R calculation is [15.96718, 23.54756]. The variance is 0.07 or 0.14 for the range, in which the calculated range is smaller.

Ex 8.6

Forecast the Chinese GDP from the global_economy data set using an ETS model. Experiment with the various options in the ETS() function to see how much the forecasts change with damped trend, or with a Box-Cox transformation. Try to develop an intuition of what each is doing to the forecasts.

china_gdp <- global_economy|>
  filter(Country=='China')

#Obtain an optimal lambda for Box Cox
lambda <- china_gdp %>%
  features(GDP, features = guerrero) %>%
  pull(lambda_guerrero)

fit_gdp <- china_gdp |>
  model(
    `Holt's method`=ETS(GDP ~ error("A") + trend("A") + season("N")),
    `Damped Holt's method`=ETS(GDP ~ error("A") + trend("Ad", phi=0.9) + season("N")),
    `Holt's method w/boxcox`=ETS(box_cox(GDP,lambda) ~ error("A") + trend("A") + season("N")),
    `Damped Holt's method w/boxcox`=ETS(box_cox(GDP,lambda) ~ error("A") + trend("Ad", phi=0.9) + season("N")))

report(fit_gdp)

accuracy(fit_gdp)

fc_gdp <- fit_gdp |>
   forecast(h = 25)

fc_gdp |>
  autoplot(china_gdp, level=NULL) + 
  labs(y='GDP', title='China GDP with 25 years forecast') +
  guides(colour = guide_legend(title = 'Forecast'))

Holt’s method with box cox transformation over-forecasts and doesn’t make sense at all. The other three are more reasonable. The dampened Holt’s method is the closet to the reality.

Ex 8.7

Find an ETS model for the Gas data from aus_production and forecast the next few years. Why is multiplicative seasonality necessary here? Experiment with making the trend damped. Does it improve the forecasts?

autoplot(aus_production, Gas)

fit_gas <- aus_production |>
  model(
    `Holt's method A`=ETS(Gas ~ error("A") + trend("A") + season("A")),
    `Damped Holt's method A`=ETS(Gas ~ error("A") + trend("Ad", phi=0.9) + season("A")),
    `Holt's method M`=ETS(Gas ~ error("M") + trend("A") + season("M")),
    `Damped Holt's method M`=ETS(Gas ~ error("M") + trend("Ad", phi=0.9) + season("M")),)

report(fit_gas)

accuracy(fit_gas)

fc_gas <- fit_gas |>
   forecast(h = 30)

fc_gas |>
  autoplot(aus_production, level=NULL) +
  labs(y='Gas', title='Australia Production') +
  guides(colour = guide_legend(title = 'Forecast'))

Visually, I don’t see much different between all four models. However, from AIC value and sigma2, the multiplicative seasonality perform better than the additive one. The RSME look similar though.

Ex 8.8

Recall your retail time series data (from Exercise 8 in Section 2.10).

set.seed(123456)

myseries <- aus_retail %>%
  filter(`Series ID` == sample(aus_retail$`Series ID`,1))

myseries

Why is multiplicative seasonality necessary for this series? Apply Holt-Winters’ multiplicative method to the data. Experiment with making the trend damped. Compare the RMSE of the one-step forecasts from the two methods. Which do you prefer?

fit_myseries <- myseries |>
  model(
    `Holt's method M`=ETS(Turnover ~ error("M") + trend("A") + season("M")),
    `Damped Holt's method M`=ETS(Turnover ~ error("M") + trend("Ad", phi=0.9) + season("M")),)

report(fit_myseries)

accuracy(fit_myseries)

fc_myseries <- fit_myseries |>
   forecast(h = 24)

fc_myseries |>
  autoplot(myseries, level=NULL) +
  labs(y='Turnover', title='My Retail Series') +
  guides(colour = guide_legend(title = 'Forecast'))

The dampened ETS is better with smaller RMSE and AIC.

Check that the residuals from the best method look like white noise.

fit_myseries|>select(`Damped Holt's method M`)|>gg_tsresiduals()

One or more large spikes are outside the bounds of ACF, or if substantially more than 5% of spikes are outside these bounds, then the series is probably not white noise.

Now find the test set RMSE, while training the model to the end of 2010. Can you beat the seasonal naïve approach from Exercise 7 in Section 5.11?

myseries_train <- myseries %>%
  filter(year(Month) < 2011)

fit_myseries_train <- myseries_train |>
  model(
    `SNAIVE`=SNAIVE(Turnover),
    `Holt's method M`=ETS(Turnover ~ error("M") + trend("A") + season("M")),
    `Damped Holt's method M`=ETS(Turnover ~ error("M") + trend("Ad", phi=0.9) + season("M")),)

report(fit_myseries_train)

accuracy(fit_myseries_train)|>pull(6,3)

##                 SNAIVE        Holt's method M Damped Holt's method M 
##               26.21614               12.51802               12.03102

According to RMSE, the best model is Damped Holt’s method which beat SNAIVE model.

Ex 8.9

For the same retail data, try an STL decomposition applied to the Box-Cox transformed series, followed by ETS on the seasonally adjusted data. How does that compare with your best previous forecasts on the test set?

lambda_t <- myseries_train %>%
  features(Turnover, features = guerrero) %>%
  pull(lambda_guerrero)

fit_myseries_train2 <- myseries_train |>
  model(
    `STL decomposition w/ box_cox`= STL(box_cox(Turnover, lambda_t) ~ trend(window = 21)+season(window = "periodic"), robust = TRUE),
    `Holt's method M`= ETS(box_cox(Turnover, lambda_t) ~ error("M") + trend("A") + season("M")),
    `Damped Holt's method M`= ETS(Turnover ~ error("M") + trend("Ad", phi=0.9) + season("M")))

accuracy(fit_myseries_train2)|>pull(6,3)

## STL decomposition w/ box_cox              Holt's method M 
##                     11.00559                     11.56685 
##       Damped Holt's method M 
##                     12.03102

Surprisingly, the STL decomposition with box_cox transformation has the lowest RMSE, which is then the best model.