DATA 624 Homework 5

Author

Henock Montcho

Published

May 11, 2025

1-) Consider the the number of pigs slaughtered in Victoria, available in the aus_livestock dataset.

a. Use the ETS() function to estimate the equivalent model for simple exponential smoothing. Find the optimal values of α and ℓ0, and generate forecasts for the next four months.

library(fpp3)
library(ggplot2)

#Use the ETS() function to estimate the equivalent model for simple exponential smoothing
fit <- aus_livestock  |>
  filter(State == "Victoria",
         Animal == "Pigs")  |>
  model(ETS(Count ~ error("A") + trend("N") + season("N"))) 

#Generate forecasts for the next four months
fc <- fit  |>
  forecast(h = 4)

fc  |>
  autoplot(aus_livestock) +
  ggtitle("Number of Pigs Slaughtered in Victoria")

#Find the optimal values of α and ℓ0
report(fit)
Series: Count 
Model: ETS(A,N,N) 
  Smoothing parameters:
    alpha = 0.3221247 

  Initial states:
     l[0]
 100646.6

  sigma^2:  87480760

     AIC     AICc      BIC 
13737.10 13737.14 13750.07 

The optimal values are:

  • α=0.3221247

  • ℓ0=100646.6

These are the forecasts for the next four months:

fc
# A fable: 4 x 6 [1M]
# Key:     Animal, State, .model [1]
  Animal State    .model                                                   Month
  <fct>  <fct>    <chr>                                                    <mth>
1 Pigs   Victoria "ETS(Count ~ error(\"A\") + trend(\"N\") + season(… 2019 janv.
2 Pigs   Victoria "ETS(Count ~ error(\"A\") + trend(\"N\") + season(… 2019 févr.
3 Pigs   Victoria "ETS(Count ~ error(\"A\") + trend(\"N\") + season(…  2019 mars
4 Pigs   Victoria "ETS(Count ~ error(\"A\") + trend(\"N\") + season(…  2019 avr.
# ℹ 2 more variables: Count <dist>, .mean <dbl>

b. Compute a 95% prediction interval for the first forecast using y^±1.96s where s is the standard deviation of the residuals. Compare your interval with the interval produced by R.

s <- residuals(fit)$.resid  |>
  sd()
y_hat <- fc$.mean[1]

#Computing y_hat ± 1.96s
lower_limit <- y_hat - 1.96*s
upper_limit <- 1.96*s + y_hat
cat("The confidence interval with y_hat ± 1.96s is [",lower_limit, ", ", upper_limit,"]\n")
The confidence interval with y_hat ± 1.96s is [ 76871.01 ,  113502.1 ]
#Interval produced by R
fc  |>
  hilo(95)  |>
  pull('95%')  |>
  head(1)
<hilo[1]>
[1] [76854.79, 113518.3]95

Comment: the 95% prediction interval for the first forecast using y^±1.96s is slightly different from the one produced by R: higher lower limit and smaller upper limit.

2-) Data set global_economy contains the annual Exports from many countries. Select one country to analyse.

a. Plot the Exports series and discuss the main features of the data.

global_economy  |>
  filter(Country == "Benin")  |>
  autoplot(Exports) +
  labs(y="% of GDP", title="Exports: Benin")

Comment: Overall, there is an increasing trend throughout the years with no apparent seasonality. There is an important increase for almost a decade after 1960; when the country became independent from France. From there, the increase has been a steady trend from 1970 through 1990. Same observation from 1991 through 2010.

b. Use an ETS(A,N,N) model to forecast the series, and plot the forecasts.

fit_bj <- global_economy  |>
  filter(Country == "Benin")  |>
  model(ETS(Exports ~ error("A") + trend("N") + season("N"))) 

fc_bj <- fit_bj  |>
  forecast(h = 5)

fc_bj  |>
  autoplot(global_economy) +
  labs(y="% of GDP", title="Exports: Benin", subtitle = "ETS(A,N,N)")

c. Compute the RMSE values for the training data.

sbj <- accuracy(fit_bj)$RMSE
sbj
[1] 2.687801

d. Compare the results to those from an ETS(A,A,N) model. (Remember that the trended model is using one more parameter than the simpler model.) Discuss the merits of the two forecasting methods for this data set.

fit_bj_2 <- global_economy  |>
  filter(Country == "Benin")  |>
  model(ETS(Exports ~ error("A") + trend("A") + season("N")))

fc_bj_2 <- fit_bj_2  |>
  forecast(h = 5)

fc_bj_2  |>
  autoplot(global_economy) +
  labs(y="% of GDP", title="Exports: Benin", subtitle = "ETS(A,A,N)")

sbj2 <- accuracy(fit_bj_2)$RMSE
sbj2
[1] 2.676677

Comment:

  • ETS(A,A,N) Merits:

    • Captures Trend: This model includes an additive trend component, making it suitable for time series data with a linear trend. It can effectively model changes in the level of the series over time.

    • Flexibility: The additive trend allows for more flexibility in forecasting, as it can adapt to changes in the trend over time.

    • Benchmark for Trended Series: Due to its ability to capture trends, ETS(A,A,N) is often considered a good benchmark for trended time series.

  • ETS(A,N,N) Merits:

    • Simplicity: This model is simpler as it only includes an additive error component and no trend or seasonality. It is easy to implement and understand. For our data set, there is an observed trend and not capturing it in our model, would make the forecasting less efficient.

    • Suitable for Stable Series: ETS(A,N,N) is ideal for time series data that do not exhibit any trend or seasonal patterns. It works well for stable series where the level remains relatively constant over time. There is a trend in our data set.

    • Efficient: The simplicity of the model makes it computationally efficient, which can be advantageous when dealing with large datasets or requiring quick forecasts.

e. Compare the forecasts from both methods. Which do you think is best?

The RMSE is lower for the ETS(A,A,N), which suggests that this is model forecasts better. I believe that the ETS(A,A,N) method is better since it captures the increasing trend in the data and has a lower RMSE.

f. Calculate a 95% prediction interval for the first forecast for each model, using the RMSE values and assuming normal errors. Compare your intervals with those produced using R.

- The confidence interval using the ETS(A,N,N) method:

s_bj <- residuals(fit_bj)$.resid  |>
  sd()
y_hat_bj <- fc_bj$.mean[1]

#Computing y_hat_bj ± 1.96s
lower_limit_bj <- y_hat_bj - 1.96*s_bj
upper_limit_bj <- 1.96*s_bj + y_hat_bj
cat("The confidence interval with y_hat_bj ± 1.96s is [",lower_limit_bj, ", ", upper_limit_bj,"]\n")
The confidence interval with y_hat_bj ± 1.96s is [ 21.76137 ,  32.26552 ]
#Interval produced by R

fc_bj  |>
  hilo(95)  |>
  pull('95%')  |>
  head(1)
<hilo[1]>
[1] [21.65221, 32.37468]95

- The confidence interval using the ETS(A,A,N) method:

s_bj_2 <- residuals(fit_bj_2)$.resid  |>
  sd()
y_hat_bj_2 <- fc_bj_2$.mean[1]

#Computing y_hat_bj ± 1.96s
lower_limit_bj_2 <- y_hat_bj_2 - 1.96*s_bj_2
upper_limit_bj_2 <- 1.96*s_bj_2 + y_hat_bj_2
cat("The confidence interval with y_hat_bj_2 ± 1.96s is [",lower_limit_bj_2, ", ", upper_limit_bj_2,"]\n")
The confidence interval with y_hat_bj_2 ± 1.96s is [ 22.07896 ,  32.6629 ]
#Interval produced by R

fc_bj_2  |>
  hilo(95)  |>
  pull('95%')  |>
  head(1)
<hilo[1]>
[1] [21.93391, 32.80796]95

Comment:

- ETS(A,N,N) method vs. R method: when comparing my interval from the ETS(A,N,N) method with the one produced using R, I observed that the spread (difference between the limits of the confidence intervals) using the ETS(A,N,N) is lower than the spread using the R method (10.5 vs. 10.72).

- ETS(A,A,N) method vs. R method: when comparing my interval from the ETS(A,A,N) method with the one produced using R, I observed that the spread (difference between the limits of the confidence intervals) using the ETS(A,A,N) is lower than the spread using the R method (10.58 vs. 10.98).

3-) Forecast the Chinese GDP from the global_economy data set using an ETS model. Experiment with the various options in the ETS() function to see how much the forecasts change with damped trend, or with a Box-Cox transformation. Try to develop an intuition of what each is doing to the forecasts.

[Hint: use a relatively large value of h when forecasting, so you can clearly see the differences between the various options when plotting the forecasts.]

Let’s explore the data set

global_economy  |>
  filter(Country == "China")  |>
  autoplot(GDP) +
  labs(y=" GDP", title="GDP: China")

Comment: The Chinese GDP exhibits an upward trend mainly after 1990 without any seasonality. Therefore, using additive errors and “N” for seasonality would be appropriate.

lambda <- global_economy  |>
  filter(Country == "China")  |>
  features(GDP, features = guerrero)  |>
  pull(lambda_guerrero)

fit_ch <- global_economy  |>
  filter(Country == "China")  |>
  model(`Simple` = ETS(GDP ~ error("A") + trend("N") + season("N")),
        `Holt's method` = ETS(GDP ~ error("A") + trend("A") + season("N")),
        `Damped Holt's method` = ETS(GDP ~ error("A") + trend("Ad", phi = 0.8) + season("N")),
        `Box-Cox` = ETS(box_cox(GDP,lambda) ~ error("A") + trend("A") + season("N")),
        `Box-Cox Damped` = ETS(box_cox(GDP,lambda) ~ error("A") + trend("Ad", phi = 0.8) + season("N")),
        `Log` = ETS(log(GDP) ~ error("A") + trend("A") + season("N")),
        `Log Damped` = ETS(log(GDP) ~ error("A") + trend("Ad", phi = 0.8) + season("N"))
        ) 

fc_ch <- fit_ch  |>
  forecast(h = 20)

fc_ch  |>
  autoplot(global_economy, level = NULL) +
  labs(title="GDP: China") +
  guides(colour = guide_legend(title = "Forecast"))

Comment: Experimenting with the damping parameter phi in the damped trend models was intriguing, as it influenced the forecast’s intensity. When phi was set to 0.95, the forecasts appeared larger. To make the forecasts more realistic, I opted for phi at 0.8, considering the large values of GDP. Log and Box-Cox transformations without damping tended to over-forecast significantly as the forecast horizon increased. Both the damped log and Holt’s method with phi=0.8 produced similar effects. The damped Holt’s method without any transformation provided results that were between those of Holt’s method and the simple exponential method.

4-) Find an ETS model for the Gas data from aus_production and forecast the next few years. Why is multiplicative seasonality necessary here? Experiment with making the trend damped. Does it improve the forecasts?

fit_gas <- aus_production  |>
  model(additive = ETS(Gas ~ error("A") + trend("A") + season("A")),
        multiplicative = ETS(Gas ~ error("M") + trend("A") + season("M")),
        `damped multiplicative` = ETS(Gas ~ error("M") + trend("Ad", phi = 0.9) + season("M"))) 

aus_production  |>
   model(additive = ETS(Gas ~ error("A") + trend("A") + season("A")),
        multiplicative = ETS(Gas ~ error("M") + trend("A") + season("M")))  |>
   forecast(h=20)  |>
  autoplot(aus_production, level = NULL) +
  labs(title="Australian Gas Production") +
  guides(colour = guide_legend(title = "Forecast"),
         subtitle="Additive vs. Multiplicative Seasonality")

aus_production  |>
  model(multiplicative = ETS(Gas ~ error("M") + trend("A") + season("M")),
        `damped multiplicative` = ETS(Gas ~ error("M") + trend("Ad", phi = 0.9) + season("M"))
        )  |>
  forecast(h=20)  |>
  autoplot(aus_production, level= NULL) +
  labs(title="Australian Gas Production") +
  guides(colour = guide_legend(title = "Forecast"),
         Subtitle = "Additive vs Damped Trend")

report(fit_gas)
# A tibble: 3 × 9
  .model                  sigma2 log_lik   AIC  AICc   BIC   MSE  AMSE    MAE
  <chr>                    <dbl>   <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>  <dbl>
1 additive              23.6       -927. 1872. 1873. 1903.  22.7  29.7 3.35  
2 multiplicative         0.00324   -831. 1681. 1682. 1711.  21.1  32.2 0.0413
3 damped multiplicative  0.00340   -835. 1688. 1689. 1719.  21.0  32.4 0.0424

Comment: Multiplicative seasonality is essential in this case because the seasonal pattern’s variation seems to be proportional to the time series level. As the trend increases, the amplitude of the seasonal pattern also grows.

accuracy(fit_gas)
# A tibble: 3 × 10
  .model            .type       ME  RMSE   MAE    MPE  MAPE  MASE RMSSE     ACF1
  <chr>             <chr>    <dbl> <dbl> <dbl>  <dbl> <dbl> <dbl> <dbl>    <dbl>
1 additive          Trai…  0.00525  4.76  3.35 -4.69  10.9  0.600 0.628  0.0772 
2 multiplicative    Trai… -0.115    4.60  3.02  0.199  4.08 0.542 0.606 -0.0131 
3 damped multiplic… Trai…  0.255    4.58  3.04  0.655  4.15 0.545 0.604 -0.00451

Comment: Although the RMSE improved with the damped multiplicative method, the AIC slightly worsened with each method. When deciding whether dampening enhanced the model, you can base your judgment on either the RMSE or the AIC, AICc, and BIC.

Since all three methods have the same number of parameters to estimate, using RMSE for model comparison is appropriate. This comparison shows that the damped multiplicative method saw a slight improvement.

5-) Recall your retail time series data (from Exercise 7 in Section 2.10).

a. Why is multiplicative seasonality necessary for this series?

Multiplicative seasonality is essential in this context because the seasonal pattern’s variation seems to be proportional to the time series level. As the trend increases over time, the amplitude of the seasonality also grows.

b. Apply Holt-Winters’ multiplicative method to the data. Experiment with making the trend damped.

set.seed(1234)
myseries <- aus_retail  |>
  filter(`Series ID` == sample(aus_retail$`Series ID`,1)) 

fit_my <- myseries  |>
  model(multiplicative = ETS(Turnover ~ error("M") + trend("A") + season("M")),
        `damped multiplicative` = ETS(Turnover ~ error("M") + trend("Ad") + season("M"))) 

fit_my  |>
  forecast(h=36)  |>
  autoplot(myseries, level = NULL) +
  labs(title="Australian Retail Turnover") +
  guides(colour = guide_legend(title = "Forecast"))

Comment: The damped method appears to be more stable, whereas the multiplicative method shows a greater increase in trend and results in higher forecasts.

c. Compare the RMSE of the one-step forecasts from the two methods. Which do you prefer?

accuracy(fit_my)  |>
  select(.model, RMSE)
# A tibble: 2 × 2
  .model                 RMSE
  <chr>                 <dbl>
1 multiplicative         1.34
2 damped multiplicative  1.36

Comment: The RMSE is slightly lower for the non-damped method, making it the preferred choice.

d. Check that the residuals from the best method look like white noise.

myseries  |>
  model(multiplicative = ETS(Turnover ~ error("M") + trend("A") + season("M")))  |>
  gg_tsresiduals() +
  ggtitle("Multiplicative Method")

#Box-Pierce test, ℓ=2m for seasonal data, m=12
myseries  |>
  model(multiplicative = ETS(Turnover ~ error("M") + trend("A") + season("M")))  |>
  augment()  |>
  features(.innov, box_pierce, lag = 24, dof = 0)
# A tibble: 1 × 5
  State    Industry                                     .model bp_stat bp_pvalue
  <chr>    <chr>                                        <chr>    <dbl>     <dbl>
1 Tasmania Cafes, restaurants and takeaway food servic… multi…    27.2     0.296
#Ljung-Box test
myseries  |>
  model(multiplicative = ETS(Turnover ~ error("M") + trend("A") + season("M")))  |>
  augment()  |>
  features(.innov, ljung_box, lag = 24, dof = 0)
# A tibble: 1 × 5
  State    Industry                                     .model lb_stat lb_pvalue
  <chr>    <chr>                                        <chr>    <dbl>     <dbl>
1 Tasmania Cafes, restaurants and takeaway food servic… multi…    28.0     0.260

Comment: For both tests, the results are not significant at the 0.05 level, indicating that the residuals are indistinguishable from a white noise series. The residuals indeed resemble white noise.

e. Now find the test set RMSE, while training the model to the end of 2010. Can you beat the seasonal naïve approach from Exercise 7 in Section 5.11?

myseries_train <- myseries  |>
  filter(year(Month) < 2011)

fit_train <- myseries_train  |>
  model(multi = ETS(Turnover ~ error("M") + trend("A") + season("M")),
        snaive = SNAIVE(Turnover))

#producing forecasts
fc <- fit_train  |>
  forecast(new_data = anti_join(myseries, myseries_train))

fc  |>
  autoplot(myseries, level = NULL)

accuracy(fit_train)  |>
  select(.type, .model, RMSE)
# A tibble: 2 × 3
  .type    .model  RMSE
  <chr>    <chr>  <dbl>
1 Training multi   1.18
2 Training snaive  2.90
fc  |>
accuracy(myseries)  |>
  select(.type, .model, RMSE)
# A tibble: 2 × 3
  .type .model  RMSE
  <chr> <chr>  <dbl>
1 Test  multi   3.99
2 Test  snaive  9.13

Comment: The multiplicative method appears to forecast the data more effectively. Its RMSE is significantly lower than that of the seasonal naïve approach, indicating that the multiplicative method is more suitable.

6-) For the same retail data, try an STL decomposition applied to the Box-Cox transformed series, followed by ETS on the seasonally adjusted data. How does that compare with your best previous forecasts on the test set?

lambda <- myseries_train  |>
  features(Turnover, guerrero)  |>
  pull(lambda_guerrero)

#stl decomp applied to the box cox transformed data
myseries_train  |>
  model(STL(box_cox(Turnover,lambda) ~ season(window = "periodic"), robust = TRUE))  |>
  components()  |>
  autoplot() +
  ggtitle("STL with Box-Cox")

#computed the seasonally adjusted data , stl decomp applied to the box cox transformed data
dcmp <- myseries_train  |>
  model(STL_box = STL(box_cox(Turnover,lambda) ~ season(window = "periodic"), robust = TRUE))  |>
  components()

#replacing turnover with the seasonally adjusted data
myseries_train$Turnover <- dcmp$season_adjust

#modeling on the seasonally adjusted data
fit <- myseries_train  |>
  model(ETS(Turnover ~ error("M") + trend("A") + season("M")))

#checking the residuals
fit  |>
  gg_tsresiduals()  +
  ggtitle("Residual Plots for Australian Retail Turnover")

#produce forecasts for test data
fc <- fit  |>
  forecast(new_data = anti_join(myseries, myseries_train))

fit  |>
  accuracy()  |>
  select(.model, .type, RMSE)
# A tibble: 1 × 3
  .model                                                        .type     RMSE
  <chr>                                                         <chr>    <dbl>
1 "ETS(Turnover ~ error(\"M\") + trend(\"A\") + season(\"M\"))" Training 0.141
fc  |>
  accuracy(myseries)  |>
  select(.model, .type, RMSE)
# A tibble: 1 × 3
  .model                                                        .type  RMSE
  <chr>                                                         <chr> <dbl>
1 "ETS(Turnover ~ error(\"M\") + trend(\"A\") + season(\"M\"))" Test   21.0
lambda <- myseries  |>
  features(Turnover, features = guerrero)  |>
  pull(lambda_guerrero)

#stl decomp applied to the box cox transformed data
myseries  |>
  model(STL(box_cox(Turnover,lambda) ~ season(window = "periodic"), robust = TRUE))  |>
  components()  |>
  autoplot() +
  ggtitle("STL with Box-Cox")

#computed the seasonally adjusted data , stl decomp applied to the box cox transformed data
dcmp <- myseries  |>
  model(STL_box = STL(box_cox(Turnover,lambda) ~ season(window = "periodic"), robust = TRUE))  |>
  components()

#replacing turnover with the seasonally adjusted data
myseries$Turnover_sa <- dcmp$season_adjust

myseries_train <- myseries  |>
  filter(year(Month) < 2011)

#modeling on the seasonally adjusted data
fit <- myseries_train  |>
  model(ETS(Turnover_sa ~ error("M") + trend("A") + season("M")))

#checking the residuals
fit  |>
  gg_tsresiduals()  +
  ggtitle("Residual Plots for Australian Retail Turnover")

#Box-Pierce test, ℓ=2m for seasonal data, m=12
myseries  |>
  model(multiplicative = ETS(Turnover_sa ~ error("M") + trend("A") + season("M")))  |>
  augment()  |>
  features(.innov, box_pierce, lag = 24, dof = 0)
# A tibble: 1 × 5
  State    Industry                                     .model bp_stat bp_pvalue
  <chr>    <chr>                                        <chr>    <dbl>     <dbl>
1 Tasmania Cafes, restaurants and takeaway food servic… multi…    36.3    0.0511
#Ljung-Box test
myseries  |>
  model(multiplicative = ETS(Turnover_sa ~ error("M") + trend("A") + season("M")))  |>
  augment()  |>
  features(.innov, ljung_box, lag = 24, dof = 0)
# A tibble: 1 × 5
  State    Industry                                     .model lb_stat lb_pvalue
  <chr>    <chr>                                        <chr>    <dbl>     <dbl>
1 Tasmania Cafes, restaurants and takeaway food servic… multi…    37.4    0.0395
#produce forecasts for test data
fc <- fit  |>
  forecast(new_data = anti_join(myseries, myseries_train))

fit  |>
  accuracy()  |>
  select(.model, .type, RMSE)
# A tibble: 1 × 3
  .model                                                           .type    RMSE
  <chr>                                                            <chr>   <dbl>
1 "ETS(Turnover_sa ~ error(\"M\") + trend(\"A\") + season(\"M\"))" Traini… 0.139
fc  |>
  accuracy(myseries)  |>
  select(.model, .type, RMSE)
# A tibble: 1 × 3
  .model                                                           .type  RMSE
  <chr>                                                            <chr> <dbl>
1 "ETS(Turnover_sa ~ error(\"M\") + trend(\"A\") + season(\"M\"))" Test  0.193

Comment: Using this model, the residuals do not resemble white noise. However, the RMSE is significantly improved on the test data.