Exercises:
8.1 Consider the the number of pigs slaughtered in Victoria, available in the aus_livestock dataset.
8.5 Data set global_economy contains the annual Exports from many countries. Select one country to analyse.
8.6 Forecast the Chinese GDP from the global_economy data set using an ETS model. Experiment with the various options in the ETS() function to see how much the forecasts change with damped trend, or with a Box-Cox transformation. Try to develop an intuition of what each is doing to the forecasts.
8.7 Find an ETS model for the Gas data from aus_production and forecast the next few years. Why is multiplicative seasonality necessary here? Experiment with making the trend damped. Does it improve the forecasts?
8.8 Recall your retail time series data (from Exercise 7 in Section 2.10).
8.9 For the same retail data, try an STL decomposition applied to the Box-Cox transformed series, followed by ETS on the seasonally adjusted data. How does that compare with your best previous forecasts on the test set?

#Import needed libraries

library(tsibbledata) #to use the time series data in it for the exercises.
library(tsibble) # to use datasets and function as_tsibble
library(tibble) # to use view function
library(ggplot2)
library(feasts) # to use the functions for graphics like autoplot()

library(readr) # to uses read_csv function
library(dplyr) # to use Filter, mutate, arrange function etc
library(tidyr) # to use pivot_longer function

library(fpp3)  # to use us_gasoline dataset 
library(seasonal) # X-13ARIMA-SEATS decomposition
library(feasts)

library(corrplot)

Exercises:

8.1 Consider the the number of pigs slaughtered in Victoria, available in the aus_livestock dataset.

Use the ETS() function to estimate the equivalent model for simple exponential smoothing. Find the optimal values of α and ℓ0, , and generate forecasts for the next four months.
Compute a 95% prediction interval for the first forecast where s is the standard deviation of the residuals. Compare your interval with the interval produced by R.

Answer: a) I used the ETS() function to estimate the equivalent model for simple exponential smoothing. The optimal value of α is 0.322 and ℓ0 is 100646.6.The 4 month forecast is shown below.

Comparing the interval with the R’s prediction intervals, i see that there is a difference in Bottom of 23.78 and Top of 16.2 . The manual calculation only accounts for the historical error and assumes constant variance, while R’s approach is more comprehensive. R’s prediction intervals are typically wider because they account for additional sources of uncertainty.

“Calculated Prediction Interval: [76871.012, 113502.102]”

“R’s prediction intervals:[76854.7888896402, 113518.325972343]95”

# Load the dataset
victoria_pigs <- aus_livestock |>
filter(State == "Victoria" & Animal == "Pigs")

# victoria_pigs <- victoria_pigs |>  filter_index("1980 Jan" ~ .)


# Plot the original data
victoria_pigs |>
autoplot(Count) +
  labs(title= "Victoria Pigs production in Australia for human consumption", y = "Number of Victoria Pigs slaughtered")+
  scale_y_continuous(labels = scales::comma)

# Use the ETS() function to estimate the equivalent model for simple exponential smoothing
fit <- victoria_pigs |>
  model(SES = ETS(Count ~ error("A") + trend("N") + season("N")))

# find the optimal value of α  and  ℓ0
fit |> coef()  |> mutate(estimate = round(estimate,3))

## # A tibble: 2 × 5
##   Animal State    .model term    estimate
##   <fct>  <fct>    <chr>  <chr>      <dbl>
## 1 Pigs   Victoria SES    alpha      0.322
## 2 Pigs   Victoria SES    l[0]  100647.

#Generate forecasts for the next four months
fc <- fit |>
  forecast(h = 4)
  

fc |>
  autoplot(victoria_pigs, LEVL = NULL) +
  
  labs(y="Number of Victoria Pigs slaughtered", title="Forecast using SES for Victoria Pigs production in Australia for human consumption") +
  guides(colour = guide_legend(title = "Forecast"))

#Compute a 95% prediction interval for the first forecast

# to find the variance which is sigma^2
report(fit)

## Series: Count 
## Model: ETS(A,N,N) 
##   Smoothing parameters:
##     alpha = 0.3221247 
## 
##   Initial states:
##      l[0]
##  100646.6
## 
##   sigma^2:  87480760
## 
##      AIC     AICc      BIC 
## 13737.10 13737.14 13750.07

# to calculate the standard deviation using the residuals
sd <- augment(fit) |>
  pull(.resid) |>
  sd()

# to calculate the prediction interval of the first forecast
upper <- round(fc$.mean[1] + 1.96 * sd, 3)
lower <- round(fc$.mean[1] - 1.96 * sd, 3)

# to predict the prediction interval
r_interval <- fc |> hilo(95) |> pull('95%') |> head(1)

# Compare your interval with the interval produced by R.
paste0("Calculated Prediction Interval: [", lower, ", ", upper, "]")

## [1] "Calculated Prediction Interval: [76871.012, 113502.102]"

paste0("Prediction Interval using R:", r_interval)

## [1] "Prediction Interval using R:[76854.7888896402, 113518.325972343]95"

8.5 Data set global_economy contains the annual Exports from many countries. Select one country to analyse.

a) Plot the Exports series and discuss the main features of the data.

Answer:

The time plot for country Australia indicates a upward positive trend.The seasonality is not evident, However there are fluctuations between the years, indicating a possibility of seasonality within the years.

# Extract data of interest
aus_economy <- global_economy |>
  filter(Country == "Australia") |>
  select( Exports) 

# Plot the Exports series
autoplot(aus_economy) +
  labs( title = "Australia's export of goods and services ",  y = "Exports of goods and services (% of GDP)")

b) Use an ETS(A,N,N) model to forecast the series, and plot the forecasts.

Answer:

Used an ETS(A,N,N) model to forecast the series, and plotted the forecasts.

# Use the ETS(A,N,N) model  to estimate the equivalent model for simple exponential smoothing
fit_aus_economy <- aus_economy |>
  model(SES = ETS(Exports ~ error("A") + trend("N") + season("N")))

# find the optimal value of α  and  ℓ0
fit_aus_economy |> coef()  |> mutate(estimate = round(estimate,3))

## # A tibble: 2 × 3
##   .model term  estimate
##   <chr>  <chr>    <dbl>
## 1 SES    alpha    0.566
## 2 SES    l[0]    13.0

#Generate forecasts for the next four years
fc_aus_economy <- fit_aus_economy |>
  forecast(h = 4)
  
#plot the forecasts
fc_aus_economy |>
  autoplot(aus_economy, LEVEL = NULL) +
  
  labs(y="Exports of goods and services (% of GDP)", title="Forecast using SES for Australia's export of goods and services") +
  guides(colour = guide_legend(title = "Forecast"))

## Warning in ggdist::geom_lineribbon(without(intvl_mapping, "colour_ramp"), :
## Ignoring unknown parameters: `LEVEL`

## Warning in geom_line(mapping = without(mapping, "shape"), data =
## unpack_data(object[single_row[["FALSE"]], : Ignoring unknown parameters:
## `LEVEL`

c) Compute the RMSE values for the training data.

Answer: c) The RMSE values for the training data is 1.146794

accuracy(fit_aus_economy)

## # A tibble: 1 × 10
##   .model .type       ME  RMSE   MAE   MPE  MAPE  MASE RMSSE   ACF1
##   <chr>  <chr>    <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>  <dbl>
## 1 SES    Training 0.232  1.15 0.914  1.09  5.41 0.928 0.928 0.0125

d) Compare the results to those from an ETS(A,A,N) model. (Remember that the trended model is using one more parameter than the simpler model.) Discuss the merits of the two forecasting methods for this data set.

Answer:

Comparing ETS(A,N,N) model and ETS(A,A,N) model results , i observe that ETS(A,N,N) model does not account for the trend and hence it predicts a fixed mean value or a flat forecast for all 4 years.The ETS(A,A,N) model accounts for the trend and hence predicts a increasing value for the next 4 years.

ETS(A,N,N) uses only one parameter and is hence simple.As it is robust it was be less prone to overfitting. ETS(A,A,N) is adaptable as it can adjust to gradual shifts in the underlying level.It captures trend and hence better equipped to model data with directional movement.

# Use the ETS(A,A,N) model  to estimate the equivalent model for simple exponential smoothing
fit_aus_economy_AAN <- aus_economy |>
  model(SES = ETS(Exports ~ error("A") + trend("A") + season("N")))

# find the optimal value of α  and  ℓ0
fit_aus_economy_AAN |> coef()  |> mutate(estimate = round(estimate,3))

## # A tibble: 4 × 3
##   .model term  estimate
##   <chr>  <chr>    <dbl>
## 1 SES    alpha    0.441
## 2 SES    beta     0    
## 3 SES    l[0]    12.8  
## 4 SES    b[0]     0.137

#Generate forecasts for the next four years
fc_aus_economy_AAN <- fit_aus_economy_AAN |>
  forecast(h = 4)
  
#plot the forecasts
fc_aus_economy_AAN |>
  autoplot(aus_economy, LEVL = NULL) +
  
  labs(y="Exports of goods and services (% of GDP)", title="Forecast using SES for Australia's export of goods and services") +
  guides(colour = guide_legend(title = "Forecast"))

## Warning in ggdist::geom_lineribbon(without(intvl_mapping, "colour_ramp"), :
## Ignoring unknown parameters: `LEVL`

## Warning in geom_line(mapping = without(mapping, "shape"), data =
## unpack_data(object[single_row[["FALSE"]], : Ignoring unknown parameters: `LEVL`

accuracy(fit_aus_economy_AAN)

## # A tibble: 1 × 10
##   .model .type              ME  RMSE   MAE    MPE  MAPE  MASE RMSSE  ACF1
##   <chr>  <chr>           <dbl> <dbl> <dbl>  <dbl> <dbl> <dbl> <dbl> <dbl>
## 1 SES    Training -0.000000746  1.12 0.893 -0.387  5.39 0.907 0.904 0.109

e) Compare the forecasts from both methods. Which do you think is best?

Answer:

As there is a visible trend in the data, ETS(A,A,N) should theoretically perform better. Based on the computation of the accuracy: RMSE is lower for it,indicating better overall accuracy,MAE is lower suggesting better performance in absolute error terms and MAPE is lower showing slightly better percentage accuracy. Hence i think the ETS(A,A,N) model consistently outperforms ETS(A,N,N) across all error metrics, though the improvements are relatively modest.

ETS(A,N,N):

RMSE, MAE, MAPE

1.146794 , 0.913583, 5.406935

ETS(A,A,N):

RMSE, MAE, MAPE

1.116727 , 0.892642, 5.392172

f) Calculate a 95% prediction interval for the first forecast for each model, using the RMSE values and assuming normal errors. Compare your intervals with those produced using R.

Answer:

ETS(A,N,N):

Comparing the interval with the R’s prediction intervals, i see that there is a difference in Bottom of 0.04 and top of 0.039. Hence its very close.

“Calculated Prediction Interval: [18.359, 22.855]”

“R’s Prediction Interval:[18.32, 22.895]95”

ETS(A,A,N):

Comparing the interval with the R’s prediction intervals, i see that there is a difference in Bottom of 0.08 and Top of 0.04 . Hence its very close.

“Calculated Prediction Interval: [18.65, 23.027]”

“R’s Prediction Interval :[18.57, 23.107]95”

# ETS(A,N,N):
# to calculate the RMSE
RMSE <- accuracy(fit_aus_economy) |>
  pull(RMSE) 

# to calculate the prediction interval of the first forecast
upper <- round(fc_aus_economy$.mean[1] + 1.96 * RMSE, 3)
lower <- round(fc_aus_economy$.mean[1] - 1.96 * RMSE, 3)

# to predict the prediction interval
r_interval <- fc_aus_economy |> hilo(95) |> pull('95%') |> head(1) |> round (3)

# Compare your interval with the interval produced by R.
paste0("ETS(A,N,N) Calculated Prediction Interval: [", lower, ", ", upper, "]")

## [1] "ETS(A,N,N) Calculated Prediction Interval: [18.359, 22.855]"

paste0("ETS(A,N,N) Prediction Interval using R:", r_interval)

## [1] "ETS(A,N,N) Prediction Interval using R:[18.32, 22.895]95"

# ETS(A,A,N):
# to calculate the RMSE
RMSE <- accuracy(fit_aus_economy_AAN) |>
  pull(RMSE) 

# to calculate the prediction interval of the first forecast
upper <- round(fc_aus_economy_AAN$.mean[1] + 1.96 * RMSE, 3)
lower <- round(fc_aus_economy_AAN$.mean[1] - 1.96 * RMSE, 3)

# to predict the prediction interval
r_interval <- fc_aus_economy_AAN |> hilo(95) |> pull('95%') |> head(1) |> round (3)

# Compare your interval with the interval produced by R.
paste0("ETS(A,A,N) Calculated Prediction Interval: [", lower, ", ", upper, "]")

## [1] "ETS(A,A,N) Calculated Prediction Interval: [18.65, 23.027]"

paste0("ETS(A,A,N) Prediction Interval using R:", r_interval)

## [1] "ETS(A,A,N) Prediction Interval using R:[18.57, 23.107]95"

8.6 Forecast the Chinese GDP from the global_economy data set using an ETS model. Experiment with the various options in the ETS() function to see how much the forecasts change with damped trend, or with a Box-Cox transformation. Try to develop an intuition of what each is doing to the forecasts.

[Hint: use a relatively large value of h when forecasting, so you can clearly see the differences between the various options when plotting the forecasts.]

Answer:

The plot before transformation shows a Exponential upward trend.the variance increases with the level of the series.

The lambda value of -0.063 indicates that the data needs a transformation to make it more suitable for analysis and forecasting.Transformation needed is strong as a log transformation as its close to λ = 0. The log transformed plot is better defined and it shows few variations and dips.

As its exponential trend, I will use additive error with multiplicative trend. i will also check other options: ETS(M,M,N), ETS(A,M,N) , ETS(M,Ad,N) , ETS(A,Ad,N) and Boxcox.

Based on the plot and the accuracy results, i observe that ETS(A,Ad,N) has the least RMSE and MAE . BOXLOG has the least MAPE.

# Extract data of interest 
china_economy <- global_economy |>
  filter(Country == "China") |>
  select (GDP)

# Plot the GDP series
autoplot(china_economy) +
  labs( title = "China's Gross domestic product (in $USD) ",  y = "Gross domestic product (in $USD)") +
  scale_x_continuous(breaks = seq(min(global_economy$Year), max(global_economy$Year), by = 5))+
  scale_y_continuous(labels = scales::comma)

lambda <- china_economy |>
  features(GDP, features = guerrero) |>
  pull(lambda_guerrero)

china_economy |>
  autoplot(box_cox(GDP, lambda)) +
  labs(y = "",title = paste("China's Log Transformed GDP with λ = ",round(lambda,2)))+
  scale_x_continuous(breaks = seq(min(global_economy$Year), max(global_economy$Year), by = 5))

# Use the ETS(M,M,N) model  to estimate the equivalent model for simple exponential smoothing
fit_china_economy <- china_economy |>
  model(MMN = ETS(GDP ~ error("M") + trend("M") + season("N")),
        AMN = ETS(GDP ~ error("A") + trend("M") + season("N")),
        MADN = ETS(GDP ~ error("M") + trend("Ad") + season("N")),
        AADN = ETS(GDP ~ error("A") + trend("Ad") + season("N")),
        BOXLOG = ETS(box_cox(GDP,lambda)))

#Generate forecasts for the next four years
fc_china_economy <- fit_china_economy |>
  forecast(h = 15)
  
#plot the forecasts
fc_china_economy |>
  autoplot(china_economy, .level=numeric(0)) +
  labs(y = "Gross domestic product (in $USD)", 
       title = "Forecast of China's Gross domestic product (in $USD) using various ETS methods") +
  theme_minimal()+
  scale_y_continuous(labels = scales::comma)+
  facet_wrap(vars(.model))

accuracy(fit_china_economy)

## # A tibble: 5 × 10
##   .model .type             ME    RMSE     MAE     MPE  MAPE  MASE RMSSE     ACF1
##   <chr>  <chr>          <dbl>   <dbl>   <dbl>   <dbl> <dbl> <dbl> <dbl>    <dbl>
## 1 MMN    Training    -1.08e10 2.67e11 1.16e11 -0.0321  7.29 0.534 0.637  0.634  
## 2 AMN    Training    -2.79e10 2.23e11 1.07e11 -0.734   8.20 0.496 0.533 -0.00283
## 3 MADN   Training     4.65e10 2.00e11 9.70e10  2.32    7.79 0.447 0.476  0.236  
## 4 AADN   Training     2.95e10 1.90e11 9.49e10  1.62    7.62 0.438 0.454 -0.00187
## 5 BOXLOG Training    -2.75e10 2.88e11 1.25e11  0.607   7.17 0.578 0.688  0.665

8.7 Find an ETS model for the Gas data from aus_production and forecast the next few years. Why is multiplicative seasonality necessary here? Experiment with making the trend damped. Does it improve the forecasts?

Answer: The time series shows a upward trend with seasonality behavior. Also the magnitude of change increases indicating the seasonal fluctuations that grow with the level of the series.Hence there is multiplicative seasonality.

The lambda value of 0.11 indicates that the data needs a strong transformation as a log transformation (λ = 0)to make it more suitable for analysis and forecasting.

Based on the accuracy of the various ETS models, i observe that AAM (Additive Error, Additive Trend, Multiplicative seasonality) has the lowest RMSE. Applying the damped method does not improve the forecast as its RMSE is higher. Also it does not make sense as the trend is upward in general.

#original time series
aus_production |>
  autoplot(Gas) +
  labs ( title = "Quarterly production of Gas in Australia", y = "Gas production in petajoules")

# Estimate the lambda
lambda <- aus_production |>
  features(Gas, features = guerrero) |>
  pull(lambda_guerrero)

# perform log transformation
aus_production |>
  autoplot(log(Gas)) +
  labs ( title = "Quarterly production of Gas in Australia (log transformed)", y = "Log(Gas production in petajoules)")

# Use the ETS modelS  to estimate the equivalent model for simple exponential smoothing
fit_aus_production <- aus_production |>
  model(AMM = ETS(Gas ~ error("A") + trend("M") + season("M")),
        AAM = ETS(Gas ~ error("A") + trend("A") + season("M")),
        AADM = ETS(Gas ~ error("A") + trend("Ad") + season("M")),
        BOXLOG = ETS(box_cox(Gas,lambda)))

#Generate forecasts for the next 12 Quarters
fc_aus_production <- fit_aus_production |>
  forecast(h = "5 years")
  
#plot the forecasts
fc_aus_production |>
  autoplot(aus_production, .level=numeric(0)) +
  labs(y = "Gas production in petajoules", 
       title = "Forecast of Quarterly production of Gas in Australia using various ETS methods") +
  theme_minimal()+
  scale_y_continuous(labels = scales::comma)+
  facet_wrap(vars(.model))

accuracy(fit_aus_production)

## # A tibble: 4 × 10
##   .model .type        ME  RMSE   MAE    MPE  MAPE  MASE RMSSE    ACF1
##   <chr>  <chr>     <dbl> <dbl> <dbl>  <dbl> <dbl> <dbl> <dbl>   <dbl>
## 1 AMM    Training  0.465  4.23  2.84  1.26   4.12 0.509 0.558  0.0321
## 2 AAM    Training  0.218  4.19  2.84 -0.920  5.03 0.510 0.553  0.0405
## 3 AADM   Training  0.548  4.22  2.81  1.32   4.11 0.505 0.556  0.0265
## 4 BOXLOG Training -0.164  4.45  2.92 -0.163  3.82 0.523 0.587 -0.121

8.8 Recall your retail time series data (from Exercise 7 in Section 2.10).

a) Why is multiplicative seasonality necessary for this series?

b) Apply Holt-Winters’ multiplicative method to the data. Experiment with making the trend damped.

c) Compare the RMSE of the one-step forecasts from the two methods. Which do you prefer?

d) Check that the residuals from the best method look like white noise.

e) Now find the test set RMSE, while training the model to the end of 2010. Can you beat the seasonal naïve approach from Exercise 7 in Section 5.11?

Answer: a) The time series is indicating a changing direction trend as its on a upward trend from 1982 to 2002. Then it is decreasing till 2006 and so on. The seasonality show the retail turnover is highest in December over the years and relatively lowest in February. The magnitude of change increases indicating the seasonal fluctuations that grow with the level of the series.Hence multiplicative seasonality is essential for this series.

After applying Holt-Winters’ multiplicative method to the data, i see the two models performing very close to each other.However MADM looks better.
RMSE of MAM is 4.052202 and RMSE of MAdM is 4.014168 . Hence MAdM model performs better.
The residual results are white noise, as the residuals seem to be centered around zero and follow a constant variance except for 2 peaks which are the lowest. The histogram of the residuals displays a normal distribution, centered at 0. The p-value of 0.24 which is > 0.05 significance level, indicates that there is no autocorrelation in the residuals. This means the residuals are white noise.
Both exponential smoothing approaches produced a better forecast than the naive approach. The Damped Multiplicative model (MAdM) was only a slightly better fit than the naive method, whereas the Multiplicative model(MAM) performed the best on the test data.

set.seed(12345698)
myseries <- aus_retail |>
  filter(`Series ID` == sample(aus_retail$`Series ID`,1))

autoplot(myseries,Turnover)+
  labs( y = "Retail turnover in $Million AUD", title = "Australian retail trade turnover")

gg_season(myseries,Turnover)+
  labs( y = "Retail turnover in $Million AUD", title = "Australian retail trade turnover")

# Use the ETS modelS  to estimate the equivalent model for simple exponential smoothing
fit_myseries <- myseries |>
  model(MAM = ETS(Turnover ~ error("M") + trend("A") + season("M")),
        MADM = ETS(Turnover ~ error("M") + trend("Ad") + season("M")))

#Generate forecasts for the next 10 years
fc_myseries <- fit_myseries |>
  forecast(h = 96)
  
#plot the forecasts
fc_myseries |>
  autoplot(myseries, .level=numeric(0)) +
  labs(y = "Retail turnover in $Million AUD", 
       title = "Forecast of Australian retail trade turnover using various ETS methods") +
  theme_minimal()+
  scale_y_continuous(labels = scales::comma)+
  facet_wrap(vars(.model))

accuracy(fit_myseries)

## # A tibble: 2 × 12
##   State  Industry .model .type     ME  RMSE   MAE    MPE  MAPE  MASE RMSSE  ACF1
##   <chr>  <chr>    <chr>  <chr>  <dbl> <dbl> <dbl>  <dbl> <dbl> <dbl> <dbl> <dbl>
## 1 South… Hardwar… MAM    Trai… 0.0334  4.05  2.90 -0.231  6.10 0.473 0.471 0.282
## 2 South… Hardwar… MADM   Trai… 0.211   4.01  2.86  0.353  5.95 0.465 0.467 0.190

# Look at the residuals for MAM
fit_myseries |> select(MAM) |>  
  gg_tsresiduals()

# Look at the residuals for MAM
fit_myseries |> select(MADM) |>  
  gg_tsresiduals()

# test box_pierce
fit_myseries |> select(MAM) |>  
 augment( fit_myseries ) |> features(.innov, box_pierce, lag = 12)

## # A tibble: 1 × 3
##   .model bp_stat bp_pvalue
##   <chr>    <dbl>     <dbl>
## 1 MAM       15.0     0.241

# define the train and test data
myseries_train <- myseries |>
  filter(year(Month) < 2011)
myseries_test <- myseries |> 
  filter(year(Month) > 2010)

# comparing SNaive with the two multiplicative models
fits <- myseries_train |>
  model(
  SNaive = SNAIVE(Turnover),
  MAM = ETS(Turnover ~ error("M") + trend("A") + season("M")),
  MAdM = ETS(Turnover ~ error("M") + trend("Ad") + season("M"))      
)

fc <- fits |>
  forecast(myseries_test)


fc |>
  autoplot(myseries, level = NULL) +
  facet_wrap(vars(.model))

8.9 For the same retail data, try an STL decomposition applied to the Box-Cox transformed series, followed by ETS on the seasonally adjusted data. How does that compare with your best previous forecasts on the test set?

Answer: The lambda value of 0.455 indicates the needs for a transformation to make it more suitable for analysis and forecasting.Transformation needed is a square root transformation (λ = 0.5).

The Box-Cox square root transformation has a smoother trend and smaller residuals centered around 0. Comparing the results below with the previous forecasts on the test set, i observe that the same MAM method fared better as it is inline with the upward trend and seasonality.

# Estimate the lambda
lambda <- myseries_train |>
  features(Turnover, features = guerrero) |>
  pull(lambda_guerrero)

myseries_train |>
  model(STL(box_cox(Turnover, lambda = lambda))) |>
  components() |>
  autoplot()

# Apply the Box-Cox transformation to the monthly training data
myseries_train_bc <- myseries_train |>
  mutate(turnover_bc = box_cox(Turnover,lambda))

# Fit the STL decomposition model on the transformed data
stl_fit_abt <- myseries_train_bc |>
  model(STL_turnover = STL(turnover_bc ~ trend() + season(window = "periodic")))

# Extract the seasonally adjusted component
season_adjust <- stl_fit_abt |>
  components() |>  # Get decomposition components
  select(Month, season_adjust)

ets_fit <- season_adjust |>
  model(MAM = ETS(season_adjust ~ error("M") + trend("A") + season("M")), 
        MAdM = ETS(season_adjust ~ error("M") + trend("Ad") + season("M")))


ets_forecast <- ets_fit |>
  forecast(h = 96)

ets_forecast |>
  autoplot(season_adjust, level=NULL) +
  labs(title = "10 Year Seasonsally Adjusted Aussie Retail Turnover Forecast",
       x = "Year",
       y = "Adjusted") +
  scale_y_continuous(labels = scales::comma_format()) +
  theme_minimal()

Homework 5 - Exponential Smoothing

Dhanya Nair

2025-03-08

Exercises:

8.1 Consider the the number of pigs slaughtered in Victoria, available in the aus_livestock dataset.

8.5 Data set global_economy contains the annual Exports from many countries. Select one country to analyse.

a) Plot the Exports series and discuss the main features of the data.

b) Use an ETS(A,N,N) model to forecast the series, and plot the forecasts.

c) Compute the RMSE values for the training data.

d) Compare the results to those from an ETS(A,A,N) model. (Remember that the trended model is using one more parameter than the simpler model.) Discuss the merits of the two forecasting methods for this data set.

e) Compare the forecasts from both methods. Which do you think is best?

f) Calculate a 95% prediction interval for the first forecast for each model, using the RMSE values and assuming normal errors. Compare your intervals with those produced using R.

8.6 Forecast the Chinese GDP from the global_economy data set using an ETS model. Experiment with the various options in the ETS() function to see how much the forecasts change with damped trend, or with a Box-Cox transformation. Try to develop an intuition of what each is doing to the forecasts.

8.7 Find an ETS model for the Gas data from aus_production and forecast the next few years. Why is multiplicative seasonality necessary here? Experiment with making the trend damped. Does it improve the forecasts?

8.8 Recall your retail time series data (from Exercise 7 in Section 2.10).

a) Why is multiplicative seasonality necessary for this series?

b) Apply Holt-Winters’ multiplicative method to the data. Experiment with making the trend damped.

c) Compare the RMSE of the one-step forecasts from the two methods. Which do you prefer?

d) Check that the residuals from the best method look like white noise.

e) Now find the test set RMSE, while training the model to the end of 2010. Can you beat the seasonal naïve approach from Exercise 7 in Section 5.11?

8.9 For the same retail data, try an STL decomposition applied to the Box-Cox transformed series, followed by ETS on the seasonally adjusted data. How does that compare with your best previous forecasts on the test set?