HW5 DATA624

Author

Semyon Toybis

Assignment

We are required to complete questions 8.1, 8.5, 8.6, 8.7, 8.8, and 8.9 from chapter 8 of “Forecasting: Principles and Practice” Third Edition by Rob Hyndman and George Athanasopoulos.

8.1 - Victoria pigs from aus_livestock

aus_livestock |> filter(Animal=='Pigs', State=='Victoria') |> autoplot(Count)

A - ETS

We are tasked with using the ETS() function to estimate the equivalent model for simple exponential smoothing and to find the optimal values of alpha and the initial level l[0].

victoria_pigs <- aus_livestock |> filter(Animal=='Pigs', State=='Victoria')

victoria_pigs_fit <- victoria_pigs |> model(ETS(Count))

report(victoria_pigs_fit)
Series: Count 
Model: ETS(A,N,A) 
  Smoothing parameters:
    alpha = 0.3579401 
    gamma = 0.0001000139 

  Initial states:
    l[0]     s[0]    s[-1]     s[-2]     s[-3]     s[-4]     s[-5]   s[-6]
 95487.5 2066.235 6717.311 -2809.562 -1341.299 -7750.173 -8483.418 5610.89
     s[-7]    s[-8]     s[-9]   s[-10]   s[-11]
 -579.8107 1215.464 -2827.091 1739.465 6441.989

  sigma^2:  60742898

     AIC     AICc      BIC 
13545.38 13546.26 13610.24 

Based on the fit of the model, the optimal value of alpha is 0.36 (rounded) and the initial level l[0] is 95,488.
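
Note that ETS(Count) lets fable search over specifications by AICc, and here it selected ETS(A,N,A), which adds a seasonal component on top of simple exponential smoothing. A minimal sketch of how the strict SES equivalent, ETS(A,N,N), could be forced instead (the model name SES is just a label introduced here):

# Specify all three components explicitly to force ETS(A,N,N),
# rather than letting ETS() auto-select a specification by AICc
victoria_pigs_ses <- victoria_pigs |>
  model(SES = ETS(Count ~ error('A') + trend('N') + season('N')))
report(victoria_pigs_ses)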

Next, I generate a forecast for the next four months:

victoria_pigs_forecast <- victoria_pigs_fit |> forecast(h = 4)
victoria_pigs_forecast |> autoplot(victoria_pigs)

B - prediction interval

Below I calculate the 95% prediction interval for the first forecast using the standard deviation of the residuals via the formula: y_hat +/- 1.96s

# column 6 of the fable holds the point forecast (.mean);
# sqrt(60742898) is the residual SD (sigma) from the report above
upper_95 <- victoria_pigs_forecast[1,6] + (1.96 * sqrt(60742898))
lower_95 <- victoria_pigs_forecast[1,6] - (1.96 * sqrt(60742898))

print(c(upper_95, lower_95))
$.mean
[1] 99700.5

$.mean
[1] 69148.91
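
As an aside, the same interval can be computed without hard-coding sigma^2 or relying on column positions; a minimal sketch using the objects above:

# Pull the residual variance from glance() and the point forecast
# from the fable's .mean column, then apply y_hat +/- 1.96s
sigma2 <- glance(victoria_pigs_fit) |> pull(sigma2)
yhat_1 <- victoria_pigs_forecast$.mean[1]
c(lower = yhat_1 - 1.96 * sqrt(sigma2),
  upper = yhat_1 + 1.96 * sqrt(sigma2))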

Below is the prediction interval for the forecast generated by R:

victoria_pigs_forecast |> mutate(ci = hilo(Count, 95))
# A fable: 4 x 7 [1M]
# Key:     Animal, State, .model [1]
  Animal State  .model    Month             Count  .mean                      ci
  <fct>  <fct>  <chr>     <mth>            <dist>  <dbl>                  <hilo>
1 Pigs   Victo… ETS(C… 2019 Jan N(84425, 6.1e+07) 84425. [69149.19,  99700.22]95
2 Pigs   Victo… ETS(C… 2019 Feb N(85158, 6.9e+07) 85158. [68933.39, 101382.56]95
3 Pigs   Victo… ETS(C… 2019 Mar N(91567, 7.6e+07) 91567. [74445.66, 108687.93]95
4 Pigs   Victo… ETS(C… 2019 Apr N(90098, 8.4e+07) 90098. [72125.42, 108071.45]95

The prediction interval generated by R is almost identical to the manual calculation (a difference of less than one).

8.5 - Exports from global_economy

A

I chose to examine the exports of Japan:

global_economy |> filter(Country=='Japan') |> select(Exports) |> autoplot(Exports)
Warning: Removed 1 row containing missing values or values outside the scale range
(`geom_line()`).

We can see that the series has non-constant variance, periods of positive and negative longer-term trends, and cyclicity.

B, C, D, E

We are asked to use an ETS(A,N,N) model to forecast the series as well as an ETS(A,A,N) model, compute the RMSE values on the training data, and compare the forecasts.

For my training data, I choose to train on data prior to 1990, and will look to forecast the next five years.

japan_exports <- global_economy |> filter(Country=='Japan') |> select(Exports) 

japan_train <- japan_exports |> filter(Year < 1990)
japan_exports_fit <- japan_train |> model(
  ANN = ETS(Exports ~ error('A') + trend('N') + season('N')),
  AAN = ETS(Exports ~ error('A') + trend('A') + season('N'))
  )

japan_exports_fit |> accuracy()
# A tibble: 2 × 10
  .model .type         ME  RMSE   MAE    MPE  MAPE  MASE RMSSE     ACF1
  <chr>  <chr>      <dbl> <dbl> <dbl>  <dbl> <dbl> <dbl> <dbl>    <dbl>
1 ANN    Training -0.0205  1.17 0.867 -0.713  7.67 0.967 0.983 0.000416
2 AAN    Training -0.0105  1.19 0.893 -0.650  7.93 0.995 0.998 0.0184  

The ANN model has a slightly lower RMSE, suggesting it could be the better model, though the results are close.

japan_exports_fit |> forecast(h = 5) |> autoplot(japan_exports)
Warning: Removed 1 row containing missing values or values outside the scale range
(`geom_line()`).

japan_exports_fit |> glance()
# A tibble: 2 × 9
  .model sigma2 log_lik   AIC  AICc   BIC   MSE  AMSE   MAE
  <chr>   <dbl>   <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
1 ANN      1.47   -55.7  117.  118.  122.  1.37  2.61 0.867
2 AAN      1.63   -56.2  122.  125.  129.  1.41  2.55 0.893

Based on the RMSE on the training data as well as the AIC of the fit, the ANN model seems to be the better option, as it has both lower RMSE and lower AIC. However, when comparing the forecasts for both methods, neither model came close to the actual observations, and both have very wide prediction intervals, meaning there is high uncertainty and the point forecasts need to be interpreted in that context.

The ANN method (simple exponential smoothing) is best suited for data with no clear trend or seasonal pattern. The AAN method incorporates a trend component. Given that the data set seems to have longer-term trends, the AAN method would seem to make more sense; however, the results showed that it actually had a slightly worse fit than the ANN model.

Additionally, it should be noted that a Box-Cox transformation might have stabilized the non-constant variance and resulted in better models/forecasts.
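
A minimal sketch of how that could be checked, assuming the objects above (lambda_jp is a name introduced here); embedding the transformation in the model formula lets fable back-transform the forecasts automatically:

# Pick lambda via the Guerrero method, then fit SES on the
# Box-Cox transformed exports; forecasts are back-transformed
lambda_jp <- japan_train |> features(Exports, features = guerrero) |>
  pull(lambda_guerrero)
japan_train |>
  model(ANN_bc = ETS(box_cox(Exports, lambda_jp) ~
                       error('A') + trend('N') + season('N'))) |>
  forecast(h = 5)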

F - prediction intervals

Below are the 95% prediction intervals generated by each model for the first forecast year.

japan_exports_fit |> forecast(h = 5) |> mutate(ci = hilo(Exports, 95)) |> filter(Year == 1990)
# A fable: 2 x 5 [1Y]
# Key:     .model [2]
  .model  Year    Exports .mean                     ci
  <chr>  <dbl>     <dist> <dbl>                 <hilo>
1 ANN     1990 N(10, 1.5)  10.1 [7.738164, 12.48526]95
2 AAN     1990 N(10, 1.6)  10.1 [7.603421, 12.60323]95

The manually calculated prediction intervals using the RMSE are below:

japan_exports_forecast <- japan_exports_fit |> forecast(h = 5) 

# RMSE values from the accuracy() table above:
#ANN <- 1.169952
#AAN <- 1.187411

# column 4 of the fable holds the point forecast (.mean)
upper_95_japan_exports_ANN <- japan_exports_forecast[1,4] + (1.96 * 1.169952)
lower_95_japan_exports_ANN <- japan_exports_forecast[1,4] - (1.96 * 1.169952)

upper_95_japan_exports_AAN <- japan_exports_forecast[1,4] + (1.96 * 1.187411)
lower_95_japan_exports_AAN <- japan_exports_forecast[1,4] - (1.96 * 1.187411)

print(c(upper_95_japan_exports_ANN, lower_95_japan_exports_ANN,
        upper_95_japan_exports_AAN, lower_95_japan_exports_AAN))
$.mean
[1] 12.40482

$.mean
[1] 7.818605

$.mean
[1] 12.43904

$.mean
[1] 7.784385

These are close to the R-generated intervals of (7.738, 12.485) for the ANN model and (7.603, 12.603) for the AAN model.

8.6 - China GDP data

First, I plot the data. I divide GDP by 1,000,000,000 to express it in billions of dollars and make the scale more manageable.

china_gdp <- global_economy |> filter(Country=='China') |> select(GDP)
china_gdp$GDP <- china_gdp$GDP/1000000000
china_gdp |> autoplot(GDP)

To start, I try a few different forecasting approaches without any further transformation of the data.

china_gdp_fit <- china_gdp |>
  model(ANN = ETS(GDP ~ error("A") + trend("N") + season("N")),
        AAN = ETS(GDP ~ error("A") + trend("A") + season("N")),
        AAdN = ETS(GDP ~ error("A") + trend("Ad", phi = 0.9) + season("N")))

china_gdp_forecast <- china_gdp_fit |> forecast(h = 15)

china_gdp_forecast |> autoplot(china_gdp, level = NULL) +
  guides(color = guide_legend(title = 'Forecast method'))

The AAN method incorporates a trend and continues the upward trend that is visible in the data. The AAdN method (damped trend) moderates the trend component such that the forecast starts to plateau. The ANN method uses the last value as the forecast (the estimated alpha is essentially 1).
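
A quick way to verify that claim about alpha is to inspect the estimated smoothing parameters; a minimal sketch:

# Pull the estimated smoothing parameters from the fitted models;
# an alpha near 1 for the ANN fit means the forecast is flat at
# the last observed value
china_gdp_fit |> tidy() |> filter(term == 'alpha')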

Next, I try a Box-Cox transformation of the data.

lambda <- china_gdp |> features(GDP, features = guerrero) |> pull(lambda_guerrero)

china_gdp |> autoplot(box_cox(GDP, lambda)) + ggtitle(paste('Box-Cox transformation with lambda =', round(lambda, 3)))

Next, I try the same forecasting methods on the transformed data set:

china_gdp_boxcox <- china_gdp |>
  mutate(Box_Cox = box_cox(GDP, lambda)) |>
  select(-GDP)


china_gdp_boxcox_fit <- china_gdp_boxcox |>
  model(ANN = ETS(Box_Cox ~ error("A") + trend("N") + season("N")),
        AAN = ETS(Box_Cox ~ error("A") + trend("A") + season("N")),
        AAdN = ETS(Box_Cox ~ error("A") + trend("Ad", phi = 0.9) + season("N")))

china_gdp_boxcox_forecast <- china_gdp_boxcox_fit |> forecast(h = 15)

china_gdp_boxcox_forecast |> autoplot(china_gdp_boxcox, level = NULL) +
  guides(color = guide_legend(title = 'Forecast method'))

The outcome is similar to that with the original data.

8.7 - Gas from aus_production

First, I plot the data

aus_gas <- aus_production |> select(Gas)
aus_gas |> autoplot(Gas)

We can see that the time series has a positive trend and seasonality. Furthermore, the variance of the series increases as the level of the series increases. Thus, a multiplicative seasonal method is appropriate, as the seasonal variation grows with the level of the series.
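
One way to sanity-check that choice is to fit additive and multiplicative Holt-Winters models and compare AICc, which is comparable across specifications within the ETS class; a minimal sketch:

# Fit additive vs multiplicative seasonal ETS models and
# compare their AICc values
aus_gas |>
  model(additive = ETS(Gas ~ error('A') + trend('A') + season('A')),
        multiplicative = ETS(Gas ~ error('M') + trend('A') + season('M'))) |>
  glance() |>
  select(.model, AICc)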

Below I fit a multiplicative model with trend and seasonal components, as well as a damped trend model with a seasonal component.

aus_gas_fit <- aus_gas |>
  model(
     multiplicative = ETS(Gas ~ error("M") + trend("A") + season("M")),
     dampended = ETS(Gas ~ error("M") + trend("Ad", phi = 0.9) + season("M"))
  )

aus_gas_forecast <- aus_gas_fit |> forecast(h = '3 years')

aus_gas_forecast |> autoplot(aus_gas, level = NULL) + 
    guides(color = guide_legend(title = "Forecast"))

The damped model's point forecasts are lower than the multiplicative model's at every horizon except the first:

aus_gas_forecast |> as_tibble() |> ggplot(aes(x = Quarter, y = .mean, fill = .model)) + geom_col(position = 'dodge')

8.8

For this question, I select ‘Liquor retailing’ in ‘New South Wales’.

aus_liquor_NSW <- aus_retail |> filter(Industry=='Liquor retailing', State=='New South Wales') 

aus_liquor_NSW |> autoplot(Turnover)

A

Similar to the question above, multiplicative seasonality is necessary because the size of the seasonal fluctuations increases with the level of the time series.
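
A seasonal sub-series plot is one way to see this directly; a minimal sketch using feasts:

# Seasonal sub-series plot: the spread across months widens
# as the overall level of Turnover rises
aus_liquor_NSW |> gg_subseries(Turnover)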

B - Holt-Winters’ method

First, I create a training set that includes data through the end of 2010:

aus_liquor_NSW_train <- aus_liquor_NSW |> filter(year(Month)<2011)

Next, I fit two models on the training data:

aus_liquor_NSW_fit <- aus_liquor_NSW_train |> model(
  multiplicative = ETS(Turnover ~ error("M") + trend("A") + season("M")),
  dampended = ETS(Turnover ~ error("M") + trend("Ad") + season("M"))
  )

aus_liquor_NSW_forecast <- aus_liquor_NSW_fit |>
  forecast(h=96)

aus_liquor_NSW_forecast |> autoplot(aus_liquor_NSW, level = NULL) +
  guides(colour = guide_legend(title = "Forecast"))

C - comparing RMSE

accuracy(aus_liquor_NSW_forecast, aus_liquor_NSW)
# A tibble: 2 × 12
  .model    State Industry .type    ME  RMSE   MAE   MPE  MAPE  MASE RMSSE  ACF1
  <chr>     <chr> <chr>    <chr> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
1 dampended New … Liquor … Test  16.4   22.4  17.2  5.37  5.70  1.93  1.83 0.441
2 multipli… New … Liquor … Test  -9.42  17.0  13.6 -3.31  4.85  1.53  1.38 0.668

Holt-Winters’ multiplicative method without a damped trend has a lower RMSE and seems to fit the data better. The damped method moderates the trend, whereas the actual values continue on a positive trend without moderation.

D - checking residuals

Below are the residual diagnostics for the multiplicative model:

aus_liquor_NSW_fit |> select(multiplicative) |> gg_tsresiduals()

Below are the diagnostics for the damped trend model:

aus_liquor_NSW_fit |> select(dampended) |> gg_tsresiduals()

Both models have residuals that appear to be centered around zero with no discernible pattern (the variance is somewhat constant). However, both models have significant autocorrelation at several lags.

Below I conduct the Ljung-Box test. The null hypothesis is that the residuals are white noise. I set lag = 2*m = 24, where m = 12 is the seasonal period.

Multiplicative method:

aus_liquor_NSW_fit |> select(multiplicative) |> augment() |> features(.innov, ljung_box, lag = 24)
# A tibble: 1 × 3
  .model         lb_stat  lb_pvalue
  <chr>            <dbl>      <dbl>
1 multiplicative    68.7 0.00000338

Damped method:

aus_liquor_NSW_fit |> select(dampended) |> augment() |> features(.innov, ljung_box, lag = 24)
# A tibble: 1 × 3
  .model    lb_stat lb_pvalue
  <chr>       <dbl>     <dbl>
1 dampended    65.5 0.0000101

Both methods have a p-value less than 0.05, meaning we reject the null hypothesis that the residuals are white noise.

E - test set RMSE

First, I create the test set.

aus_liquor_NSW_test <- aus_liquor_NSW |> filter(year(Month)>2010)

We want to compare the models to the seasonal naive approach. I fit a new set of models that also includes the SNAIVE method:

aus_liquor_NSW_fit_SNAIVE <- aus_liquor_NSW_train |> model(
  multiplicative = ETS(Turnover ~ error("M") + trend("A") + season("M")),
  dampended = ETS(Turnover ~ error("M") + trend("Ad") + season("M")),
  snaive = SNAIVE(Turnover)
  )

Next, I forecast on the test set:

aus_liquor_NSW_forecast_SNAIVE <- aus_liquor_NSW_fit_SNAIVE |>
  forecast(aus_liquor_NSW_test)
aus_liquor_NSW_forecast_SNAIVE |> autoplot(aus_liquor_NSW_test, level = NULL) +  guides(colour = guide_legend(title = "Forecast"))

Last, I check the RMSE:

accuracy(aus_liquor_NSW_forecast_SNAIVE, aus_liquor_NSW_test)
# A tibble: 3 × 12
  .model    State Industry .type    ME  RMSE   MAE   MPE  MAPE  MASE RMSSE  ACF1
  <chr>     <chr> <chr>    <chr> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
1 dampended New … Liquor … Test  16.4   22.4  17.2  5.37  5.70   NaN   NaN 0.441
2 multipli… New … Liquor … Test  -9.42  17.0  13.6 -3.31  4.85   NaN   NaN 0.668
3 snaive    New … Liquor … Test  47.6   51.3  47.6 16.8  16.8    NaN   NaN 0.741

The multiplicative method has the lowest RMSE and beats the SNAIVE approach. (The MASE and RMSSE values are NaN here because accuracy() was given only the test slice; those metrics are scaled by training-set errors, which the test slice does not contain.)
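
A minimal sketch of the fix: pass the full series, as in part C, so the scaled metrics can be computed as well.

# Supplying the full series gives accuracy() the training
# observations it needs to scale MASE and RMSSE
accuracy(aus_liquor_NSW_forecast_SNAIVE, aus_liquor_NSW)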

8.9

We are asked to try an STL decomposition on the Box-Cox transformed version of the time series above and then use an ETS model on the seasonally adjusted data.

First, I start with the Box-Cox transformation:

lambda_aus_liquor <- aus_liquor_NSW |> features(Turnover, features = guerrero) |> pull(lambda_guerrero)

aus_liquor_NSW |> autoplot(box_cox(Turnover, lambda_aus_liquor)) + ggtitle(paste('Box-Cox transformation with lambda =', round(lambda_aus_liquor, 3)))

Next, I add the Box-Cox transformed data to the data set.

aus_liquor_NSW <- aus_liquor_NSW |> mutate(box_cox = box_cox(Turnover, lambda_aus_liquor))

I then re-create the train and test sets:

aus_liquor_NSW_train <- aus_liquor_NSW |> filter(year(Month)<2011)
aus_liquor_NSW_test <- aus_liquor_NSW |> filter(year(Month)>2010)

Next, I use the decomposition_model() function to first decompose the Box-Cox transformed data with an STL decomposition and then fit an ETS model to the seasonally adjusted series:

aus_liquor_NSW_dcmp_fit <- aus_liquor_NSW_train |>
  model(ets_decomp_transformed = decomposition_model(
    STL(box_cox ~ trend() + season(), robust = TRUE),
    ETS(season_adjust)
  ))

Lastly, I forecast on the test set:

aus_liquor_NSW_dcmp_forecast <- aus_liquor_NSW_dcmp_fit |> forecast(aus_liquor_NSW_test)

aus_liquor_NSW_dcmp_forecast |>
  autoplot(aus_liquor_NSW_test, level = NULL) + guides(color = guide_legend(title = "Forecast"))

accuracy(aus_liquor_NSW_dcmp_forecast, aus_liquor_NSW_test)
# A tibble: 1 × 12
  .model State Industry .type      ME  RMSE    MAE   MPE  MAPE  MASE RMSSE  ACF1
  <chr>  <chr> <chr>    <chr>   <dbl> <dbl>  <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
1 ets_d… New … Liquor … Test  -0.0937 0.120 0.0994 -1.82  1.93   NaN   NaN 0.899

The transformed, decomposed ETS model has a MAPE of 1.93, whereas the multiplicative model on the non-transformed, non-decomposed data has a MAPE of 4.85. This suggests the transformed, decomposed model is the better fit, though the comparison should be treated with some caution: its errors are measured on the Box-Cox scale, and percentage errors on a transformed scale are not directly comparable to percentage errors on the original Turnover scale.
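
A minimal sketch of how to make the comparison apples-to-apples, assuming the objects above: embedding the Box-Cox transformation inside the model formula lets fabletools back-transform the forecasts, so accuracy is measured on the original Turnover scale (dcmp_bc is just a label introduced here).

# Transformation inside the formula => forecasts are back-transformed
# to the Turnover scale before accuracy() is computed
aus_liquor_NSW_train |>
  model(dcmp_bc = decomposition_model(
    STL(box_cox(Turnover, lambda_aus_liquor) ~ trend() + season(), robust = TRUE),
    ETS(season_adjust)
  )) |>
  forecast(aus_liquor_NSW_test) |>
  accuracy(aus_liquor_NSW_test)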