Homework 5

8.1) Consider the number of pigs slaughtered in Victoria, available in the aus_livestock dataset.

Use the ETS() function to estimate the equivalent model for simple exponential smoothing. Find the optimal values of α and ℓ0, and generate forecasts for the next four months.
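
A minimal sketch of how the fit and forecasts shown next might be produced, assuming the fpp3 meta-package is installed. The object names pigs, fit and fc are illustrative and not taken from the original.

```r
library(fpp3)

# Victorian pig slaughter counts from aus_livestock
pigs <- aus_livestock |>
  filter(Animal == "Pigs", State == "Victoria")

# ETS(A,N,N) is the state space equivalent of simple exponential smoothing
fit <- pigs |>
  model(ETS(Count ~ error("A") + trend("N") + season("N")))

report(fit)                   # prints the estimated alpha and l[0]

fc <- fit |> forecast(h = 4)  # forecasts for the next four months
fc
```

Because the model has no trend or seasonal component, all four point forecasts equal the last estimated level; only the forecast variance grows with the horizon.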

## Series: Count 
## Model: ETS(A,N,N) 
##   Smoothing parameters:
##     alpha = 0.3221247 
## 
##   Initial states:
##      l[0]
##  100646.6
## 
##   sigma^2:  87480760
## 
##      AIC     AICc      BIC 
## 13737.10 13737.14 13750.07

## # A fable: 4 x 6 [1M]
## # Key:     Animal, State, .model [1]
##   Animal State    .model                          Month             Count  .mean
##   <fct>  <fct>    <chr>                           <mth>            <dist>  <dbl>
## 1 Pigs   Victoria "ETS(Count ~ error(\"A\") +… 2019 Jan N(95187, 8.7e+07) 95187.
## 2 Pigs   Victoria "ETS(Count ~ error(\"A\") +… 2019 Feb N(95187, 9.7e+07) 95187.
## 3 Pigs   Victoria "ETS(Count ~ error(\"A\") +… 2019 Mar N(95187, 1.1e+08) 95187.
## 4 Pigs   Victoria "ETS(Count ~ error(\"A\") +… 2019 Apr N(95187, 1.1e+08) 95187.

Compute a 95% prediction interval for the first forecast using y±1.96s where s is the standard deviation of the residuals. Compare your interval with the interval produced by R.

## <hilo[1]>
## [1] [76854.79, 113518.3]95
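
The hand calculation can be sketched in base R from the numbers reported above. This is an approximation under stated assumptions: it uses the rounded point forecast 95187 from the printed table, and it takes s as the square root of the reported sigma^2, which is close to, but not exactly, the standard deviation of the residuals.

```r
# Values copied from the model report and forecast table above
y_hat  <- 95187      # first point forecast (rounded in the printed table)
sigma2 <- 87480760   # reported sigma^2, used here as a stand-in for s^2

s <- sqrt(sigma2)

# 95% interval by hand: y_hat +/- 1.96 * s
manual <- c(lower = y_hat - 1.96 * s, upper = y_hat + 1.96 * s)

# Interval reported by R, copied from the hilo output above
from_r <- c(lower = 76854.79, upper = 113518.3)

print(manual)
print(manual - from_r)  # differences between the two intervals
```

The two intervals should agree closely. Any remaining gap comes from the rounded point forecast and from using 1.96 rather than the exact normal quantile.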

8.5) Data set global_economy contains the annual Exports from many countries. Select one country to analyse.

Plot the Exports series and discuss the main features of the data.

The data are annual, so there is no seasonality, and there is no consistent long-run trend. Iraq's exports increased gradually from 1970 to 1980, then fell drastically to nearly 0 between 1993 and 1996 due to the recession. Exports rose sharply from 1997 to 1998 and then began a steady decline.

Use an ETS(A,N,N) model to forecast the series, and plot the forecasts.

Compute the RMSE values for the training data.

## [1] 12.31977

Compare the results to those from an ETS(A,A,N) model. (Remember that the trended model is using one more parameter than the simpler model.) Discuss the merits of the two forecasting methods for this data set.

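The two training RMSE values printed below are 12.31977 for ETS(A,N,N) and 12.38392 for ETS(A,A,N), so the trended model does not have a lower RMSE despite using one more parameter. The difference is very small, so RMSE alone does not clearly favour either model: ETS(A,N,N) is the simpler choice, while ETS(A,A,N) allows the forecasts to follow a trend.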

## [1] 12.31977

## [1] 12.38392

Compare the forecasts from both methods. Which do you think is best?

I believe the ETS(A,A,N) forecast is best. The ETS(A,A,N) model extrapolates the trend in the data, whereas the ETS(A,N,N) forecast stays flat at the last estimated level.

Calculate a 95% prediction interval for the first forecast for each model, using the RMSE values and assuming normal errors. Compare your intervals with those produced using R.

## <hilo[1]>
## [1] [13.99644, 63.35056]95

## # A tibble: 1 × 2
##   .pred_lower .pred_upper
##         <dbl>       <dbl>
## 1        10.6        14.8

## <hilo[1]>
## [1] [13.6203, 64.37202]95

## # A tibble: 1 × 2
##   .pred_lower .pred_upper
##         <dbl>       <dbl>
## 1        10.6        14.9
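
For reference, the hand calculation the exercise asks for is point forecast ± 1.96 × RMSE. The point forecasts are not printed above, so this base R sketch assumes each one can be recovered as the midpoint of R's symmetric interval, and that the outputs appear in the order ETS(A,N,N) then ETS(A,A,N). The helper name manual_interval is hypothetical.

```r
# Hypothetical helper: 95% interval from a point forecast and an RMSE,
# assuming normally distributed errors
manual_interval <- function(point, rmse) {
  c(lower = point - 1.96 * rmse, upper = point + 1.96 * rmse)
}

# R's 95% intervals and training RMSEs, copied from the output above
# (assumed order: ETS(A,N,N) first, ETS(A,A,N) second)
r_ann <- c(13.99644, 63.35056)
rmse_ann <- 12.31977
r_aan <- c(13.6203, 64.37202)
rmse_aan <- 12.38392

# Assumption: the point forecast is the midpoint of R's interval
ann <- manual_interval(mean(r_ann), rmse_ann)
aan <- manual_interval(mean(r_aan), rmse_aan)

print(rbind(ann, aan))
```

Intervals built this way are centred on the point forecast and come out slightly narrower than R's, since R's residual variance estimate adjusts for the number of estimated parameters while the RMSE does not.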

8.6) Forecast the Chinese GDP from the global_economy data set using an ETS model. Experiment with the various options in the ETS() function to see how much the forecasts change with damped trend, or with a Box-Cox transformation. Try to develop an intuition of what each is doing to the forecasts.

Simple exponential smoothing produces a flat forecast with no trend. Holt’s method shows an increasing trend, similar to the damped Box-Cox and damped log models; these methods appear to share similar characteristics. The damped Holt’s method follows the trend at first but levels off over time. The Box-Cox and log models appear to increase exponentially.
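
One way such a comparison might be set up is sketched below, assuming the fpp3 meta-package. The model names, the 20-year horizon, and the use of the Guerrero method to pick the Box-Cox parameter are all assumptions made for illustration, not details from the original.

```r
library(fpp3)

china <- global_economy |> filter(Country == "China")

# Box-Cox parameter chosen by the Guerrero method (one possible choice)
lambda <- china |>
  features(GDP, features = guerrero) |>
  pull(lambda_guerrero)

# Hypothetical model names; each varies the trend or the transformation
fit <- china |>
  model(
    ses         = ETS(GDP ~ error("A") + trend("N") + season("N")),
    holt        = ETS(GDP ~ error("A") + trend("A") + season("N")),
    damped_holt = ETS(GDP ~ error("A") + trend("Ad") + season("N")),
    bc          = ETS(box_cox(GDP, lambda) ~ error("A") + trend("A") + season("N")),
    damped_bc   = ETS(box_cox(GDP, lambda) ~ error("A") + trend("Ad") + season("N")),
    log_holt    = ETS(log(GDP) ~ error("A") + trend("A") + season("N")),
    damped_log  = ETS(log(GDP) ~ error("A") + trend("Ad") + season("N"))
  )

# Forecast 20 years ahead and overlay the point forecasts on the data
fc <- fit |> forecast(h = 20)
fc |> autoplot(china, level = NULL)
```

Plotting without prediction intervals (level = NULL) keeps the seven forecast paths readable, which makes the effect of damping and of each transformation easier to compare.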

8.7) Find an ETS model for the Gas data from aus_production and forecast the next few years. Why is multiplicative seasonality necessary here? Experiment with making the trend damped. Does it improve the forecasts?

Multiplicative seasonality is necessary because the size of the seasonal variation changes with the level of the series: as the trend increases, the seasonal amplitude increases. With a large number of quarters forecast, the additive forecasts plateau at a certain point, whereas the multiplicative forecasts continue to grow at a steady rate. Based on the training RMSE, the damped multiplicative model (4.56) would be selected over the multiplicative model (4.60), but the difference is slight, so damping improved the fit only a little.

## # A tibble: 4 × 10
##   .model             .type       ME  RMSE   MAE    MPE  MAPE  MASE RMSSE    ACF1
##   <chr>              <chr>    <dbl> <dbl> <dbl>  <dbl> <dbl> <dbl> <dbl>   <dbl>
## 1 additive           Trai…  0.00525  4.76  3.35 -4.69  10.9  0.600 0.628  0.0772
## 2 multiplicative     Trai… -0.115    4.60  3.02  0.199  4.08 0.542 0.606 -0.0131
## 3 damped_additive    Trai…  0.967    4.56  3.12  1.41   5.87 0.560 0.602  0.0393
## 4 damped_multiplica… Trai…  0.435    4.56  3.04  0.892  4.18 0.545 0.601 -0.0387
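
A sketch of how a table like the one above might be produced, assuming the fpp3 meta-package. The model names match the table, but the error types paired with each seasonal form and the five-year horizon are assumptions, so the numbers need not match exactly.

```r
library(fpp3)

fit <- aus_production |>
  model(
    additive              = ETS(Gas ~ error("A") + trend("A") + season("A")),
    multiplicative        = ETS(Gas ~ error("M") + trend("A") + season("M")),
    damped_additive       = ETS(Gas ~ error("A") + trend("Ad") + season("A")),
    damped_multiplicative = ETS(Gas ~ error("M") + trend("Ad") + season("M"))
  )

# Training-set accuracy, one row per model
acc <- accuracy(fit)
acc

# Forecast the next few years and plot against recent data
fc <- fit |> forecast(h = "5 years")
fc |> autoplot(aus_production |> filter(year(Quarter) >= 2000), level = NULL)
```

Since these are training-set measures, a lower RMSE shows a closer in-sample fit rather than a guarantee of better forecasts.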

8.8) Recall your retail time series data (from Exercise 7 in Section 2.10).

Why is multiplicative seasonality necessary for this series?

Multiplicative seasonality is necessary because the seasonal variation is not constant: the seasonal amplitude increases as the level of the series increases over time.

Apply Holt-Winters’ multiplicative method to the data. Experiment with making the trend damped.

Compare the RMSE of the one-step forecasts from the two methods. Which do you prefer?

The multiplicative method has an RMSE (0.613) slightly lower than that of the damped multiplicative method (0.619). Therefore, the undamped multiplicative method is preferred.

## # A tibble: 2 × 12
##   State       Industry .model .type      ME  RMSE   MAE    MPE  MAPE  MASE RMSSE
##   <chr>       <chr>    <chr>  <chr>   <dbl> <dbl> <dbl>  <dbl> <dbl> <dbl> <dbl>
## 1 Northern T… Clothin… multi… Trai… -0.0128 0.613 0.450 -0.469  5.15 0.513 0.529
## 2 Northern T… Clothin… dampe… Trai…  0.0495 0.619 0.452  0.303  5.18 0.516 0.534
## # ℹ 1 more variable: ACF1 <dbl>

Check that the residuals from the best method look like white noise.

## # A tibble: 1 × 5
##   State              Industry                           .model bp_stat bp_pvalue
##   <chr>              <chr>                              <chr>    <dbl>     <dbl>
## 1 Northern Territory Clothing, footwear and personal a… multi…    10.7     0.380

## # A tibble: 1 × 5
##   State              Industry                           .model lb_stat lb_pvalue
##   <chr>              <chr>                              <chr>    <dbl>     <dbl>
## 1 Northern Territory Clothing, footwear and personal a… multi…    11.0     0.359
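
A sketch of how the two residual tests above might be run, assuming the fpp3 meta-package. The series is selected by the visible part of its truncated Industry label, and lag = 10 and the ETS(M,A,M) specification are assumptions; the object names series, fit and aug are illustrative.

```r
library(fpp3)

# Select the series by State and the visible start of the Industry label
series <- aus_retail |>
  filter(State == "Northern Territory",
         startsWith(Industry, "Clothing, footwear"))

# Holt-Winters' multiplicative method as an ETS model (assumed error type M)
fit <- series |>
  model(multiplicative = ETS(Turnover ~ error("M") + trend("A") + season("M")))

# Portmanteau tests on the innovation residuals; lag = 10 is an assumption
aug <- augment(fit)
bp <- aug |> features(.innov, box_pierce, lag = 10)
lb <- aug |> features(.innov, ljung_box, lag = 10)
bp
lb
```

A p-value above 0.05, as in both tables above, means the test does not reject the hypothesis that the residuals are white noise.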

Now find the test set RMSE, while training the model to the end of 2010. Can you beat the seasonal naïve approach from Exercise 7 in Section 5.11?

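The RMSE of the multiplicative method (0.518) is substantially lower than that of the SNAIVE model (1.21), so the multiplicative method is more appropriate. Note that the .type column in the table below reads Trai…, and the SNAIVE row has MASE and RMSSE equal to 1, so these figures measure the fit on the training data to the end of 2010 rather than accuracy on the test set.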

## Joining with `by = join_by(State, Industry, `Series ID`, Month, Turnover)`

## # A tibble: 2 × 12
##   State       Industry .model .type      ME  RMSE   MAE    MPE  MAPE  MASE RMSSE
##   <chr>       <chr>    <chr>  <chr>   <dbl> <dbl> <dbl>  <dbl> <dbl> <dbl> <dbl>
## 1 Northern T… Clothin… multi… Trai… -0.0119 0.518 0.384 -0.446  5.21 0.420 0.427
## 2 Northern T… Clothin… SNAIVE Trai…  0.439  1.21  0.915  5.23  12.4  1     1    
## # ℹ 1 more variable: ACF1 <dbl>
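
A sketch of a test set comparison, assuming the fpp3 meta-package. The series is selected by the visible part of its truncated Industry label, and the ETS(M,A,M) specification is an assumption; object names are illustrative. When forecasts are scored against held-out data, the .type column of the accuracy() output reads Test.

```r
library(fpp3)

series <- aus_retail |>
  filter(State == "Northern Territory",
         startsWith(Industry, "Clothing, footwear"))

# Train on data up to the end of 2010; everything after is the test set
train <- series |> filter(year(Month) <= 2010)
test  <- anti_join(series, train,
                   by = c("State", "Industry", "Series ID", "Month", "Turnover"))

fit <- train |>
  model(
    multiplicative = ETS(Turnover ~ error("M") + trend("A") + season("M")),
    snaive         = SNAIVE(Turnover)
  )

# Forecast the test period, then score against the full series
fc  <- fit |> forecast(new_data = test)
acc <- fc |> accuracy(series)
acc |> select(.model, .type, RMSE)
```

Passing the full series to accuracy() lets it match each forecast with the observed value in the test period and scale MASE and RMSSE by the training data.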

8.9) For the same retail data, try an STL decomposition applied to the Box-Cox transformed series, followed by ETS on the seasonally adjusted data. How does that compare with your best previous forecasts on the test set?

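The multiplicative and SNAIVE forecasts have RMSEs of 0.518 and 1.214, respectively. Using the ETS model on the STL decomposition of the Box-Cox transformed data yielded an RMSE of 0.079, much lower than the previous values. However, if that RMSE is measured on the Box-Cox transformed scale, it is not directly comparable with RMSEs on the original Turnover scale; the MASE values (0.405 against 0.420 for the multiplicative method) point to a much smaller improvement.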

## # A tibble: 1 × 5
##   State              Industry                           .model bp_stat bp_pvalue
##   <chr>              <chr>                              <chr>    <dbl>     <dbl>
## 1 Northern Territory Clothing, footwear and personal a… multi…    12.8     0.234

## # A tibble: 1 × 5
##   State              Industry                           .model lb_stat lb_pvalue
##   <chr>              <chr>                              <chr>    <dbl>     <dbl>
## 1 Northern Territory Clothing, footwear and personal a… multi…    13.1     0.217

## Joining with `by = join_by(State, Industry, `Series ID`, Month, Turnover)`

## # A tibble: 1 × 12
##   State    Industry .model .type       ME   RMSE    MAE    MPE  MAPE  MASE RMSSE
##   <chr>    <chr>    <chr>  <chr>    <dbl>  <dbl>  <dbl>  <dbl> <dbl> <dbl> <dbl>
## 1 Norther… Clothin… "ETS(… Trai… -0.00401 0.0789 0.0618 -0.267  3.13 0.405 0.388
## # ℹ 1 more variable: ACF1 <dbl>