Load packages and data

library(fpp3)

Questions

Question 1

a.

victoria <- aus_livestock %>%
  filter(State == "Victoria", Animal == "Pigs")

victoria_fit <- victoria %>%
  model(ETS(Count ~ error("A") + trend("N") + season("N"))) 

victoria_fit %>%
  report()
## Series: Count 
## Model: ETS(A,N,N) 
##   Smoothing parameters:
##     alpha = 0.3221247 
## 
##   Initial states:
##      l[0]
##  100646.6
## 
##   sigma^2:  87480760
## 
##      AIC     AICc      BIC 
## 13737.10 13737.14 13750.07
victoria_fit %>%
  components() %>%
  autoplot()
## Warning: Removed 1 row(s) containing missing values (geom_path).

victoria_forecast <- victoria_fit %>%
  forecast(h = 4) 

victoria_plot <- victoria %>%
  filter(year(Month) > 2009)

victoria_forecast %>%
  autoplot(victoria_plot)

b.

victoria_mean <- victoria_forecast %>%
  select(Animal, State, Month, Count, .mean)

victoria_mean_pi <- victoria_mean %>%
  mutate(pi_lower = .mean - 1.96 * 9353.11,   # 9353.11 = sqrt(sigma^2) from the report above
         pi_upper = .mean + 1.96 * 9353.11)
  
victoria_mean_pi
## # A fable: 4 x 7 [1M]
## # Key:     Animal, State [1]
##   Animal State       Month             Count  .mean pi_lower pi_upper
##   <fct>  <fct>       <mth>            <dist>  <dbl>    <dbl>    <dbl>
## 1 Pigs   Victoria 2019 Jan N(95187, 8.7e+07) 95187.   76854.  113519.
## 2 Pigs   Victoria 2019 Feb N(95187, 9.7e+07) 95187.   76854.  113519.
## 3 Pigs   Victoria 2019 Mar N(95187, 1.1e+08) 95187.   76854.  113519.
## 4 Pigs   Victoria 2019 Apr N(95187, 1.1e+08) 95187.   76854.  113519.
victoria_forecast %>%
  hilo() %>%
  select(Animal, State, Month, Count, .mean, '95%')
## # A tsibble: 4 x 6 [1M]
## # Key:       Animal, State [1]
##   Animal State       Month             Count  .mean                  `95%`
##   <fct>  <fct>       <mth>            <dist>  <dbl>                 <hilo>
## 1 Pigs   Victoria 2019 Jan N(95187, 8.7e+07) 95187. [76854.79, 113518.3]95
## 2 Pigs   Victoria 2019 Feb N(95187, 9.7e+07) 95187. [75927.17, 114445.9]95
## 3 Pigs   Victoria 2019 Mar N(95187, 1.1e+08) 95187. [75042.22, 115330.9]95
## 4 Pigs   Victoria 2019 Apr N(95187, 1.1e+08) 95187. [74194.54, 116178.6]95

When computing the prediction interval manually, I found the same upper and lower bounds at every horizon because I applied the one-step residual standard deviation (sqrt(sigma^2) ≈ 9353.11) to all four forecasts, so the same formula produced the same interval each time. The automatically generated intervals do not behave this way: they widen with the horizon, reflecting the fact that the forecast becomes less and less certain the further into the future it is made.
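
For reference, here is a minimal sketch of the horizon-dependent interval, using the values reported above (alpha = 0.3221247, sigma^2 = 87480760). For an ETS(A,N,N) model the h-step forecast variance is sigma^2 * (1 + alpha^2 * (h - 1)), which reproduces the widening intervals from hilo().

victoria_sigma_h <- sqrt(87480760 * (1 + 0.3221247^2 * (0:3)))  # h = 1, ..., 4
victoria_mean %>%
  mutate(pi_lower = .mean - 1.96 * victoria_sigma_h,
         pi_upper = .mean + 1.96 * victoria_sigma_h)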

Question 2

a.

sweden <- global_economy %>%
  filter(Country == "Sweden") %>%
  select(Year, Exports)

sweden_train <- sweden %>%
  filter(Year < 2000)

sweden_test <- sweden %>%
  filter(Year >= 2000)

sweden %>%
  ggplot(aes(x = Year, y = Exports)) +
  geom_line()

b.

sweden_forecast_ann <- sweden %>%
  model(ETS(Exports ~ error("A") + trend("N") + season("N"))) %>%
  forecast(h = 10) 
sweden_forecast_ann %>%
  autoplot(sweden)

c.

sweden_forecast_train_ann <- sweden_train %>%
  model(ETS(Exports ~ error("A") + trend("N") + season("N"))) %>%
  forecast(h = 10)

accuracy(sweden_forecast_train_ann, sweden_test, list(rmse = RMSE, mae = MAE, mape = MAPE))
## # A tibble: 1 × 5
##   .model                                                 .type  rmse   mae  mape
##   <chr>                                                  <chr> <dbl> <dbl> <dbl>
## 1 "ETS(Exports ~ error(\"A\") + trend(\"N\") + season(\… Test   5.05  4.20  8.93

d.

sweden_forecast_aan <- sweden %>%
  model(ETS(Exports ~ error("A") + trend("A") + season("N"))) %>%
  forecast(h = 10)
sweden_forecast_aan %>%
  autoplot(sweden)

The reason for using simple exponential smoothing is that the data shows some random-walk-like behaviour. Simple exponential smoothing is good for short-term forecasts because the next value is random but similar to the last observed value. The reason for using Holt's linear method is that, in addition to this randomness, the data has a clear upward trend, which Holt's method accounts for. That makes it better for forecasts further ahead, assuming you believe the trend will continue.

e.

I believe Holt's linear method is better for this data because it accounts for the upward trend, which simple exponential smoothing fails to capture. For a long-term forecast such as the 10 years used here, the data is likely to keep following the trend with some randomness around it.
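
As a quick check of this claim, here is a sketch reusing the training/test split from part a: fit Holt's linear method to the training data and score it on the same test set as the simple exponential smoothing model above. If the trend matters, it should show a lower RMSE.

sweden_train %>%
  model(ETS(Exports ~ error("A") + trend("A") + season("N"))) %>%
  forecast(h = 10) %>%
  accuracy(sweden_test, list(rmse = RMSE, mae = MAE, mape = MAPE))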

Question 3

china <- global_economy %>%
  filter(Country == "China") %>%
  select(Year, GDP)

china %>%
  autoplot()
## Plot variable not specified, automatically selected `.vars = GDP`

china_guerro <- china %>%
  features(GDP, features = guerrero) %>%
  pull(lambda_guerrero)

china_boxcox <- china %>%
  autoplot(box_cox(GDP, china_guerro)) +
  labs(y = "China's Box-Cox Transformed GDP")
china_boxcox

china_compare <- china %>%
  model(
    'Auto' = ETS(GDP),
    'Simple' = ETS(GDP ~ error("A") + trend("N") + season("N")),
    'Holt' = ETS(GDP ~ error("A") + trend("A") + season("N")),
    'Damped Holt' = ETS(GDP ~ error("A") + trend("Ad") + season("N"))
  )

china_compare %>%
  forecast(h = 20) %>%
  autoplot(china)

I believe the damped Holt's method is the best for forecasting this data 20 years into the future. China's GDP is unlikely to keep increasing at the rate it has been. The damped method lets me forecast along the trend and account for randomness around it, while building in the slowdown in the growth rate that I expect over the 20 years.
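
As one more experiment, here is a sketch that combines the Box-Cox transformation with the damped trend, using the Guerrero lambda computed earlier; fable back-transforms the forecasts onto the original GDP scale automatically.

china %>%
  model('Box-Cox Damped' = ETS(box_cox(GDP, china_guerro) ~ error("A") + trend("Ad") + season("N"))) %>%
  forecast(h = 20) %>%
  autoplot(china)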

Question 4

gas <- aus_production %>%
  select(Quarter, Gas)

gas %>%
  autoplot()
## Plot variable not specified, automatically selected `.vars = Gas`

gas_check <- gas %>%
  model(ets = ETS(Gas),
        'Multi H-W' = ETS(Gas ~ error("A") + trend("A") + season("M")),
        'Damped Multi H-W' = ETS(Gas ~ error("A") + trend("Ad") + season("M")))

gas_check %>%
  glance()
## # A tibble: 3 × 9
##   .model             sigma2 log_lik   AIC  AICc   BIC   MSE  AMSE    MAE
##   <chr>               <dbl>   <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>  <dbl>
## 1 ets               0.00324   -831. 1681. 1682. 1711.  21.1  32.2 0.0413
## 2 Multi H-W        18.2       -899. 1816. 1817. 1847.  17.6  25.1 2.84  
## 3 Damped Multi H-W 18.5       -901. 1821. 1822. 1855.  17.8  25.9 2.81
gas_auto <- gas %>%
  model(ets = ETS(Gas))
        
gas_auto %>%
  report()
## Series: Gas 
## Model: ETS(M,A,M) 
##   Smoothing parameters:
##     alpha = 0.6528545 
##     beta  = 0.1441675 
##     gamma = 0.09784922 
## 
##   Initial states:
##      l[0]       b[0]      s[0]    s[-1]    s[-2]     s[-3]
##  5.945592 0.07062881 0.9309236 1.177883 1.074851 0.8163427
## 
##   sigma^2:  0.0032
## 
##      AIC     AICc      BIC 
## 1680.929 1681.794 1711.389

Multiplicative seasonality is necessary for this data because the size of the seasonal swings is proportional to the level of the series: the seasonal variation at the lower levels before 1980 is almost unnoticeable compared to the variation after 1980. I believe the multiplicative Holt-Winters' method is the best for forecasting this data because the trend shows no sign of easing, which is why I do not like the damped method here. I also believe a multiplicative error does not make sense for this data. Among the models I checked, the multiplicative Holt-Winters' method also gave the lowest AICc outside of the automatic ETS function.
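
To compare the three fitted models visually, here is a sketch of their forecasts (an assumed horizon of 3 years):

gas_check %>%
  forecast(h = "3 years") %>%
  autoplot(gas)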

Question 5

a.

Multiplicative seasonality is necessary for this data because the seasonal variation is proportional to the level of the series. Before 2000 the level of the data is much lower, and so is the variance, compared with after 2000. The magnitude of the variation keeps growing over time along with the level of turnover. In other words, the data is heteroskedastic, so I need multiplicative seasonality.

b.

set.seed(87654321)
myseries <- aus_retail %>%
  filter(`Series ID` == sample(aus_retail$`Series ID`, 1))
myseries %>% 
  autoplot(Turnover)

myseries_train <- myseries %>%
  filter(year(Month) < 2011)
myseries_test <- myseries %>%
  filter(year(Month) > 2010)

myseries_aam <- myseries %>%
  model(ETS(Turnover ~ error("A") + trend("A") + season("M"))) 

myseries_aam %>%
  glance()
## # A tibble: 1 × 11
##   State    Industry    .model sigma2 log_lik   AIC  AICc   BIC   MSE  AMSE   MAE
##   <chr>    <chr>       <chr>   <dbl>   <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
## 1 Victoria Footwear a… "ETS(…   49.5  -2195. 4424. 4425. 4493.  47.7  65.2  4.55
myseries_aam %>%
  forecast(h = 36) %>%
  autoplot(myseries)

myseries_aadm <- myseries %>%
  model(ETS(Turnover ~ error("A") + trend("Ad") + season("M"))) 

myseries_aadm %>%
  glance()
## # A tibble: 1 × 11
##   State    Industry    .model sigma2 log_lik   AIC  AICc   BIC   MSE  AMSE   MAE
##   <chr>    <chr>       <chr>   <dbl>   <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
## 1 Victoria Footwear a… "ETS(…   50.1  -2197. 4430. 4432. 4504.  48.2  66.7  4.52
myseries_aadm %>%
  forecast(h = 36) %>%
  autoplot(myseries)

c.

accuracy(myseries_aam)
## # A tibble: 1 × 12
##   State  Indus…¹ .model .type      ME  RMSE   MAE    MPE  MAPE  MASE RMSSE  ACF1
##   <chr>  <chr>   <chr>  <chr>   <dbl> <dbl> <dbl>  <dbl> <dbl> <dbl> <dbl> <dbl>
## 1 Victo… Footwe… "ETS(… Trai… -0.0397  6.91  4.55 -0.605  4.55 0.537 0.589 0.107
## # … with abbreviated variable name ¹​Industry
accuracy(myseries_aadm)
## # A tibble: 1 × 12
##   State    Indus…¹ .model .type    ME  RMSE   MAE   MPE  MAPE  MASE RMSSE   ACF1
##   <chr>    <chr>   <chr>  <chr> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>  <dbl>
## 1 Victoria Footwe… "ETS(… Trai… 0.605  6.94  4.52 0.373  4.41 0.534 0.593 0.0687
## # … with abbreviated variable name ¹​Industry

The RMSE of the (non-damped) multiplicative Holt-Winters' method is slightly lower, so it fits the training data a little more accurately. The two RMSEs are very similar, which is not surprising: the only difference between the methods is whether the trend is damped. I prefer the multiplicative Holt-Winters' method.
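
The same comparison can be made in one step; a sketch: fitting both specifications in a single mable makes accuracy() return one table with both rows.

myseries %>%
  model(
    'H-W'        = ETS(Turnover ~ error("A") + trend("A") + season("M")),
    'Damped H-W' = ETS(Turnover ~ error("A") + trend("Ad") + season("M"))
  ) %>%
  accuracy()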

d.

myseries_aam %>%
  gg_tsresiduals()

augment(myseries_aam) %>%
  features(.resid, ljung_box, lag = 24, dof = 0)
## # A tibble: 1 × 5
##   State    Industry                                       .model lb_stat lb_pv…¹
##   <chr>    <chr>                                          <chr>    <dbl>   <dbl>
## 1 Victoria Footwear and other personal accessory retaili… "ETS(…    75.8 2.85e-7
## # … with abbreviated variable name ¹​lb_pvalue

The residuals look quite good apart from some heteroskedasticity: the ACF plot shows little autocorrelation, the mean is close to 0, and they are roughly normally distributed. The heteroskedasticity is expected because the whole series is heteroskedastic. However, the Ljung-Box p-value is below 0.05, so I reject the null hypothesis that the residuals are white noise. This means the residuals are still distinguishable from white noise, so the multiplicative Holt-Winters' method has not captured all of the information in the data.

e.

myseries_aam_train <- myseries_train %>%
  model(ETS(Turnover ~ error("A") + trend("A") + season("M"))) %>%
  forecast(h = 36)

myseries_snaive_train <- myseries_train %>%
  model(Seasonal_naive = SNAIVE(Turnover)) %>%
  forecast(h = 36)

accuracy(myseries_aam_train, myseries_test, list(rmse = RMSE, mae = MAE, mape = MAPE))
## # A tibble: 1 × 7
##   .model                                   State Indus…¹ .type  rmse   mae  mape
##   <chr>                                    <chr> <chr>   <chr> <dbl> <dbl> <dbl>
## 1 "ETS(Turnover ~ error(\"A\") + trend(\"… Vict… Footwe… Test   19.3  15.3  8.82
## # … with abbreviated variable name ¹​Industry
accuracy(myseries_snaive_train, myseries_test, list(rmse = RMSE, mae = MAE, mape = MAPE))
## # A tibble: 1 × 7
##   .model         State    Industry                       .type  rmse   mae  mape
##   <chr>          <chr>    <chr>                          <chr> <dbl> <dbl> <dbl>
## 1 Seasonal_naive Victoria Footwear and other personal a… Test   16.3  14.1  8.24

Using the accuracy() function I computed the RMSE of both the multiplicative Holt-Winters' method and the SNAIVE method on the test set. Based on RMSE, I did not beat the SNAIVE method. The RMSEs are very similar, but SNAIVE's is lower, so the multiplicative Holt-Winters' method is not as accurate at forecasting this data as SNAIVE.

Question 6

a.

trips <- tourism %>%
  summarize(Trips = sum(Trips))

autoplot(trips)
## Plot variable not specified, automatically selected `.vars = Trips`

The level of trips taken is roughly constant through 2010: there is variance around the mean, but essentially no trend, and the period is fairly homoskedastic. After 2010 there is a steep upward trend. The magnitude of the variation stays roughly constant, so the data is homoskedastic throughout.

b.

trips_stl <- trips %>%
  model(STL(Trips ~ season(window = 4), robust = TRUE))

components(trips_stl) %>%
  autoplot() +
  labs(title = "STL Decomposition of Australian Trips")

trips %>%
  autoplot(Trips) +
  autolayer(components(trips_stl), season_adjust, color = 'red') +
  labs(title = "Seasonally Adjusted Australian Trips", y = "Total Trips", x = "Quarter")

c.

seasonadj_trips <- trips %>%
  model(decomposition_model(STL(Trips),
  ETS(season_adjust ~ error("A") + trend("Ad") + season("N")))) 

seasonadj_trips %>%
  forecast(h = 8) %>%
  autoplot(trips)

d.

trips_aan <- trips %>%
  model(ETS(Trips ~ error("A") + trend("A") + season("N"))) 

trips_aan %>%
  forecast(h = 2)
## # A fable: 2 x 4 [1Q]
## # Key:     .model [1]
##   .model                                        Quarter             Trips  .mean
##   <chr>                                           <qtr>            <dist>  <dbl>
## 1 "ETS(Trips ~ error(\"A\") + trend(\"A\") + s… 2018 Q1 N(27594, 1559325) 27594.
## 2 "ETS(Trips ~ error(\"A\") + trend(\"A\") + s… 2018 Q2 N(27920, 1652392) 27920.

e.

trips_auto <- trips %>%
  model(ETS(Trips))

trips_auto
## # A mable: 1 x 1
##   `ETS(Trips)`
##        <model>
## 1 <ETS(A,A,A)>

f.

accuracy(trips_auto)
## # A tibble: 1 × 10
##   .model     .type       ME  RMSE   MAE   MPE  MAPE  MASE RMSSE     ACF1
##   <chr>      <chr>    <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>    <dbl>
## 1 ETS(Trips) Training  105.  794.  604. 0.379  2.86 0.636 0.653 -0.00151
accuracy(trips_aan)
## # A tibble: 1 × 10
##   .model                  .type    ME  RMSE   MAE   MPE  MAPE  MASE RMSSE   ACF1
##   <chr>                   <chr> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>  <dbl>
## 1 "ETS(Trips ~ error(\"A… Trai…  123. 1217. 1002. 0.317  4.67  1.06  1.00 0.0895
accuracy(seasonadj_trips)
## # A tibble: 1 × 10
##   .model                 .type    ME  RMSE   MAE   MPE  MAPE  MASE RMSSE    ACF1
##   <chr>                  <chr> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>   <dbl>
## 1 "decomposition_model(… Trai…  103.  763.  576. 0.367  2.72 0.607 0.629 -0.0174

The decomposition model fitted to the STL seasonally adjusted data had the lowest RMSE. This surprised me: the automatic ETS() search selects the specification with the lowest AICc, so I expected it to fit the training data best, yet the STL approach came in slightly ahead. Then again, ETS() only searches within the ETS family, so a decomposition model can still do better. Holt's linear method gave the highest RMSE, which is not surprising since it ignores the seasonality in the data entirely.
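
For a compact side-by-side view, here is a sketch that binds the three training accuracy tables and keeps the headline error measures:

bind_rows(accuracy(trips_auto),
          accuracy(trips_aan),
          accuracy(seasonadj_trips)) %>%
  select(.model, RMSE, MAE, MAPE)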

g.

trips_compare <- trips %>%
  model(
    'Holt' = ETS(Trips ~ error("A") + trend("A") + season("N")),
    'Auto' = ETS(Trips))

trips_compare %>%
  forecast(h = "5 years") %>%
  autoplot(trips)

seasonadj_trips %>%
  forecast(h = "5 years") %>%
  autoplot(trips) +
  labs(title = "Forecast using STL")

I believe the forecast from the STL decomposition approach is the most reasonable. It follows the upward trend while also projecting the seasonal fluctuations around it, so the forecast is plausible at any specific quarter. Holt's linear method follows the trend well, but it projects none of the seasonal variation into the future, and that variation is substantial in this series. The automatically selected ETS(A,A,A) model does capture the seasonality, but the STL-based model had the lower training RMSE in part f, so I still prefer it.

h.

seasonadj_trips %>%
  gg_tsresiduals()
## Warning: Removed 4 row(s) containing missing values (geom_path).
## Warning: Removed 4 rows containing missing values (geom_point).
## Warning: Removed 4 rows containing non-finite values (stat_bin).

augment(seasonadj_trips) %>%
  features(.resid, ljung_box, lag = 8, dof = 0)
## # A tibble: 1 × 3
##   .model                                                         lb_stat lb_pv…¹
##   <chr>                                                            <dbl>   <dbl>
## 1 "decomposition_model(STL(Trips), ETS(season_adjust ~ error(\"…    5.05   0.752
## # … with abbreviated variable name ¹​lb_pvalue

Question 7

a.

nz <- aus_arrivals %>%
  filter(Origin == 'NZ')

autoplot(nz) +
  labs(x = "Quarter")
## Plot variable not specified, automatically selected `.vars = Arrivals`

b.

nz_train <- nz %>%
  filter(year(Quarter) < 2011)

nz_test <- nz %>%
  filter(year(Quarter) > 2010)

nz_aam <- nz_train %>%
  model(ETS(Arrivals ~ error("A") + trend("A") + season("M"))) 

nz_aam %>%
  forecast(h = "2 years") %>%
  autoplot(nz)

c.

Multiplicative seasonality is necessary for this data because the seasonal variation is proportional to the level of the series. Before 1990 the level of the data is much lower, and so is the seasonal variance, compared with after 1990. The magnitude of the variation keeps growing over time along with the level of arrivals. This heteroskedasticity makes multiplicative seasonality key to producing a solid forecast.

d.

nz_compare <- nz %>%
  model(
    'ETS' = ETS(Arrivals),
    'ETS Log' = ETS(log(Arrivals)),
    'SNAIVE' = SNAIVE(Arrivals),
    'STL' = decomposition_model(STL(log(Arrivals)),
    ETS(season_adjust ~ error("A") + trend("Ad") + season("A")))
  )

nz_compare %>%
  forecast(h = "2 years") %>%
  autoplot(nz)

e.

accuracy(nz_compare)
## # A tibble: 4 × 11
##   Origin .model  .type       ME   RMSE    MAE    MPE  MAPE  MASE RMSSE    ACF1
##   <chr>  <chr>   <chr>    <dbl>  <dbl>  <dbl>  <dbl> <dbl> <dbl> <dbl>   <dbl>
## 1 NZ     ETS     Training -632. 11528.  9165. -1.00   6.51 0.625 0.607 -0.0677
## 2 NZ     ETS Log Training 2087. 11634.  9103.  1.14   6.37 0.620 0.613 -0.100 
## 3 NZ     SNAIVE  Training 7437. 18983. 14675.  3.82  10.2  1     1      0.605 
## 4 NZ     STL     Training  905. 10022.  7853.  0.675  5.29 0.535 0.528 -0.0581
nz_stl <- nz %>%
  model('STL' = decomposition_model(STL(log(Arrivals)),
    ETS(season_adjust ~ error("A") + trend("Ad") + season("A")))
  )
augment(nz_stl) %>%
  features(.resid, ljung_box, lag = 8, dof = 0)
## # A tibble: 1 × 4
##   Origin .model lb_stat lb_pvalue
##   <chr>  <chr>    <dbl>     <dbl>
## 1 NZ     STL       15.7    0.0465
nz_stl %>%
  gg_tsresiduals()
## Warning: Removed 4 row(s) containing missing values (geom_path).
## Warning: Removed 4 rows containing missing values (geom_point).
## Warning: Removed 4 rows containing non-finite values (stat_bin).

I believe the method using STL decomposition is the best for this data. It produces the lowest training RMSE of the methods I tried, including the automatic ETS() approach, which is somewhat surprising given that ETS() searches for the specification with the lowest AICc. The residual diagnostics are more mixed, though. The Ljung-Box p-value (0.0465) is just below 0.05, so strictly I reject the null hypothesis that the residuals are white noise, meaning a little autocorrelation remains. Plotting the residuals looks reasonable otherwise: there is some heteroskedasticity, but they are roughly normally distributed, show little correlation, and have a mean of 0.

f.

nz_compare_cv <- nz %>%
  slice(1:(n() - 3)) %>%
  stretch_tsibble(.init = 36, .step = 3)
  
nz_compare_cv %>% 
  model(
    'ETS model' = ETS(Arrivals),
    'Additive to a Log Transformed' = ETS(log(Arrivals) ~ error("A") + trend("A") + season("A")),
    'Seasonal Naïve Method' = SNAIVE(Arrivals),
    'STL' = decomposition_model(STL(log(Arrivals)),
    ETS(season_adjust ~ error("A") + trend("Ad") + season("A")))
  ) %>%
  forecast(h = 3) %>%
  accuracy(nz)
## # A tibble: 4 × 11
##   .model          Origin .type    ME   RMSE    MAE   MPE  MAPE  MASE RMSSE  ACF1
##   <chr>           <chr>  <chr> <dbl>  <dbl>  <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
## 1 Additive to a … NZ     Test  -375. 14347. 11021. 0.200  5.83 0.741 0.746 0.309
## 2 ETS model       NZ     Test  4627. 15327. 11799. 2.23   6.45 0.793 0.797 0.283
## 3 Seasonal Naïve… NZ     Test  8244. 18768. 14422. 3.83   7.76 0.970 0.976 0.566
## 4 STL             NZ     Test  2342. 15139. 11954. 1.09   6.41 0.804 0.787 0.270

Under time series cross-validation, the additive ETS on the log-transformed data gives the lowest RMSE, with STL and the automatic ETS close behind and the seasonal naïve method clearly worst. This weakens my conclusion from part e, which was based only on training accuracy.

Question 8

a.

cement <- aus_production %>%
  select(Quarter, Cement)

cement_stretch1 <- cement %>%
  slice(1:(n()-4)) %>%
  stretch_tsibble(.init = 20, .step = 1)

cement_compare <- cement_stretch1 %>%
  model(ETS(Cement), 
        SNAIVE(Cement)
  ) %>%
  forecast(h = "1 year")

cement_compare %>%
  hilo()
## # A tsibble: 1,560 x 7 [1Q]
## # Key:       .id, .model [390]
##      .id .model         Quarter       Cement .mean                  `80%`
##    <int> <chr>            <qtr>       <dist> <dbl>                 <hilo>
##  1     1 ETS(Cement)    1961 Q1  N(661, 235)  661. [641.6969, 681.0183]80
##  2     1 ETS(Cement)    1961 Q2  N(752, 305)  752. [729.7443, 774.5230]80
##  3     1 ETS(Cement)    1961 Q3  N(779, 328)  779. [755.7503, 802.1874]80
##  4     1 ETS(Cement)    1961 Q4  N(764, 317)  764. [741.1479, 786.7477]80
##  5     1 SNAIVE(Cement) 1961 Q1 N(621, 2068)  621  [562.7176, 679.2824]80
##  6     1 SNAIVE(Cement) 1961 Q2 N(698, 2068)  698  [639.7176, 756.2824]80
##  7     1 SNAIVE(Cement) 1961 Q3 N(753, 2068)  753  [694.7176, 811.2824]80
##  8     1 SNAIVE(Cement) 1961 Q4 N(728, 2068)  728  [669.7176, 786.2824]80
##  9     2 ETS(Cement)    1961 Q2  N(752, 220)  752. [732.8981, 770.8905]80
## 10     2 ETS(Cement)    1961 Q3  N(775, 220)  775. [755.8170, 793.8094]80
## # … with 1,550 more rows, and 1 more variable: `95%` <hilo>
## # ℹ Use `print(n = ...)` to see more rows, and `colnames()` to see all variable names
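
Here is a sketch of the matching error comparison for the one-step cross-validation above (the same pattern as in part b; keeping the .model column avoids re-attaching the model names by hand):

accuracy(cement_compare, cement) %>%
  mutate(MSE = RMSE^2) %>%
  select(.model, MSE, RMSE)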

b.

cement_stretch2 <- cement %>%
  slice(1:(n()-4)) %>%
  stretch_tsibble(.init = 20, .step = 4)

cement_compare2 <- cement_stretch2 %>%
  model(ETS(Cement), 
        SNAIVE(Cement)) %>%
  forecast(h = "1 year")

cement_mse <- accuracy(cement_compare2, cement) %>%
  mutate(MSE = (RMSE)^2) %>%
  select(MSE, RMSE)
cement_mse
## # A tibble: 2 × 2
##      MSE  RMSE
##    <dbl> <dbl>
## 1 11928.  109.
## 2 19130.  138.
cement_final <- cement_mse %>%
  add_column(Model = c('ETS', 'SNAIVE')) %>%
  select(Model, MSE, RMSE)
cement_final
## # A tibble: 2 × 3
##   Model     MSE  RMSE
##   <chr>   <dbl> <dbl>
## 1 ETS    11928.  109.
## 2 SNAIVE 19130.  138.

The ETS method had the lower MSE and RMSE of the two. This is not too surprising: the automatic ETS() search selects the ETS specification with the lowest AICc on each training window, and here that translated into more accurate cross-validated forecasts than the seasonal naïve benchmark.