DATA 624 Homework 5

library(fpp3)

## Warning: package 'fpp3' was built under R version 3.6.3

## -- Attaching packages ------------------------------------------------------ fpp3 0.4.0 --

## v tibble      3.1.1     v tsibble     1.1.1
## v dplyr       1.0.6     v tsibbledata 0.4.0
## v tidyr       1.1.3     v feasts      0.2.2
## v lubridate   1.7.4     v fable       0.3.0
## v ggplot2     3.3.5

## Warning: package 'tibble' was built under R version 3.6.3

## Warning: package 'dplyr' was built under R version 3.6.3

## Warning: package 'tidyr' was built under R version 3.6.3

## Warning: package 'fable' was built under R version 3.6.3

## -- Conflicts ----------------------------------------------------------- fpp3_conflicts --
## x lubridate::date()       masks base::date()
## x dplyr::filter()         masks stats::filter()
## x tsibble::intersect()    masks base::intersect()
## x tsibble::interval()     masks lubridate::interval()
## x dplyr::lag()            masks stats::lag()
## x tsibble::new_interval() masks lubridate::new_interval()
## x tsibble::setdiff()      masks base::setdiff()
## x tsibble::union()        masks base::union()

library(tidyverse)

## -- Attaching packages ------------------------------------------------- tidyverse 1.2.1 --

## v readr   1.3.1     v stringr 1.4.0
## v purrr   0.3.2     v forcats 0.4.0

## -- Conflicts ---------------------------------------------------- tidyverse_conflicts() --
## x lubridate::as.difftime() masks base::as.difftime()
## x lubridate::date()        masks base::date()
## x dplyr::filter()          masks stats::filter()
## x tsibble::intersect()     masks lubridate::intersect(), base::intersect()
## x tsibble::interval()      masks lubridate::interval()
## x dplyr::lag()             masks stats::lag()
## x tsibble::new_interval()  masks lubridate::new_interval()
## x tsibble::setdiff()       masks lubridate::setdiff(), base::setdiff()
## x tsibble::union()         masks lubridate::union(), base::union()

Exercise 8.8.1

Consider the number of pigs slaughtered in Victoria, available in the aus_livestock dataset.

a. Use the ETS() function to estimate the equivalent model for simple exponential smoothing. Find the optimal values of
α and ℓ0, and generate forecasts for the next four months.

Even though I’ve seen these data in previous exercises, I decided to take a look at the variables again in the next few lines of code in order to filter the data with the required variables for this exercise:

head(aus_livestock)

## # A tsibble: 6 x 4 [1M]
## # Key:       Animal, State [1]
##      Month Animal                     State                        Count
##      <mth> <fct>                      <fct>                        <dbl>
## 1 1976 Jul Bulls, bullocks and steers Australian Capital Territory  2300
## 2 1976 Aug Bulls, bullocks and steers Australian Capital Territory  2100
## 3 1976 Sep Bulls, bullocks and steers Australian Capital Territory  2100
## 4 1976 Oct Bulls, bullocks and steers Australian Capital Territory  1900
## 5 1976 Nov Bulls, bullocks and steers Australian Capital Territory  2100
## 6 1976 Dec Bulls, bullocks and steers Australian Capital Territory  1800

aus_livestock %>%
  distinct(Animal)

## # A tibble: 7 x 1
##   Animal                    
##   <fct>                     
## 1 Bulls, bullocks and steers
## 2 Calves                    
## 3 Cattle (excl. calves)     
## 4 Cows and heifers          
## 5 Lambs                     
## 6 Pigs                      
## 7 Sheep

aus_livestock %>%
  distinct(State)

## # A tibble: 8 x 1
##   State                       
##   <fct>                       
## 1 Australian Capital Territory
## 2 New South Wales             
## 3 Northern Territory          
## 4 Queensland                  
## 5 South Australia             
## 6 Tasmania                    
## 7 Victoria                    
## 8 Western Australia

Below is a plot of the number of pigs slaughtered in Victoria before we apply any modeling techniques:

aus_livestock %>%
  filter(State=="Victoria" & Animal=="Pigs") %>%
  autoplot(Count)+
  labs(y="Count", title="Number of Pigs in Victoria")

We assign the filtered data to variable “pigs”.

#pigs data
pigs<-aus_livestock %>%
  filter(State=="Victoria" & Animal=="Pigs")

I borrowed some of the code from the textbook to build the model using the ETS() function. The report below gives the optimal values α = 0.322 and ℓ0 = 100646.6. The value of alpha is fairly low, which means the weights decay slowly and observations from further in the past still contribute noticeably to each forecast.

#model
fit <- pigs %>%
  model(ETS(Count ~ error("A") + trend("N") + season("N")))
fc <- fit %>%
  forecast(h = 4)

report(fit)

## Series: Count 
## Model: ETS(A,N,N) 
##   Smoothing parameters:
##     alpha = 0.3221247 
## 
##   Initial states:
##         l
##  100646.6
## 
##   sigma^2:  87480760
## 
##      AIC     AICc      BIC 
## 13737.10 13737.14 13750.07
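To make the role of alpha more concrete, here is a minimal base R sketch of the weights that simple exponential smoothing places on past observations, alpha * (1 - alpha)^j for the observation j periods before the most recent one. The helper name ses_weights and the choice of 12 lags are only for illustration; alpha is the value printed in the report above.

```r
# Weight that SES places on the observation j periods before the latest one.
# ses_weights() is a hypothetical helper written for this illustration.
ses_weights <- function(alpha, n_lags) {
  j <- 0:(n_lags - 1)
  alpha * (1 - alpha)^j
}

alpha <- 0.3221247             # smoothing parameter from report(fit)
w <- ses_weights(alpha, 12)    # weights on the latest 12 months

round(w, 3)                    # weights shrink geometrically with age
sum(w)                         # share of total weight on the latest 12 months
```

With this alpha the most recent observation gets about a third of the weight, and roughly 99% of the total weight falls on the latest 12 months.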

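As we can observe below, the fitted values from simple exponential smoothing are one-step-ahead forecasts, so they track the original series with a slight lag and smooth out part of the month-to-month variation. The four forecasts themselves are flat at the last estimated level.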

fc %>%
  autoplot(pigs) +
  geom_line(aes(y = .fitted), col="#D55E00",
            data = augment(fit)) +
  labs(y="Count", title="Number of Pigs in Victoria") +
  guides(colour = "none")

b. Compute a 95% prediction interval for the first forecast using \(\hat{y}\pm1.96s\) where s is the standard deviation of the residuals. Compare your interval with the interval produced by R.

We can observe below that the 95% prediction interval generated by R for the first forecast is [76854.79, 113518.3].

#prediction interval produced by R
pre_int = unpack_hilo(hilo(fc, 95) , "95%" )
pre_int

## # A tsibble: 4 x 8 [1M]
## # Key:       Animal, State, .model [1]
##   Animal State    .model    Month             Count  .mean `95%_lower` `95%_upper`
##   <fct>  <fct>    <chr>     <mth>            <dist>  <dbl>       <dbl>       <dbl>
## 1 Pigs   Victoria "ETS(~ 2019 Jan N(95187, 8.7e+07) 95187.      76855.     113518.
## 2 Pigs   Victoria "ETS(~ 2019 Feb N(95187, 9.7e+07) 95187.      75927.     114446.
## 3 Pigs   Victoria "ETS(~ 2019 Mar N(95187, 1.1e+08) 95187.      75042.     115331.
## 4 Pigs   Victoria "ETS(~ 2019 Apr N(95187, 1.1e+08) 95187.      74195.     116179.

Now we will use the mean of the first forecast and the variance (sigma^2) from our previous report to calculate the prediction interval by hand.

#prediction interval by hand
m <- 95186.56

s <- sqrt(87480760)

low_y <- m - 1.96 * s
up_y <- m + 1.96 * s

paste(low_y, up_y)

## [1] "76854.4546212935 113518.665378707"

We can see that our results by hand and those of R differ by less than 0.4 at each end, thus they are practically identical. The small gap comes from rounding the normal quantile to 1.96.
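The multiplier 1.96 is a rounded normal quantile. As a quick check, the base R sketch below recomputes the interval with qnorm(0.975) instead; the helper name pi_normal is only for illustration, and the mean and variance are the values used in the hand calculation above.

```r
# Normal-theory prediction interval using the exact quantile.
# pi_normal() is a hypothetical helper written for this illustration.
pi_normal <- function(mean, sd, level = 0.95) {
  z <- qnorm(1 - (1 - level) / 2)   # about 1.959964 for a 95% interval
  c(lower = mean - z * sd, upper = mean + z * sd)
}

exact_int <- pi_normal(mean = 95186.56, sd = sqrt(87480760))
round(exact_int, 2)
```

With the exact quantile, the limits agree with the 76854.79 and 113518.3 reported by R.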

Exercise 8.8.5

Data set global_economy contains the annual Exports from many countries. Select one country to analyse.

a. Plot the Exports series and discuss the main features of the data.

I have picked Chile for this time series. We can see an upward trend with 3 big dips throughout the years. We can also observe that this trend starts to go down around the year 2008, but it is unclear whether this is another dip as we don’t have data past 2017. Thus we can’t see if it has gone back up.

#Chile exports
chile_exp <- global_economy %>%
  filter(Country == "Chile") %>%
  select(Year, Exports)
autoplot(chile_exp) +
  labs(y="Exports", title="Chile Exports")

## Plot variable not specified, automatically selected `.vars = Exports`

b. Use an ETS(A,N,N) model to forecast the series, and plot the forecasts.

#model
fit2 <- (chile_exp) %>%
  model(ETS(Exports ~ error("A") + trend("N") + season("N")))
fc2 <- fit2 %>%
  forecast(h = 4)

report(fit2)

## Series: Exports 
## Model: ETS(A,N,N) 
##   Smoothing parameters:
##     alpha = 0.9998997 
## 
##   Initial states:
##         l
##  13.06111
## 
##   sigma^2:  6.7775
## 
##      AIC     AICc      BIC 
## 350.4596 350.9041 356.6410

fc2 %>%
  autoplot(chile_exp) +
  geom_line(aes(y = .fitted), col="#D55E00",
            data = augment(fit2))+
  labs(y="Exports", title="Chile Exports") +
  guides(colour = "none")

c. Compute the RMSE values for the training data.

We can observe from our accuracy() function below that the RMSE is 2.56. Considering that exports range from 9.55 to around 45, this indicates that the model predicts the training data reasonably accurately.

#metrics
accuracy(fit2)

## # A tibble: 1 x 11
##   Country .model           .type    ME  RMSE   MAE   MPE  MAPE  MASE RMSSE  ACF1
##   <fct>   <chr>            <chr> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
## 1 Chile   "ETS(Exports ~ ~ Trai~ 0.270  2.56  1.99 0.598  8.80 0.983 0.991 0.272

d. Compare the results to those from an ETS(A,A,N) model. (Remember that the trended model is using one more parameter than the simpler model.) Discuss the merits of the two forecasting methods for this data set.

Using the accuracy() function below, we can see that the RMSE is practically identical to the one in the first model. In fact, most of the metrics have stayed about the same, except for ME, which has decreased in this new model as well as MPE, which seems to have inverted from positive to negative indicating a tendency to over-forecast.

#model
fit3 <- (chile_exp) %>%
  model(ETS(Exports ~ error("A") + trend("A") + season("N")))
fc3 <- fit3 %>%
  forecast(h = 4)

accuracy(fit3)

## # A tibble: 1 x 11
##   Country .model        .type      ME  RMSE   MAE    MPE  MAPE  MASE RMSSE  ACF1
##   <fct>   <chr>         <chr>   <dbl> <dbl> <dbl>  <dbl> <dbl> <dbl> <dbl> <dbl>
## 1 Chile   "ETS(Exports~ Trai~ 0.00450  2.55  2.00 -0.550  9.06 0.989 0.988 0.263

e. Compare the forecasts from both methods. Which do you think is best?

The new model produces forecasts with an upward trend, whereas the first model produces a flat forecast at the last estimated level. Based on RMSE I would say that the second model is slightly better, although the gain is very small (2.55 versus 2.56) and comes at the cost of the additional trend terms. The trend term may help capture the long-run upward movement in the series.

fc3 %>%
  autoplot(chile_exp) +
  geom_line(aes(y = .fitted), col="#D55E00",
            data = augment(fit3))+
  labs(y="Exports", title="Chile Exports") +
  guides(colour = "none")

f. Calculate a 95% prediction interval for the first forecast for each model, using the RMSE values and assuming normal errors. Compare your intervals with those produced using R.

The 95% prediction intervals produced by R for the first forecast for each of the two models are as follows:

#first and second model intervals produced by R
pre_int2 = unpack_hilo(hilo(fc2, 95) , "95%" )
pre_int2

## # A tsibble: 4 x 7 [1Y]
## # Key:       Country, .model [1]
##   Country .model                   Year    Exports .mean `95%_lower` `95%_upper`
##   <fct>   <chr>                   <dbl>     <dist> <dbl>       <dbl>       <dbl>
## 1 Chile   "ETS(Exports ~ error(\~  2018 N(29, 6.8)  28.7        23.6        33.8
## 2 Chile   "ETS(Exports ~ error(\~  2019  N(29, 14)  28.7        21.5        35.9
## 3 Chile   "ETS(Exports ~ error(\~  2020  N(29, 20)  28.7        19.9        37.5
## 4 Chile   "ETS(Exports ~ error(\~  2021  N(29, 27)  28.7        18.5        38.9

pre_int3 = unpack_hilo(hilo(fc3, 95) , "95%" )
pre_int3

## # A tsibble: 4 x 7 [1Y]
## # Key:       Country, .model [1]
##   Country .model                    Year   Exports .mean `95%_lower` `95%_upper`
##   <fct>   <chr>                    <dbl>    <dist> <dbl>       <dbl>       <dbl>
## 1 Chile   "ETS(Exports ~ error(\"~  2018  N(29, 7)  29.0        23.8        34.2
## 2 Chile   "ETS(Exports ~ error(\"~  2019 N(29, 14)  29.3        22.0        36.6
## 3 Chile   "ETS(Exports ~ error(\"~  2020 N(30, 21)  29.6        20.6        38.5
## 4 Chile   "ETS(Exports ~ error(\"~  2021 N(30, 28)  29.9        19.5        40.2

The 95% prediction intervals computed manually for the first forecast for each of the two models using RMSE values are as follows:

#first model interval
m2 <- 28.7

rmse2 <- 2.56

low_y2 <- m2 - 1.96 * rmse2
up_y2 <- m2 + 1.96 * rmse2

paste(low_y2, up_y2)

## [1] "23.6824 33.7176"

#second model interval
m3 <- 28.99

rmse3 <- 2.55

low_y3 <- m3 - 1.96 * rmse3
up_y3 <- m3 + 1.96 * rmse3

paste(low_y3, up_y3)

## [1] "23.992 33.988"

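The prediction intervals produced by R and those produced by hand are very close to each other: [23.68, 33.72] by hand versus [23.6, 33.8] from R for the first model, and [23.99, 33.99] versus [23.8, 34.2] for the second. R’s intervals are slightly wider because its variance estimate (sigma^2) is larger than the squared RMSE.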
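One plausible source of the remaining gap is a degrees-of-freedom adjustment: the RMSE divides the sum of squared residuals by n, while sigma^2 appears to divide by n minus the number of estimated parameters. The base R sketch below tests that idea under stated assumptions: n = 58 annual observations, and parameter counts of 2 for ETS(A,N,N) and 4 for ETS(A,A,N) (smoothing parameters plus initial states). The helper name adjusted_interval is hypothetical.

```r
# Rescale an RMSE (divisor n) to a standard deviation with divisor n - k,
# then form a normal-theory 95% interval.
# adjusted_interval() is a hypothetical helper; n and k are assumptions.
adjusted_interval <- function(mean, rmse, n, k) {
  s <- rmse * sqrt(n / (n - k))
  c(lower = mean - 1.96 * s, upper = mean + 1.96 * s)
}

int_ann <- adjusted_interval(mean = 28.7,  rmse = 2.56, n = 58, k = 2)  # ETS(A,N,N)
int_aan <- adjusted_interval(mean = 28.99, rmse = 2.55, n = 58, k = 4)  # ETS(A,A,N)

round(int_ann, 1)
round(int_aan, 1)
```

Under these assumptions the rounded limits line up with the 23.6 to 33.8 and 23.8 to 34.2 intervals that R reports.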

Exercise 8.8.6

Forecast the Chinese GDP from the global_economy data set using an ETS model. Experiment with the various options in the ETS() function to see how much the forecasts change with damped trend, or with a Box-Cox transformation. Try to develop an intuition of what each is doing to the forecasts.

head(global_economy)

## # A tsibble: 6 x 9 [1Y]
## # Key:       Country [1]
##   Country     Code   Year         GDP Growth   CPI Imports Exports Population
##   <fct>       <fct> <dbl>       <dbl>  <dbl> <dbl>   <dbl>   <dbl>      <dbl>
## 1 Afghanistan AFG    1960  537777811.     NA    NA    7.02    4.13    8996351
## 2 Afghanistan AFG    1961  548888896.     NA    NA    8.10    4.45    9166764
## 3 Afghanistan AFG    1962  546666678.     NA    NA    9.35    4.88    9345868
## 4 Afghanistan AFG    1963  751111191.     NA    NA   16.9     9.17    9533954
## 5 Afghanistan AFG    1964  800000044.     NA    NA   18.1     8.89    9731361
## 6 Afghanistan AFG    1965 1006666638.     NA    NA   21.4    11.3     9938414

#chinese gdp
china_gdp <- global_economy %>%
  filter(Country == "China") %>%
  select(Year, GDP)
autoplot(china_gdp) +
  labs(y="GDP", title="Chinese GDP")

## Plot variable not specified, automatically selected `.vars = GDP`

Let’s take a look at the components of these data:

#STL decomposition
dcmp <- china_gdp %>%
  model(stl = STL(GDP))

components(dcmp) %>% autoplot()

We can observe roughly exponential growth, with the trend turning sharply upward around the year 2005. We can also observe minimal variability toward the latest years, and there is no seasonal component, as expected for annual data.

Now we will take a look at some ETS() models:

#lambda for boxcox
lambda <- china_gdp %>%
  features(GDP, features = guerrero) %>%
  pull(lambda_guerrero)

# models
china_gdp %>%
  model(
    "Holt's Method" = ETS(GDP ~ error('A') + trend('A') + season('N')),
    "Damped Holt's Method" = ETS(GDP ~ error('A') + trend('Ad') + season('N')),
    "Box-Cox" = ETS(box_cox(GDP, lambda) ~ error('A') + trend('Ad') + season('N'))
  ) %>%
  forecast(h = 15) %>%
  autoplot(china_gdp, level = NULL) +
  labs(title = "China GDP",
       y = "GDP") +
  guides(colour = guide_legend(title = "Forecasts"))

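We can observe above that Holt’s method continues the most recent slope indefinitely, so it may over-forecast at longer horizons. The damped method, on the other hand, gradually flattens the upward trend toward a horizontal line. Interestingly, the Box-Cox model with a damped trend produces forecasts above both of the previous methods. This is probably not because the data were squared: for a series growing this quickly, the lambda chosen by the guerrero feature is likely close to zero, which makes the transformation close to a logarithm. A roughly linear trend on the transformed scale then becomes exponential growth once the forecasts are back-transformed to the original scale.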
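To build some intuition for these three shapes, here is a minimal base R sketch with made-up numbers (they are not estimates from the China GDP models). It uses the additive-trend forecast function, level + (phi + phi^2 + ... + phi^h) * slope, where phi = 1 gives Holt’s linear method, and a simple inverse Box-Cox with lambda = 0 standing in for a near-log transformation. The helper names trend_forecast and inv_box_cox_simple are hypothetical.

```r
# h-step forecasts from an additive trend, optionally damped by phi.
trend_forecast <- function(level, slope, h, phi = 1) {
  level + cumsum(phi^(1:h)) * slope
}

# Simplified inverse Box-Cox (lambda = 0 is the log transform).
inv_box_cox_simple <- function(w, lambda) {
  if (lambda == 0) exp(w) else (lambda * w + 1)^(1 / lambda)
}

h <- 15
holt   <- trend_forecast(level = 100, slope = 5, h = h)              # straight line
damped <- trend_forecast(level = 100, slope = 5, h = h, phi = 0.8)   # levels off

# Damped trend on the log scale, then back-transformed to the original scale.
log_scale <- trend_forecast(level = log(100), slope = 0.1, h = h, phi = 0.98)
back      <- inv_box_cox_simple(log_scale, lambda = 0)

round(cbind(h = 1:h, holt, damped, back), 1)
```

In this toy setup the Holt forecasts rise by a constant amount each step, the damped forecasts approach a ceiling, and the back-transformed forecasts rise by growing amounts even though the trend is damped on the transformed scale.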

Exercise 8.8.7

Find an ETS model for the Gas data from aus_production and forecast the next few years. Why is multiplicative seasonality necessary here? Experiment with making the trend damped. Does it improve the forecasts?

head(aus_production)

## # A tsibble: 6 x 7 [1Q]
##   Quarter  Beer Tobacco Bricks Cement Electricity   Gas
##     <qtr> <dbl>   <dbl>  <dbl>  <dbl>       <dbl> <dbl>
## 1 1956 Q1   284    5225    189    465        3923     5
## 2 1956 Q2   213    5178    204    532        4436     6
## 3 1956 Q3   227    5297    208    561        4806     7
## 4 1956 Q4   308    5681    197    570        4418     6
## 5 1957 Q1   262    5577    187    529        4339     5
## 6 1957 Q2   228    5651    214    604        4811     7

aus_production %>%
  autoplot(Gas) +
  labs(title = "Australian Gas Production")

Below we can observe an increasing trend and a seasonal pattern whose variation grows as the level of the series rises.

#STL decomposition
dcmp2 <- aus_production %>%
  model(stl = STL(Gas))

components(dcmp2) %>% autoplot()

# models
fit4 <- aus_production %>%
  model(
    "Additive" = ETS(Gas ~ error('A') + trend('A') + season('A')),
    "Multiplicative" = ETS(Gas ~ error('M') + trend('A') + season('M')),
    "Damped Holt's Method" = ETS(Gas ~ error('A') + trend('Ad') + season('M'))
  )

fc4 <- fit4 %>%
  forecast(h = 15)

fc4 %>% autoplot(aus_production, level = NULL) +
  labs(title = "Australian Gas Production") +
  guides(colour = guide_legend(title = "Forecasts"))

accuracy(fit4)

## # A tibble: 3 x 10
##   .model           .type         ME  RMSE   MAE    MPE  MAPE  MASE RMSSE    ACF1
##   <chr>            <chr>      <dbl> <dbl> <dbl>  <dbl> <dbl> <dbl> <dbl>   <dbl>
## 1 Additive         Traini~  0.00525  4.76  3.35 -4.69  10.9  0.600 0.628  0.0772
## 2 Multiplicative   Traini~ -0.115    4.60  3.02  0.199  4.08 0.542 0.606 -0.0131
## 3 Damped Holt's M~ Traini~  0.548    4.22  2.81  1.32   4.11 0.505 0.556  0.0265

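We can see above from the graph and accuracy metrics that the Multiplicative method is slightly better than the Additive method (RMSE 4.60 versus 4.76, MAPE 4.08 versus 10.9). Multiplicative seasonality is needed here because the seasonal variations are proportional to the level of the series. Additionally, the damped model has the lowest RMSE (4.22), so damping does seem to slightly improve the fit, although that model also uses additive errors, so the comparison is not purely about damping.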
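As a toy illustration of what "proportional to the level" means, the base R sketch below builds a synthetic quarterly series (not the Gas data) by multiplying a rising level by fixed seasonal factors, and then measures the within-year swing. All numbers are invented for the example.

```r
# Synthetic quarterly series: rising level times fixed seasonal factors.
level   <- seq(10, 200, length.out = 40)             # 10 years of quarters
factors <- rep(c(0.8, 1.1, 1.3, 0.8), times = 10)    # factors average to 1
y       <- level * factors
year    <- rep(1:10, each = 4)

# Within-year swing (max minus min) and average level for each year.
swing      <- tapply(y, year, function(v) diff(range(v)))
mean_level <- tapply(y, year, mean)

round(cbind(mean_level, swing), 1)
```

The swing widens every year as the level rises, which is the pattern an additive seasonal component (a fixed swing) cannot reproduce.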

Exercise 8.8.8

Recall your retail time series data (from Exercise 8 in Section 2.10).

set.seed(123)
myseries <- aus_retail %>%
  filter(`Series ID` == sample(aus_retail$`Series ID`,1))

a. Why is multiplicative seasonality necessary for this series?

As with the gas production data, these data show an increasing trend and seasonal variation that grows over time. As discussed earlier, multiplicative seasonality is preferred when the seasonal variations are proportional to the level of the series.

#STL decomposition
dcmp3 <- myseries %>%
  model(stl = STL(Turnover))

components(dcmp3) %>% autoplot()

We can also compare the Multiplicative and Additive methods below and conclude that the Multiplicative method is better (RMSE 24.3 versus 26.4).

fit5 <- myseries %>%
  model(
    "Additive" = ETS(Turnover ~ error('A') + trend('A') + season('A')),
    "Multiplicative" = ETS(Turnover ~ error('M') + trend('A') + season('M'))
  )

fc5 <- fit5 %>%
  forecast(h = 10)

fc5 %>% autoplot(myseries, level = NULL) +
  labs(title = "Household Goods Turnover") +
  guides(colour = guide_legend(title = "Forecasts"))

accuracy(fit5)

## # A tibble: 2 x 12
##   State Industry .model .type      ME  RMSE   MAE    MPE  MAPE  MASE RMSSE  ACF1
##   <chr> <chr>    <chr>  <chr>   <dbl> <dbl> <dbl>  <dbl> <dbl> <dbl> <dbl> <dbl>
## 1 Vict~ Househo~ Addit~ Trai~ -0.0163  26.4  20.0 -0.336  4.04 0.517 0.529 0.244
## 2 Vict~ Househo~ Multi~ Trai~  2.13    24.3  18.3  0.125  3.40 0.473 0.488 0.316

b. Apply Holt-Winters’ multiplicative method to the data. Experiment with making the trend damped.

fit6 <- myseries %>%
  model(
    "Holt-Winter" = ETS(Turnover ~ error('M') + trend('Ad') + season('M')),
    "Damped Holt's Method" = ETS(Turnover ~ error('A') + trend('Ad') + season('M'))
  )

fc6 <- fit6 %>%
  forecast(h = 15)

fc6 %>% autoplot(myseries, level = NULL) +
  labs(title = "Household Goods Turnover") +
  guides(colour = guide_legend(title = "Forecasts"))

c. Compare the RMSE of the one-step forecasts from the two methods. Which do you prefer?

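Both models here use a damped trend (trend('Ad')); they differ only in the error term, which is multiplicative for the model labelled “Holt-Winter” and additive for “Damped Holt's Method”. The additive-error version has a marginally lower RMSE (23.6 versus 23.7), so I prefer it, although the difference is small. Both improve modestly on the undamped Multiplicative model from part a (RMSE 24.3).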

accuracy(fit6)

## # A tibble: 2 x 12
##   State    Industry .model .type    ME  RMSE   MAE   MPE  MAPE  MASE RMSSE   ACF1
##   <chr>    <chr>    <chr>  <chr> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>  <dbl>
## 1 Victoria Househo~ Holt-~ Trai~  2.37  23.7  17.1 0.331  3.12 0.441 0.475 0.0862
## 2 Victoria Househo~ Dampe~ Trai~  2.35  23.6  16.9 0.258  3.16 0.436 0.472 0.0250

d. Check that the residuals from the best method look like white noise.

The residuals on the time plot show a slight increase in variability from the year 2000 onward. We can also observe some remaining autocorrelation in the ACF of the residuals, so they are not quite white noise, but the histogram seems to be close to normal.

best_fit <- myseries %>%
  model(
    "Damped Holt's Method" = ETS(Turnover ~ error('A') + trend('Ad') + season('M'))
  )

best_fc <- best_fit %>%
  forecast(h = 15)

best_fit %>% gg_tsresiduals()
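A formal complement to the visual check is a Ljung-Box test on the innovation residuals. The sketch below rebuilds the series and the model as above so that it runs on its own; the choice of lag = 24 (two seasonal periods) is a common rule of thumb and an assumption here, not something specified in the exercise.

```r
library(fpp3)

# Rebuild the series and the preferred model as in the steps above.
set.seed(123)
myseries <- aus_retail %>%
  filter(`Series ID` == sample(aus_retail$`Series ID`, 1))

best_fit <- myseries %>%
  model(
    "Damped Holt's Method" = ETS(Turnover ~ error("A") + trend("Ad") + season("M"))
  )

# Ljung-Box test on the innovation residuals.
# lag = 24 is an assumed rule-of-thumb choice for monthly data.
lb <- augment(best_fit) %>%
  features(.innov, ljung_box, lag = 24)
lb
```

A small lb_pvalue would indicate that the residuals are distinguishable from white noise, which would agree with the autocorrelation seen in the ACF plot.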

e. Now find the test set RMSE, while training the model to the end of 2010. Can you beat the seasonal naïve approach from Exercise 7 in Section 5.11?

myseries_train <- myseries %>%
  filter(year(Month) < 2011)

fit_train <- myseries_train %>%
  model(
    "SNAIVE" = SNAIVE(Turnover),
    "Damped Holt's Method" = ETS(Turnover ~ error('A') + trend('Ad') + season('M'))
  )

fc_train <- fit_train %>%
  forecast(h = 15)

fc_train %>% autoplot(myseries_train, level = NULL) +
  labs(title = "Household Goods Turnover") +
  guides(colour = guide_legend(title = "Forecasts"))

fc_train %>% accuracy(myseries)

## # A tibble: 2 x 12
##   .model   State Industry  .type    ME  RMSE   MAE   MPE  MAPE  MASE RMSSE  ACF1
##   <chr>    <chr> <chr>     <chr> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
## 1 Damped ~ Vict~ Househol~ Test  -28.2  41.8  33.6 -3.39  3.92 0.947 0.918 0.660
## 2 SNAIVE   Vict~ Househol~ Test   33.7  42.3  36.0  3.73  4.00 1.02  0.928 0.411

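We can conclude from the above graph and accuracy metrics that the damped method beats the seasonal naive approach, but only by a small margin (test RMSE 41.8 versus 42.3). Note that forecast(h = 15) covers only the first 15 months after 2010 rather than the whole test set.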
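To score the models over every month after 2010 instead of a fixed horizon, one option is to pass the held-out rows to forecast() through new_data. The sketch below rebuilds the objects so that it runs on its own; the names myseries_test and acc_full are introduced here for illustration, and no results are quoted because they were not part of the original run.

```r
library(fpp3)

# Rebuild the series and the train/test split.
set.seed(123)
myseries <- aus_retail %>%
  filter(`Series ID` == sample(aus_retail$`Series ID`, 1))
myseries_train <- myseries %>% filter(year(Month) < 2011)
myseries_test  <- myseries %>% filter(year(Month) >= 2011)   # hypothetical name

fit_train <- myseries_train %>%
  model(
    "SNAIVE" = SNAIVE(Turnover),
    "Damped Holt's Method" = ETS(Turnover ~ error("A") + trend("Ad") + season("M"))
  )

# Forecast every month in the test set, then compare with the observed values.
acc_full <- fit_train %>%
  forecast(new_data = myseries_test) %>%
  accuracy(myseries)
acc_full
```

The RMSE column of acc_full then answers the exercise for the full test period.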

Exercise 8.8.9

For the same retail data, try an STL decomposition applied to the Box-Cox transformed series, followed by ETS on the seasonally adjusted data. How does that compare with your best previous forecasts on the test set?

#lambda for boxcox
lambda2 <- myseries_train %>%
  features(Turnover, features = guerrero) %>%
  pull(lambda_guerrero)

#boxcox transformation
myseries_train_bx <- myseries_train
myseries_train_bx$Turnover <- box_cox(myseries_train$Turnover,lambda2)

As we can observe below, the Box-Cox transformation made both the seasonal pattern and the variability in the remainder much more constant over time.

#STL decomposition
dcmp4 <- myseries_train_bx %>%
  model(stl = STL(Turnover))

components(dcmp4) %>% autoplot()

fit_train2 <- myseries_train_bx %>%
  model(
    "Damped Holt's Method" = ETS(Turnover ~ error('A') + trend('Ad') + season('M'))
  )

fc_train2 <- fit_train2 %>%
  forecast(h = 15)

fc_train2 %>% autoplot(myseries_train_bx, level = NULL) +
  labs(title = "Household Goods Turnover") +
  guides(colour = guide_legend(title = "Forecasts"))

accuracy(fit_train)

## # A tibble: 2 x 12
##   State    Industry .model .type    ME  RMSE   MAE   MPE  MAPE  MASE RMSSE    ACF1
##   <chr>    <chr>    <chr>  <chr> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>   <dbl>
## 1 Victoria Househo~ SNAIVE Trai~ 25.1   45.6  35.4 5.05   7.29 1     1      0.695 
## 2 Victoria Househo~ Dampe~ Trai~  2.12  20.3  14.8 0.315  3.33 0.417 0.445 -0.0345

accuracy(fit_train2)

## # A tibble: 1 x 12
##   State    Industry    .model   .type     ME  RMSE   MAE   MPE  MAPE  MASE RMSSE
##   <chr>    <chr>       <chr>    <chr>  <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
## 1 Victoria Household ~ Damped ~ Trai~ 0.0238 0.231 0.174 0.137  1.14 0.434 0.465
## # ... with 1 more variable: ACF1 <dbl>

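At first glance the much smaller RMSE (0.231 versus 20.3) suggests that the model improved with the Box-Cox transformation. However, the two RMSE values are on different scales and cannot be compared directly. The scaled measures are slightly higher for the transformed model (MASE 0.434 versus 0.417, RMSSE 0.465 versus 0.445), so the training metrics alone do not show an improvement. To make a fair comparison we would have to back-transform the forecasts to the original scale and evaluate them on the test set.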
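One way to carry out the comparison on the original scale is decomposition_model(), which combines an STL decomposition with a model for the seasonally adjusted series. The sketch below is one possible setup rather than the only reading of the exercise: it applies STL to the Box-Cox transformed series, fits a damped ETS to season_adjust, and scores the forecasts against the test set. As I understand it, the seasonal component is forecast with a seasonal naive method by default and the forecasts are returned on the original scale. The names myseries_test and acc_stl are introduced for illustration.

```r
library(fpp3)

# Rebuild the series and the train/test split.
set.seed(123)
myseries <- aus_retail %>%
  filter(`Series ID` == sample(aus_retail$`Series ID`, 1))
myseries_train <- myseries %>% filter(year(Month) < 2011)
myseries_test  <- myseries %>% filter(year(Month) >= 2011)   # hypothetical name

# Box-Cox lambda chosen on the training data only.
lambda2 <- myseries_train %>%
  features(Turnover, features = guerrero) %>%
  pull(lambda_guerrero)

fit_stl <- myseries_train %>%
  model(
    # STL on the transformed series, ETS on the seasonally adjusted component.
    "STL + ETS" = decomposition_model(
      STL(box_cox(Turnover, lambda2)),
      ETS(season_adjust ~ error("A") + trend("Ad") + season("N"))
    ),
    # The best model from the previous exercise, for reference.
    "Damped Holt's Method" = ETS(Turnover ~ error("A") + trend("Ad") + season("M"))
  )

# Test-set accuracy for both models.
acc_stl <- fit_stl %>%
  forecast(new_data = myseries_test) %>%
  accuracy(myseries)
acc_stl
```

Because both rows of acc_stl are computed against the same observed Turnover values, their RMSE values can be compared directly.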