3.1. Consider the number of pigs slaughtered in Victoria, available in the aus_livestock dataset. Use the ETS() function to estimate the equivalent model for simple exponential smoothing. Find the optimal values of \(\alpha\) and \(\ell_0\), and generate forecasts for the next four months.

First, taking a look at the raw data for the number of pigs slaughtered in Victoria, there is no trend and no clearly identifiable seasonality. The absence of both suggests that simple exponential smoothing, ETS(A,N,N), is a reasonable method for forecasting this data.
library(fpp3)    # aus_livestock, ETS(), autoplot()
library(plotly)  # ggplotly()

Vic_pigs <- aus_livestock %>%
  filter(State == "Victoria", Animal == "Pigs")
ggplotly(Vic_pigs %>%
  autoplot(Count) +
  labs(y = "Count", title = "Victoria counts of pigs slaughtered"))
Fitting a model with the ETS() function, we set the error to “A” (additive), the trend to “N” (none) and the season to “N” (none). The report() function then shows the estimated \(\alpha\) and \(\ell_0\) values: R chooses 0.3221 and 100646.6 respectively.
fit <- Vic_pigs %>%
  model(ETS(Count ~ error("A") + trend("N") + season("N")))
report(fit)
## Series: Count
## Model: ETS(A,N,N)
## Smoothing parameters:
## alpha = 0.3221247
##
## Initial states:
## l[0]
## 100646.6
##
## sigma^2: 87480760
##
## AIC AICc BIC
## 13737.10 13737.14 13750.07
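As a quick cross-check, ETS() will select the specification itself (by minimising the AICc) when no error, trend or season terms are given; for this series we would expect it to land on the same ETS(A,N,N) model. A minimal sketch (output not shown):
# Automatic selection: ETS() searches the error/trend/season combinations by AICc.
Vic_pigs %>%
  model(auto = ETS(Count)) %>%
  report()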
Forecasting the next four months, the point forecast stays constant at 95,187, since simple exponential smoothing produces flat forecasts.
fc_pigs <- fit %>%
  forecast(h = 4)
fc_pigs$Count
## <distribution[4]>
## [1] N(95187, 8.7e+07) N(95187, 9.7e+07) N(95187, 1.1e+08) N(95187, 1.1e+08)
The residuals of the ETS model are obtained from the
augment() function.
(res_pigs <- augment(fit))
## # A tsibble: 558 x 8 [1M]
## # Key: Animal, State, .model [1]
## Animal State .model Month Count .fitted .resid .innov
## <fct> <fct> <chr> <mth> <dbl> <dbl> <dbl> <dbl>
## 1 Pigs Victoria "ETS(Count ~ error(\~ 1972 Jul 94200 100647. -6447. -6447.
## 2 Pigs Victoria "ETS(Count ~ error(\~ 1972 Aug 105700 98570. 7130. 7130.
## 3 Pigs Victoria "ETS(Count ~ error(\~ 1972 Sep 96500 100867. -4367. -4367.
## 4 Pigs Victoria "ETS(Count ~ error(\~ 1972 Oct 117100 99460. 17640. 17640.
## 5 Pigs Victoria "ETS(Count ~ error(\~ 1972 Nov 104600 105142. -542. -542.
## 6 Pigs Victoria "ETS(Count ~ error(\~ 1972 Dec 100500 104968. -4468. -4468.
## 7 Pigs Victoria "ETS(Count ~ error(\~ 1973 Jan 94700 103529. -8829. -8829.
## 8 Pigs Victoria "ETS(Count ~ error(\~ 1973 Feb 93900 100685. -6785. -6785.
## 9 Pigs Victoria "ETS(Count ~ error(\~ 1973 Mar 93200 98499. -5299. -5299.
## 10 Pigs Victoria "ETS(Count ~ error(\~ 1973 Apr 78000 96792. -18792. -18792.
## # ... with 548 more rows
The standard deviation of the residuals, 9344.67, is computed from the .resid column.
(sd_pigs <- sd(res_pigs$.resid))
## [1] 9344.666
The first point forecast produced by R is 95187.
(L_1 <- fc_pigs$.mean[1])
## [1] 95186.56
With these pieces we can calculate the 95% prediction interval for the first forecast using \(\hat{y} \pm 1.96s\).
(interval_pigs_high <- L_1 + (1.96 * sd_pigs))
## [1] 113502.1
(interval_pigs_low <- L_1 - (1.96 * sd_pigs))
## [1] 76871.01
The calculated 95% prediction interval runs from 76871.01 to 113502.1. Comparing that to the interval computed by the model, 76854.79 to 113518.3, the model's interval is slightly wider than the hand-calculated one. The difference lies in the scale estimate: the model uses \(\hat\sigma = \sqrt{\hat\sigma^2} = \sqrt{87480760} \approx 9353.1\) from the report above, rather than the sample standard deviation of the residuals (9344.67).
interval_pigs_model <- hilo(fc_pigs)
interval_pigs_model$`95%`[1]
## <hilo[1]>
## [1] [76854.79, 113518.3]95
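A short sketch reproducing the model's interval from \(\hat\sigma^2\) (as reported by glance()):
# hilo() effectively uses the model's sigma-hat rather than sd(.resid);
# recomputing the bounds with it matches the reported interval.
sigma_pigs <- sqrt(glance(fit)$sigma2)
L_1 + c(-1.96, 1.96) * sigma_pigs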
Data set global_economy contains the annual Exports from
many countries. Select one country to analyse.
Looking at exports from France, there is an upward trend but no seasonality from 1960 to 2017. Although there were periods of rapid growth and rapid decline, the overall trend is upward.
Fr_Exports <- global_economy %>%
  filter(Country == "France")
Fr_Exports %>%
  autoplot(Exports) +
  labs(y = "% of GDP", title = "France exports")
The ETS(A,N,N) forecast extends 10 years ahead, as shown in the plot below.
fit_Fr <- Fr_Exports %>%
  model(ETS(Exports ~ error("A") + trend("N") + season("N")))
fc_Fr <- fit_Fr %>%
  forecast(h = 10)
(fc_Fr %>%
  autoplot(Fr_Exports) +
  labs(y = "% of GDP", title = "France exports forecast using ETS(A,N,N)"))
The root mean squared error (RMSE) on the training data is 1.152003.
accuracy(fit_Fr) %>% select(RMSE)
## # A tibble: 1 x 1
## RMSE
## <dbl>
## 1 1.15
Adding an additive trend, ETS(A,A,N), lowers the training RMSE to 1.11996.
fit_Fr2 <- Fr_Exports %>%
  model(ETS(Exports ~ error("A") + trend("A") + season("N")))
accuracy(fit_Fr2) %>% select(RMSE)
## # A tibble: 1 x 1
## RMSE
## <dbl>
## 1 1.12
fc_Fr2 <- fit_Fr2 %>%
  forecast(h = 10)
(fc_Fr2 %>%
  autoplot(Fr_Exports) +
  labs(y = "% of GDP", title = "France exports forecast using ETS(A,A,N)"))
Using an additive trend in the second model carries the overall upward trend into the forecasts, as seen in the second plot. The lower RMSE also indicates that it would be preferred over the original simple exponential smoothing model, which does not account for any trend.
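A compact way to make this comparison is to fit both specifications in a single mable, so that accuracy() lines up the RMSEs side by side; a minimal sketch (the names SES and Holt are just labels; output not shown):
# Fit simple exponential smoothing and Holt's linear trend together,
# then compare training RMSE in one table.
Fr_Exports %>%
  model(
    SES  = ETS(Exports ~ error("A") + trend("N") + season("N")),
    Holt = ETS(Exports ~ error("A") + trend("A") + season("N"))
  ) %>%
  accuracy() %>%
  select(.model, RMSE)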
# model 1
res_Fr <- augment(fit_Fr)
sd_Fr <- sd(res_Fr$.resid)
L_1_Fr <- fc_Fr$.mean[1]
interval_Fr_high <- L_1_Fr + (1.96 * sd_Fr)
interval_Fr_low <- L_1_Fr - (1.96 * sd_Fr)
# model 2
res_Fr2 <- augment(fit_Fr2)
sd_Fr2 <- sd(res_Fr2$.resid)
L_1_Fr2 <- fc_Fr2$.mean[1]
interval_Fr2_high <- L_1_Fr2 + (1.96 * sd_Fr2)
interval_Fr2_low <- L_1_Fr2 - (1.96 * sd_Fr2)
paste("The 95% prediction interval for model 1 is", interval_Fr_low, "to", interval_Fr_high, "while for model 2 it is",interval_Fr2_low, "to", interval_Fr2_high,".")
## [1] "The 95% prediction interval for model 1 is 28.6745409558786 to 33.0890216655963 while for model 2 it is 28.9598961226832 to 33.3882664753598 ."
paste("The span of the model 1 is",interval_Fr_high-interval_Fr_low,"and for model 2 it is",interval_Fr2_high-interval_Fr2_low)
## [1] "The span of the model 1 is 4.41448070971769 and for model 2 it is 4.42837035267668"
The 95% prediction intervals produced by R are 28.5849 to 33.1796 for the first model and 28.8992 to 33.4490 for the second. The span for model 1 is 4.5957 and for model 2 it is 4.54986. Both spans are slightly wider than the ones calculated by hand, again because R uses the model's \(\hat\sigma\) rather than the sample standard deviation of the residuals.
interval_Fr1_model <- hilo(fc_Fr)
interval_Fr1_model$`95%`[1]
## <hilo[1]>
## [1] [28.58393, 33.17963]95
interval_Fr2_model <- hilo(fc_Fr2)
interval_Fr2_model$`95%`[1]
## <hilo[1]>
## [1] [28.89915, 33.44901]95
Forecast the Chinese GDP from the global_economy data
set using an ETS model. Experiment with the various options in the
ETS() function to see how much the forecasts change with
damped trend, or with a Box-Cox transformation. Try to develop an
intuition of what each is doing to the forecasts.
[Hint: use a relatively large value of h when
forecasting, so you can clearly see the differences between the various
options when plotting the forecasts.]
Chinese GDP shows a strong trend but no seasonality. Therefore the season parameter should be set to “N” in every model, while the other parameters can be varied for comparison.
Ch_GDP <- global_economy %>%
  filter(Country == "China")
Ch_GDP %>%
  autoplot(GDP) +
  labs(y = "$US", title = "China GDP")
Fitting several ETS variants shows how different specifications affect forecasts of the same underlying data. Six models are estimated: additive, multiplicative, additive damped with \(\phi = 0.8\), additive damped with \(\phi = 0.95\), multiplicative damped with \(\phi = 0.8\), and multiplicative damped with \(\phi = 0.95\). Interestingly, the additive and multiplicative models have almost the same effect, continuing the projection rapidly towards infinity, which is not economically plausible even in a state-controlled economy such as China's. The additive and multiplicative models damped at \(\phi = 0.95\) begin to bend the curve downwards, away from that trajectory, but it is the models damped at \(\phi = 0.8\) that produce what economists would be more likely to accept as reasonable, with the curvature heading towards a levelling out of the economy at some point in the future; the multiplicative damped model displays the most curvature. In a free-market economy one might therefore expect the multiplicative model damped at \(\phi = 0.8\) to be the most acceptable, but for the state-controlled Chinese economy that may not be the preferred forecast, depending on party-level political agendas.
fit_Ch <- Ch_GDP %>%
  model(
    additive = ETS(GDP ~ error("A") + trend("A") + season("N")),
    multiplicative = ETS(GDP ~ error("M") + trend("A") + season("N")),
    add_damped_0.8 = ETS(GDP ~ error("A") + trend("Ad", phi = 0.8) + season("N")),
    add_damped_0.95 = ETS(GDP ~ error("A") + trend("Ad", phi = 0.95) + season("N")),
    mult_damped_0.8 = ETS(GDP ~ error("M") + trend("Ad", phi = 0.8) + season("N")),
    mult_damped_0.95 = ETS(GDP ~ error("M") + trend("Ad", phi = 0.95) + season("N"))
  )
fc_Ch <- fit_Ch %>%
  forecast(h = 10)
(fc_Ch %>%
  autoplot(Ch_GDP, level = NULL) +
  labs(y = "$US", title = "China GDP forecasts using various models"))
Find an ETS model for the Gas data from aus_production and forecast the next few years. Why is multiplicative seasonality necessary here? Experiment with making the trend damped. Does it improve the forecasts?
Australian gas production is seasonal with increasing variance in the seasonality as time progresses. It also shows a clearly increasing trend overall.
Gas <- aus_production
Gas %>%
  autoplot(Gas) +
  labs(y = "Petajoules", title = "Australian Gas production")
The multiplicative seasonality reproduces the increasing variance seen in the original data, which is why it is necessary here, and the additive trend continues the positive slope of the historical production levels. The damped trends flatten out the positive slope and appear to diminish the growth in seasonal variance in the plots below.
fit_Gas <- Gas %>%
  model(
    multiplicative = ETS(Gas ~ error("M") + trend("A") + season("M")),
    add_damped_0.8 = ETS(Gas ~ error("A") + trend("Ad", phi = 0.8) + season("M")),
    add_damped_0.95 = ETS(Gas ~ error("A") + trend("Ad", phi = 0.95) + season("M"))
  )
fc_Gas <- fit_Gas %>%
  forecast(h = 30)
(fc_Gas %>%
  autoplot(Gas, level = NULL) +
  labs(y = "Petajoules", title = "Australian Gas production") +
  facet_grid(.model ~ .))
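To put a number on whether damping improves the fit, the training RMSEs of the three specifications can be compared directly; a minimal sketch (output not shown):
# Compare training accuracy across the three Gas models.
fit_Gas %>%
  accuracy() %>%
  select(.model, RMSE)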
Recall your retail time series data (from Exercise 8 in Section 2.10).
This again refers to the aus_retail dataset showing
retail turnover from 1982 to 2019. As a refresher, the original series
is shown in the below plot.
set.seed(54321)
myseries <- aus_retail %>%
  filter(`Series ID` == sample(aus_retail$`Series ID`, 1))
autoplot(myseries, Turnover) +
  labs(title = "Turnover by Month")
Multiplicative seasonality is necessary for this series because the variance of the seasonal pattern increases over time, and a multiplicative seasonal component scales with the level of the series, preserving those changes in variance in the forecasts.
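One quick way to see this is a season plot, where the vertical spread between years grows with the level of the series; a minimal sketch:
# Season plot: the widening spread across years is the signature
# of multiplicative seasonality.
myseries %>% gg_season(Turnover)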
Again the undamped model carries the additive upward trend fully into the forecast, while the damped methods, at the lower and upper limits fable allows for \(\phi\) (0.8 and 0.98 respectively), show a much flatter trend.
fit_myseries <- myseries %>%
  model(
    multiplicative = ETS(Turnover ~ error("A") + trend("A") + season("M")),
    add_damped_0.8 = ETS(Turnover ~ error("A") + trend("Ad", phi = 0.8) + season("M")),
    add_damped_0.98 = ETS(Turnover ~ error("A") + trend("Ad", phi = 0.98) + season("M"))
  )
fc_myseries <- fit_myseries %>%
  forecast(h = 48)
(fc_myseries %>%
  autoplot(myseries, level = NULL) +
  labs(y = "Turnover", title = "Turnover by Month") +
  facet_grid(.model ~ .))
The damped model with \(\phi = 0.8\) has the lowest RMSE, making it empirically the best choice. This is intuitive for a variable such as turnover, where the variance may be increasing but the recent actual trend shows both decline and growth since 2010; removing a constant upward trend from the forecast therefore makes sense.
fit_myseries %>%
  accuracy() %>%
  select(.model, RMSE)
## # A tibble: 3 x 2
## .model RMSE
## <chr> <dbl>
## 1 multiplicative 3.02
## 2 add_damped_0.8 2.99
## 3 add_damped_0.98 3.11
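As an aside, \(\phi\) does not have to be fixed by hand; fable will estimate it (constrained to [0.8, 0.98]) when trend("Ad") is given without a phi argument. A minimal sketch (output not shown):
# Let fable estimate the damping parameter rather than fixing it.
myseries %>%
  model(auto_damped = ETS(Turnover ~ error("A") + trend("Ad") + season("M"))) %>%
  report()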
There is one large residual in 2010, visible on all three plots as an extreme outlier, and on the ACF plot three other spikes fall just outside the significance bounds. The residual histogram looks approximately normal apart from a small number of outliers.
(bestfit_myseries_res <- myseries %>%
  model(add_damped_0.8 = ETS(Turnover ~ error("A") + trend("Ad", phi = 0.8) + season("M"))) %>%
  gg_tsresiduals())
Fitting the chosen specification to the training data, through the end of 2010, returns an RMSE of 2.529164, while refitting the same specification to the test period from 2011 onward returns an RMSE of 3.733941. (Note this refits to the test data rather than scoring the training model's forecasts; the standard out-of-sample check is sketched after the accuracy tables below.) Either way, it beats the RMSE of the original seasonal naive model, which was 5.637434.
myseries_train <- myseries %>%
  filter(year(Month) < 2011) # training set up to the end of 2010
fit_train <- myseries_train %>% # fit on the training set
  model(add_damped_0.8 = ETS(Turnover ~ error("A") + trend("Ad", phi = 0.8) + season("M")))
myseries_test <- myseries %>% # test set from 2011 forward
  filter(year(Month) > 2010)
fit_test <- myseries_test %>% # refit the same specification to the test period
  model(add_damped_0.8 = ETS(Turnover ~ error("A") + trend("Ad", phi = 0.8) + season("M")))
fit_5.11 <- myseries_train %>%
  model(SNAIVE(Turnover)) # compare to Ex 7 from Section 5.11
fit_train %>%
accuracy() %>%
select(.model,RMSE)
## # A tibble: 1 x 2
## .model RMSE
## <chr> <dbl>
## 1 add_damped_0.8 2.53
fit_test %>%
accuracy() %>%
select(.model,RMSE)
## # A tibble: 1 x 2
## .model RMSE
## <chr> <dbl>
## 1 add_damped_0.8 3.73
fit_5.11 %>%
accuracy() %>%
select(.model,RMSE)
## # A tibble: 1 x 2
## .model RMSE
## <chr> <dbl>
## 1 SNAIVE(Turnover) 5.64
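For reference, the standard out-of-sample check mentioned above scores forecasts from the training fit against the held-out actuals, rather than refitting to the test period; a minimal sketch (output not shown):
# Forecast over the test period from the model fitted on the training data,
# then score those forecasts against the actuals.
fit_train %>%
  forecast(new_data = myseries_test) %>%
  accuracy(myseries) %>%
  select(.model, RMSE)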
For the same retail data, try an STL decomposition applied to the Box-Cox transformed series, followed by ETS on the seasonally adjusted data. How does that compare with your best previous forecasts on the test set?
Taking the same training data, applying a Box-Cox transformation to the series, and then computing an STL decomposition, we can compare the result with the STL decomposition of the original, untransformed full series.
Here is the original decomposition of the series:
retail_dcmp <- myseries %>%
  model(STL(Turnover ~ trend(window = 10) +
              season(window = "periodic"),
            robust = TRUE)) %>%
  components()
autoplot(retail_dcmp) +
  labs(title = "Decomposition of Australian retail turnover - original")
Performing the Box-Cox transformation on the training set through December 2010 produces a considerable decrease in scale for the series and all of its components compared with the original STL decomposition. There is considerable smoothing in the trend component, particularly towards the end of the sample, and the remainder is much more consistent.
lambda_train <- myseries_train %>%
  features(Turnover, features = guerrero) %>% # get lambda for the train set
  pull(lambda_guerrero)
myseries_train_adjusted <- myseries_train %>%
  mutate(turnover_transform = box_cox(Turnover, lambda_train)) # transform with lambda_train
myseries_train_adjusted %>% # perform the STL decomposition
  model(STL(turnover_transform ~ trend(window = 10) +
              season(window = "periodic"),
            robust = TRUE)) %>%
  components() %>%
  autoplot() +
  labs(title = "Decomposition of Australian retail turnover - Train set")
Likewise, the test set is transformed and decomposed for comparison purposes. There is still something odd about the transformed trend, but the seasonality looks more consistent.
lambda_test <- myseries_test %>%
  features(Turnover, features = guerrero) %>% # get lambda for the test set
  pull(lambda_guerrero)
myseries_test_adjusted <- myseries_test %>%
  mutate(turnover_transform = box_cox(Turnover, lambda_test)) # transform with lambda_test
myseries_test_decomp <- myseries_test_adjusted %>% # perform the STL decomposition
  model(STL(turnover_transform ~ trend(window = 10) +
              season(window = "periodic"),
            robust = TRUE))
myseries_test_decomp %>%
  components() %>%
  autoplot() +
  labs(title = "Decomposition of Australian retail turnover - Test set")
Now fitting the damped ETS specification to the transformed series, the training set returns an RMSE of 0.1174647 and the test set an RMSE of 0.08381113. Note, however, that these are on the Box-Cox transformed scale, so they cannot be compared directly with the earlier test RMSE of 3.733941, which is in original turnover units; a back-transformed comparison is sketched at the end of this section.
fit_train_adjusted <- myseries_train_adjusted %>% # fit on the transformed train set
  model(add_damped_0.8 = ETS(turnover_transform ~ error("A") + trend("Ad", phi = 0.8) + season("M")))
fit_train_adjusted %>%
accuracy() %>%
select(.model,RMSE)
## # A tibble: 1 x 2
## .model RMSE
## <chr> <dbl>
## 1 add_damped_0.8 0.117
fit_test_adjusted <- myseries_test_adjusted %>% # refit on the transformed test set
  model(add_damped_0.8 = ETS(turnover_transform ~ error("A") + trend("Ad", phi = 0.8) + season("M")))
fit_test_adjusted %>%
accuracy() %>%
select(.model,RMSE)
## # A tibble: 1 x 2
## .model RMSE
## <chr> <dbl>
## 1 add_damped_0.8 0.0838
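The exercise wording asks for ETS on the seasonally adjusted data specifically. One way to do that is decomposition_model(), which models the STL components separately; a minimal sketch under its defaults (the seasonal component is forecast by SNAIVE unless specified; output not shown):
# STL on the Box-Cox transformed series, ETS(A,Ad,N) on the seasonally
# adjusted component, SNAIVE (the default) on the seasonal component.
fit_dcmp <- myseries_train %>%
  model(stl_ets = decomposition_model(
    STL(box_cox(Turnover, lambda_train) ~ trend(window = 10) +
          season(window = "periodic"), robust = TRUE),
    ETS(season_adjust ~ error("A") + trend("Ad", phi = 0.8) + season("N"))
  ))
fit_dcmp %>%
  accuracy() %>%
  select(.model, RMSE)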
Looking at this visually, here is the original data:
myseries %>% autoplot(Turnover)
And the forecast from the transformed training fit, plotted over the transformed training data:
fc_train_adjusted <- fit_train_adjusted %>%
  forecast(h = 96) # 8 years of monthly forecasts, covering 2011 onward
fc_train_adjusted %>% autoplot(myseries_train_adjusted)
The training forecast for 2011 onward follows essentially the same shape, within the 95% prediction interval, as the actuals in the original dataset for those years, and it also resembles the shape of the best-fit forecast from the test set in question 8.8.e above.
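To make an apples-to-apples comparison with the earlier test RMSE of 3.733941, the transformation can be placed inside the model formula; fable then back-transforms the forecasts automatically and accuracy() scores them in original turnover units. A minimal sketch (output not shown):
# Box-Cox inside the formula: forecasts come back on the original scale,
# so this RMSE is directly comparable with the 3.73 above.
myseries_train %>%
  model(bc_damped = ETS(box_cox(Turnover, lambda_train) ~ error("A") +
                          trend("Ad", phi = 0.8) + season("M"))) %>%
  forecast(new_data = myseries_test) %>%
  accuracy(myseries) %>%
  select(.model, RMSE)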