library(fpp3)
library(forecast)
Do the exercises 9.1, 9.2, 9.3, 9.5, 9.6, 9.7, 9.8 in Hyndman. Please submit both the Rpubs link as well as your .rmd file.
One main difference between the three plots are that the autocorrelations for the smaller sample size are larger when compared to the larger sample size. The autocorrelations measures the relationship between lagged values; if a sample size is large and white noise, we would expect that the autocorrelations are smaller as there are more random samples to show that the lagged values do not have a linear relationship. Per the answer to question 1b below, it would also make sense that the critical values are also smaller in range for larger sample sizes to account for this. Despite these differences, all plots shown indicate white noise as the autocorrelations do not exceed the critical values.
The critical values are at different distances from the mean of zero because they take into account the sample size. Critical values for ACF plots are calculated with ± 1.96 \(\sqrt{T}\) where T is the length of the time series. As such, larger sample sizes would result in critical values closer to the mean of zero while smaller sample sizes would result in critical values further from the mean of zero. This is demonstrated by the plot as well. Autocorrelations differ in each figure despite referring to white noise due to random variation in the data.
Looking at the data, we can see that the variance is not constant - it is larger towards the recent years when compared to the beginning of the time series. Looking at the ACF, we can see that the data is non-stationary because the autocorrelations are decreasing slowly. The autocorrelations are also large and positive. From the PACF, we can see that it shows non-stationarity as there are spikes throughout the plot.
Using the unitroot_ndiffs feature, I can determine how many orders of differencing is required. The output was 1, so I differenced the data once. However, I can that the variance still is constant after differencing. The ACF plot no longer shows large autocorrelations decreasing slowly, but PACF still shows large spikes that do not decrease with lag.
Differencing a second time did not appear to change this, as the variance towards the more recent timepoints is drastically different than the beginning of the plot. Instead, a Box-Cox transformation should first be performed on the original data to adjust for the non-constant variance. Since the data is stock data, I first needed to update the tsibble to use a row number index to account for missing data on the weekend. Afterwards, a Box-Cox transformation could be transformed. Afterwards, differencing the data shows a constant variance without the slow decreasing pattern in the ACF.
closing_amzn <- gafa_stock %>%
filter(Symbol == 'AMZN') %>% select(Close)
closing_amzn %>% autoplot()
closing_amzn %>% gg_tsdisplay(Close,plot_type = "partial")
closing_amzn %>% features(Close, unitroot_ndiffs)
## # A tibble: 1 × 1
## ndiffs
## <int>
## 1 1
closing_amzn_diff <- closing_amzn %>%
mutate(diff_close = difference(Close))
closing_amzn_diff %>% autoplot(diff_close)
closing_amzn_diff %>% gg_tsdisplay(diff_close,plot_type = "partial")
closing_amzn_diff2 <- closing_amzn %>%
mutate(diff_close = difference(difference(Close)))
closing_amzn_diff2 %>% gg_tsdisplay(diff_close,plot_type = "partial")
closing_amzn_updated <- closing_amzn %>%
mutate(new_index = row_number()) %>%
update_tsibble(index = new_index, regular = TRUE) %>%
select(Close)
lambda_guerrero <- closing_amzn_updated %>%
features(Close, features = guerrero)
lambda_guerrero <- lambda_guerrero$lambda_guerrero
closing_amzn_updated <- closing_amzn_updated %>%
mutate(box_cox = box_cox(Close,lambda_guerrero)) %>%
select(box_cox)
closing_amzn_updated %>% autoplot()
closing_amzn_updated %>%
gg_tsdisplay(box_cox,plot_type = "partial")
closing_amzn_updated_diff <- closing_amzn_updated %>%
mutate(diff_close = difference(box_cox))
closing_amzn_updated_diff %>% autoplot()
closing_amzn_updated_diff %>%
gg_tsdisplay(diff_close,plot_type = "partial")
For the Turkish GDP from global economy, after performing a Box-Cox transformation with lambda of approx 0.157, first order differencing is needed to obtain stationary data. This is determined by the KPSS test, where the null hypothesis is that the data is stationary. Since the p-value is null, we reject the hypothesis and need to difference the data. Using features(,unitroot_ndiffs) I can find how many times the data should be differenced. This returns 1, which means I need to difference the data once. Looking at the ACF I can see that it is no longer exhibits the slowly decreasing pattern as it had before differencing.
Turkish_GDP <- global_economy %>%
filter(Country == 'Turkey') %>% select(GDP)
Turkish_GDP %>% autoplot()
lambda_guerrero <- Turkish_GDP %>%
features(GDP, features = guerrero)
lambda_guerrero <- lambda_guerrero$lambda_guerrero
Turkish_GDP_updated <- Turkish_GDP %>%
mutate(box_cox = box_cox(GDP,lambda_guerrero)) %>%
select(box_cox)
Turkish_GDP_updated %>% autoplot()
Turkish_GDP_updated %>%
gg_tsdisplay(box_cox,plot_type = "partial")
Turkish_GDP_updated %>%
features(box_cox, unitroot_kpss)
## # A tibble: 1 × 2
## kpss_stat kpss_pvalue
## <dbl> <dbl>
## 1 1.50 0.01
Turkish_GDP_updated %>%
features(box_cox, unitroot_ndiffs)
## # A tibble: 1 × 1
## ndiffs
## <int>
## 1 1
Turkish_GDP_updated_diff <- Turkish_GDP_updated %>%
mutate(diff = difference(box_cox))
Turkish_GDP_updated_diff %>%
gg_tsdisplay(diff,plot_type = "partial")
The accomodation takings in the state of Tasmania exhibits upward trend, and therefore needs differencing to make the data stationary. First, I performed a Box-Cox transformation; since the ideal lambda is close to 0, I used 0. After the Box-Cox transformation, I can see the ACF is decreasing slowly. Using the unitroot_nsdiffs and unitroot_ndiffs in features(), I can determine that 1 order of seasonal differencing and 1 order of non-seasonal differencing is required. After differencing, the ACF no longer exhibits the slowly decreasing pattern and variance apears constant.
Tasmania_takings <- aus_accommodation %>%
filter(State == 'Tasmania') %>% select(Takings)
Tasmania_takings %>% autoplot()
Tasmania_takings %>% gg_tsdisplay()
lambda_guerrero <- Tasmania_takings %>%
features(Takings, features = guerrero)
lambda_guerrero <- lambda_guerrero$lambda_guerrero
Tasmania_takings_updated <- Tasmania_takings %>%
mutate(box_cox = box_cox(Takings,0)) %>% select(box_cox)
Tasmania_takings_updated %>%
gg_tsdisplay(box_cox,plot_type = "partial")
Tasmania_takings_updated %>%
features(box_cox,unitroot_nsdiffs)
## # A tibble: 1 × 1
## nsdiffs
## <int>
## 1 1
Tasmania_takings_updated %>%
features(box_cox,unitroot_ndiffs)
## # A tibble: 1 × 1
## ndiffs
## <int>
## 1 1
Tasmania_takings_updated_diff <- Tasmania_takings_updated %>%
mutate(diff = difference(difference(box_cox, 4)))
Tasmania_takings_updated_diff %>%
gg_tsdisplay(diff,plot_type = "partial")
Tasmania_takings %>% features(Takings,unitroot_nsdiffs)
## # A tibble: 1 × 1
## nsdiffs
## <int>
## 1 1
Tasmania_takings %>% features(Takings,unitroot_ndiffs)
## # A tibble: 1 × 1
## ndiffs
## <int>
## 1 1
Tasmania_takings_diff <- Tasmania_takings %>%
mutate(diff = difference(difference(Takings, 4)))
Tasmania_takings_diff %>% gg_tsdisplay(diff,plot_type = "partial")
Performing the same process as above, using unitroot_nsdiffs and unitroot_ndiffs in features(), we can see that 1 order of seasonal and 1 order of non seasonal differencing is required on the log transformed data. Again, lambda of 0 is used due to the closeness of the ideal lambda value to 0. Looking at the data, the variance seems fairly constant. The ACF is not decreasing slowly like in the original plot. The PACF also appears to no longer have significant spiles past the first few values.
souvenirs %>% autoplot()
lambda_guerrero <- souvenirs %>%
features(Sales, features = guerrero)
lambda_guerrero <- lambda_guerrero$lambda_guerrero
souvenirs_updated <- souvenirs %>%
mutate(box_cox = box_cox(Sales,0)) %>%
select(box_cox)
souvenirs_updated %>% autoplot()
souvenirs_updated %>%
gg_tsdisplay(box_cox,plot_type = "partial")
souvenirs_updated %>% features(box_cox, unitroot_ndiffs)
## # A tibble: 1 × 1
## ndiffs
## <int>
## 1 1
souvenirs_updated %>% features(box_cox, unitroot_nsdiffs)
## # A tibble: 1 × 1
## nsdiffs
## <int>
## 1 1
souvenirs_updated_diff <- souvenirs_updated %>%
mutate(diff = difference(difference(box_cox, 12)))
souvenirs_updated_diff %>% autoplot(diff)
souvenirs_updated_diff %>%
gg_tsdisplay(diff,plot_type = "partial")
For my retail data, I performed a box_cox transformation with lambda equal to approximately 0.15. Afterwards, the unit root tests determined that 1 order of seasonal and non seasonal differencing was required. Looking at the plot, the variance appears constant. Looking at the ACF, it does not decline slowly, although the PACF still shows spikes beyond the first few lags.
set.seed(32)
myseries <- aus_retail |>
filter(`Series ID` == sample(aus_retail$`Series ID`,1))
myseries %>% autoplot()
lambda_guerrero <- myseries %>%
features(Turnover, features = guerrero)
lambda_guerrero <- lambda_guerrero$lambda_guerrero
myseries_updated <- myseries %>%
mutate(box_cox = box_cox(Turnover,lambda_guerrero)) %>%
select(box_cox)
myseries_updated %>% autoplot()
myseries_updated %>% features(box_cox, unitroot_ndiffs)
## # A tibble: 1 × 1
## ndiffs
## <int>
## 1 1
myseries_updated %>% features(box_cox, unitroot_nsdiffs)
## # A tibble: 1 × 1
## nsdiffs
## <int>
## 1 1
myseries_updated_diff <- myseries_updated %>%
mutate(diff = difference(difference(box_cox, 12)))
myseries_updated_diff %>% autoplot(diff)
myseries_updated_diff %>%
gg_tsdisplay(diff,plot_type = "partial")
y <- numeric(100)
e <- rnorm(100)
for(i in 2:100)
y[i] <- 0.6*y[i-1] + e[i]
sim <- tsibble(idx = seq_len(100), y = y, index = idx)
The time series looks to have less local variation in the data when \(\phi_1\) is increased in the positive direction. When \(\phi_1\) is decreased, the data appears to have more oscillation. The further \(\phi_1\) is from 0, the value range seems to increase.
sim %>% autoplot()
phi <- -0.99
for(i in 2:100)
y[i] <- phi*y[i-1] + e[i]
sim <- tsibble(idx = seq_len(100), y = y, index = idx)
sim %>% autoplot()
phi <- 0.99
for(i in 2:100)
y[i] <- phi*y[i-1] + e[i]
sim <- tsibble(idx = seq_len(100), y = y, index = idx)
sim %>% autoplot()
phi <- 0.2
for(i in 2:100)
y[i] <- phi*y[i-1] + e[i]
sim <- tsibble(idx = seq_len(100), y = y, index = idx)
sim %>% autoplot()
for(i in 2:100)
y[i] <- 0.6*e[i-1] + e[i]
sim <- tsibble(idx = seq_len(100), y = y, index = idx)
Negative values of \(\theta_1\) also appear to have a greater oscillating effect on the data. Similar to the effect of \(\phi_1\), it also appears that the further \(\theta_1\) is from 0, the value range increases.
sim %>% autoplot()
theta <- -0.99
for(i in 2:100)
y[i] <- theta*e[i-1] + e[i]
sim <- tsibble(idx = seq_len(100), y = y, index = idx)
sim %>% autoplot()
theta <- 0.99
for(i in 2:100)
y[i] <- theta*e[i-1] + e[i]
sim <- tsibble(idx = seq_len(100), y = y, index = idx)
sim %>% autoplot()
theta <- 0.2
for(i in 2:100)
y[i] <- theta*e[i-1] + e[i]
sim <- tsibble(idx = seq_len(100), y = y, index = idx)
sim %>% autoplot()
An ARMA(1,1) model is without differencing and takes the form \(y_t = c + \phi_1y_{t-1}+\theta_1\epsilon_{t-1}+\epsilon_t\).
for(i in 2:100)
y[i] <- 0.6*y[i-1] + 0.6*e[i-1] + e[i]
sim1 <- tsibble(idx = seq_len(100), y = y, index = idx)
An AR(2) takes the form \(y_t = c + \phi_1y_{t-1}+\phi_2y_{t-2}+\epsilon_t\). Since \(y_{t-2}\) is now included, i must start at 3.
for(i in 3:100)
y[i] <- -0.8*y[i-1] + 0.3*y[i-2] + e[i]
sim2 <- tsibble(idx = seq_len(100), y = y, index = idx)
The AR(2) model shows multiplicative, non-stationary behavior where the data begins with small oscillations that become greater and greater over time. The ARMA(1,1) model also appears non-stationary, though it also appears far more random than the AR(2) model. This is due to the \(\phi_1\) coefficient, which influenced the data’s strong oscillating behavior.
sim1 %>% autoplot()
sim2 %>% autoplot()
sim1 %>% features(y, unitroot_ndiffs)
## # A tibble: 1 × 1
## ndiffs
## <int>
## 1 1
sim1 %>% features(y, unitroot_nsdiffs)
## # A tibble: 1 × 1
## nsdiffs
## <int>
## 1 0
sim1 %>% gg_tsdisplay(y,plot_type = "partial")
sim2 %>% features(y, unitroot_ndiffs)
## # A tibble: 1 × 1
## ndiffs
## <int>
## 1 0
sim2 %>% features(y, unitroot_nsdiffs)
## # A tibble: 1 × 1
## nsdiffs
## <int>
## 1 0
sim2 %>% gg_tsdisplay(y,plot_type = "partial")
aus_airpassengers_1970_2011 <- aus_airpassengers %>%
filter(Year >= 1970, Year <= 2011)
fit <- aus_airpassengers_1970_2011 %>%
model(ARIMA(Passengers))
gg_tsresiduals(fit)
report(fit)
## Series: Passengers
## Model: ARIMA(0,2,1)
##
## Coefficients:
## ma1
## -0.8756
## s.e. 0.0722
##
## sigma^2 estimated as 4.671: log likelihood=-87.8
## AIC=179.61 AICc=179.93 BIC=182.99
fc <- forecast(fit, h = 10)
fc %>% autoplot() + autolayer(aus_airpassengers_1970_2011)
This is an ARIMA(0,2,1) model, which can be described as \((1-B)^2y_t = (1-0.8963B) \epsilon_t\)
The model from part A appears to follow the overall trend of the data better.
fit2 <- aus_airpassengers_1970_2011 %>% model(ARIMA(Passengers ~ pdq(0,1,0)))
gg_tsresiduals(fit2)
report(fit2)
## Series: Passengers
## Model: ARIMA(0,1,0) w/ drift
##
## Coefficients:
## constant
## 1.3669
## s.e. 0.3319
##
## sigma^2 estimated as 4.629: log likelihood=-89.08
## AIC=182.17 AICc=182.48 BIC=185.59
fc2 <- forecast(fit2, h = 10)
fc2 %>% autoplot() + autolayer(aus_airpassengers_1970_2011)
fc %>% autoplot() + autolayer(aus_airpassengers_1970_2011)
The ARIMA(2,1,2) looks similar to the model from part a. It seems to follow the overall trend of the data better than the model from part c. Removing the constant results in an error that indicates non-stationary data. As such, a model cannot be fitted when the constant is removed. Changing the differencing order to 2 allows it to be plotted with no constant.
fit3 <- aus_airpassengers_1970_2011 %>%
model(ARIMA(Passengers ~ pdq(2,1,2)))
gg_tsresiduals(fit3)
report(fit3)
## Series: Passengers
## Model: ARIMA(2,1,2) w/ drift
##
## Coefficients:
## ar1 ar2 ma1 ma2 constant
## 1.4694 -0.5103 -1.5736 0.6780 0.0650
## s.e. 0.3780 0.3558 0.3081 0.2688 0.0294
##
## sigma^2 estimated as 4.748: log likelihood=-87.74
## AIC=187.47 AICc=189.94 BIC=197.75
fc3 <- forecast(fit3, h = 10)
fc3 %>% autoplot() +
autolayer(aus_airpassengers_1970_2011)
fc %>% autoplot() +
autolayer(aus_airpassengers_1970_2011)
fc2 %>% autoplot() +
autolayer(aus_airpassengers_1970_2011)
fit4 <- aus_airpassengers_1970_2011 %>%
model(ARIMA(Passengers ~ 0 + pdq(2,1,2)))
fit4 <- aus_airpassengers_1970_2011 %>%
model(ARIMA(Passengers ~ 0 + pdq(2,2,2)))
fc4 <- forecast(fit4, h = 10)
fc4 %>% autoplot() +
autolayer(aus_airpassengers_1970_2011)
When plotting forecasts from an ARIMA(0,2,1) model with a constant, a warning message appears that indicates the model specification induces a quadratic or higher order polynomial trend. The text had indicated that constants are omitted for d > 1 by default, as quadratic or higher order trends are dangerous when forecasting. However, the package does allot for it to be plotted.
fit5 <- aus_airpassengers_1970_2011 %>%
model(ARIMA(Passengers ~ 1 + pdq(0,2,1)))
report(fit5)
## Series: Passengers
## Model: ARIMA(0,2,1) w/ poly
##
## Coefficients:
## ma1 constant
## -1.0000 0.0721
## s.e. 0.0709 0.0260
##
## sigma^2 estimated as 4.086: log likelihood=-85.74
## AIC=177.48 AICc=178.15 BIC=182.55
fc5 <- forecast(fit5, h = 10)
fc5 %>% autoplot() +
autolayer(aus_airpassengers_1970_2011)
A Box-Cox transformation was performed with lambda = 0.28.
US_GDP <- global_economy %>%
filter(Country == 'United States') %>%
select(GDP)
US_GDP %>% autoplot()
lambda_guerrero <- US_GDP %>% features(GDP, features = guerrero)
lambda_guerrero <- lambda_guerrero$lambda_guerrero
US_GDP_updated <- US_GDP %>%
mutate(box_cox = box_cox(GDP,lambda_guerrero)) %>%
select(box_cox)
US_GDP_updated %>% autoplot()
I used the ARIMA() function which selected an ARIMA(1,1,0) model with drift.
fit <- US_GDP_updated %>% model(ARIMA(box_cox))
report(fit)
## Series: box_cox
## Model: ARIMA(1,1,0) w/ drift
##
## Coefficients:
## ar1 constant
## 0.4586 118.1822
## s.e. 0.1198 9.5047
##
## sigma^2 estimated as 5479: log likelihood=-325.32
## AIC=656.65 AICc=657.1 BIC=662.78
gg_tsresiduals(fit)
After differencing the data once, I can see that variance is still not quite constant. As such, I can difference the data a second time and observe that an AR(1) model (via the one spike at lag 1 in the PACF) might be an alternative model (ARIMA(1,2,0)). However, AICc cannot be compared with the original model in this manner due to the different level of differencing. To compare against a model where AICc can be used, I can look at an ARIMA(2,1,0) model and an ARIMA(1,1,1) model.
US_GDP_updated %>%
gg_tsdisplay(box_cox,plot_type = "partial")
US_GDP_updated_diff1 <- US_GDP_updated %>%
mutate(diff = difference(box_cox))
US_GDP_updated_diff1 %>%
gg_tsdisplay(diff,plot_type = "partial")
US_GDP_updated_diff2 <- US_GDP_updated %>%
mutate(diff = difference(difference(box_cox)))
US_GDP_updated_diff2 %>%
gg_tsdisplay(diff,plot_type = "partial")
fit2 <- US_GDP_updated %>%
model(ARIMA(box_cox ~ pdq(1,2,0)))
fit3 <- US_GDP_updated %>%
model(ARIMA(box_cox ~ pdq(2,1,0)))
fit4 <- US_GDP_updated %>%
model(ARIMA(box_cox ~ pdq(1,1,1)))
I will look at the residuals plots to compare the selected model with ARIMA(1,2,0). For ARIMA(1,2,0), I can see one spike past the critical value, while the automatically selected model does not have any autocorrelations outside the significance lines. Both models have large p-values, indicating the residuals are similar to white noise. However, given the autocorrelations, it does seem like the automatically selected model is more optimal. For ARIMA(2,1,0) and ARIMA(1,1,1), I can see that the automatically selected model has a lower AICc and is therefore a better fit.
report(fit2)
## Series: box_cox
## Model: ARIMA(1,2,0)
##
## Coefficients:
## ar1
## -0.2732
## s.e. 0.1289
##
## sigma^2 estimated as 6780: log likelihood=-325.99
## AIC=655.99 AICc=656.22 BIC=660.04
gg_tsresiduals(fit2)
augment(fit2) %>%
features(.innov,ljung_box, lag = 10, dof = 1)
## # A tibble: 1 × 3
## .model lb_stat lb_pvalue
## <chr> <dbl> <dbl>
## 1 ARIMA(box_cox ~ pdq(1, 2, 0)) 10.9 0.283
augment(fit) %>%
features(.innov, ljung_box, lag = 10, dof = 1)
## # A tibble: 1 × 3
## .model lb_stat lb_pvalue
## <chr> <dbl> <dbl>
## 1 ARIMA(box_cox) 3.81 0.923
report(fit3)
## Series: box_cox
## Model: ARIMA(2,1,0) w/ drift
##
## Coefficients:
## ar1 ar2 constant
## 0.4554 0.0071 117.2907
## s.e. 0.1341 0.1352 9.5225
##
## sigma^2 estimated as 5580: log likelihood=-325.32
## AIC=658.64 AICc=659.41 BIC=666.82
gg_tsresiduals(fit3)
report(fit4)
## Series: box_cox
## Model: ARIMA(1,1,1) w/ drift
##
## Coefficients:
## ar1 ma1 constant
## 0.4736 -0.0190 114.8547
## s.e. 0.2851 0.3286 9.3486
##
## sigma^2 estimated as 5580: log likelihood=-325.32
## AIC=658.64 AICc=659.41 BIC=666.82
gg_tsresiduals(fit4)
All of the forecasts for the models above appear reasonable, and follow the general trend of the data.
fc <- forecast(fit, h = 5)
fc2 <- forecast(fit2, h = 5)
fc3 <- forecast(fit3, h = 5)
fc4 <- forecast(fit4, h = 5)
fc %>% autoplot() + autolayer(US_GDP_updated)
fc2 %>% autoplot() + autolayer(US_GDP_updated)
fc3 %>% autoplot() + autolayer(US_GDP_updated)
fc4 %>% autoplot() + autolayer(US_GDP_updated)
Looking at an ETS model with no transformation, I can see that the forecast also appears fairly reasonable, and follows the trend of the data. An ETS(M,A,N) model was selected. It is difficult to compare the results across the models given that one was done on a transformed dataset and the other was not. I will split the data into a training and test set, and then look at the RMSSE and MASE to compare accuracy, since both metrics are scaled errors. From this, I can see that the ARIMA model has a lower MASE and RMSSE, and appears to be a better fit.
fitETS <- US_GDP %>% model(ETS())
report(fitETS)
## Series: GDP
## Model: ETS(M,A,N)
## Smoothing parameters:
## alpha = 0.9990876
## beta = 0.5011949
##
## Initial states:
## l[0] b[0]
## 448093333334 64917355687
##
## sigma^2: 7e-04
##
## AIC AICc BIC
## 3190.787 3191.941 3201.089
gg_tsresiduals(fitETS)
fcETS <- forecast(fitETS,h=5)
fcETS %>% autoplot() + autolayer(US_GDP)
train <- US_GDP %>% filter(Year < 2013)
test <- US_GDP %>% filter(Year >= 2013)
lambda_guerrero <- train %>%
features(GDP, features = guerrero)
lambda_guerrero <- lambda_guerrero$lambda_guerrero
US_GDP_train_boxcox <- train %>%
mutate(box_cox = box_cox(GDP,lambda_guerrero)) %>%
select(box_cox)
US_GDP_test_boxcox <- test %>%
mutate(box_cox = box_cox(GDP,lambda_guerrero)) %>%
select(box_cox)
US_GDP_full_boxcox <- US_GDP %>%
mutate(box_cox = box_cox(GDP,lambda_guerrero)) %>%
select(box_cox)
fit_ARIMA <- US_GDP_train_boxcox %>% model(ARIMA(box_cox))
fit_ETS <- train %>% model(ETS())
fit_ARIMA %>% accuracy()
## # A tibble: 1 × 10
## .model .type ME RMSE MAE MPE MAPE MASE RMSSE ACF1
## <chr> <chr> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
## 1 ARIMA(box_cox) Training 0.394 16.7 12.6 0.0114 0.364 0.243 0.307 -0.0321
fit_ETS %>% accuracy()
## # A tibble: 1 × 10
## .model .type ME RMSE MAE MPE MAPE MASE RMSSE ACF1
## <chr> <chr> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
## 1 ETS() Training 17805046346. 1.69e11 1.03e11 0.505 2.03 0.331 0.450 0.187
forecast_ARIMA <- forecast(fit_ARIMA, h = 5)
forecast_ETS <- forecast(fit_ETS, h = 5)
forecast_ARIMA %>% accuracy(US_GDP_full_boxcox)
## # A tibble: 1 × 10
## .model .type ME RMSE MAE MPE MAPE MASE RMSSE ACF1
## <chr> <chr> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
## 1 ARIMA(box_cox) Test -21.7 24.5 21.7 -0.432 0.432 0.418 0.451 0.452
forecast_ETS %>% accuracy(US_GDP)
## # A tibble: 1 × 10
## .model .type ME RMSE MAE MPE MAPE MASE RMSSE ACF1
## <chr> <chr> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
## 1 ETS() Test 291448950270. 345234202897. 2.91e11 1.57 1.57 0.935 0.920 0.191
forecast_ARIMA %>% autoplot() + autolayer(US_GDP_full_boxcox)
forecast_ETS %>% autoplot() + autolayer(US_GDP)