souvenirsaus_airpassengersaus_arrivalsus_employmentaus_productionAnswer. All three plots are consistent with white noise. The shorter series has much more sampling variability, so its sample autocorrelations bounce around more. As the sample size increases, the sample ACF settles closer to zero. The critical bounds are approximately \(\pm 1.96 / \sqrt{T}\), so they get narrower as the sample size increases. The autocorrelations differ across the three figures because each figure is based on a different random sample, and sample autocorrelations from white noise are themselves random.
amzn <- gafa_stock %>%
filter(Symbol == "AMZN") %>%
select(Date, Close)
autoplot(amzn, Close) +
labs(title = "Amazon daily closing prices", y = "Close")
plot_acf_pacf(amzn$Close, lag.max = 40)
Answer. The time plot shows a strong wandering level rather than fluctuations around a stable mean. The ACF stays large and positive for many lags, which is a classic sign of non-stationarity. The PACF also shows very strong early dependence. Together these indicate that the series should be differenced before fitting a non-seasonal ARIMA model.
turkey_gdp <- global_economy %>%
filter(Country == "Turkey") %>%
select(Year, GDP) %>%
filter(!is.na(GDP))
tas_accom <- aus_accommodation %>%
filter(State == "Tasmania") %>%
select(Date, Takings)
souvenir <- souvenirs %>% select(Month, Sales)
ex3_stationarity <- bind_rows(
stationarity_tbl(turkey_gdp, GDP) %>% mutate(series = "Turkey GDP"),
stationarity_tbl(tas_accom, Takings) %>% mutate(series = "Tasmania accommodation takings"),
stationarity_tbl(souvenir, Sales) %>% mutate(series = "Souvenirs sales")
) %>%
select(series, lambda_guerrero, ndiffs, nsdiffs)
ex3_stationarity %>% kable(digits = 3)
| series | lambda_guerrero | ndiffs | nsdiffs |
|---|---|---|---|
| Turkey GDP | 0.157 | 1 | 0 |
| Tasmania accommodation takings | 0.002 | 1 | 1 |
| Souvenirs sales | 0.002 | 1 | 1 |
Answer. The table gives a reproducible choice of transformation and differencing. In broad terms:
souvenirsAnswer. For monthly souvenir sales, the standard stationary transformation is roughly \[ (1-B)(1-B^{12})\log(y_t). \] Expanding the differencing operators gives \[ (1 - B - B^{12} + B^{13})\log(y_t). \] So the stationary series is approximately \[ \log(y_t) - \log(y_{t-1}) - \log(y_{t-12}) + \log(y_{t-13}). \]
To keep this fully reproducible, I use New South Wales department stores as the retail series.
retail <- aus_retail %>%
filter(State == "New South Wales", Industry == "Department stores") %>%
select(Month, Turnover)
retail_stationarity <- stationarity_tbl(retail, Turnover)
retail_stationarity %>% kable(digits = 3)
| lambda_guerrero | ndiffs | nsdiffs |
|---|---|---|
| 0.219 | 1 | 1 |
autoplot(retail, Turnover) +
labs(title = "NSW department stores", y = "Turnover")
Answer. This series needs a transformation because the seasonal swings grow with the level. After transformation, seasonal differencing is required, and a first non-seasonal difference may also be needed depending on the unit-root tests reported above. A practical stationary choice is to work with approximately \[(1-B)(1-B^{12})\,\text{BoxCox}(y_t,\lambda).\]
set.seed(624)
ar1_sim <- function(phi = 0.6, n = 100) {
y <- numeric(n)
e <- rnorm(n)
for (i in 2:n) y[i] <- phi * y[i - 1] + e[i]
tsibble(idx = seq_len(n), y = y, index = idx)
}
ma1_sim <- function(theta = 0.6, n = 100) {
e <- rnorm(n)
y <- numeric(n)
for (i in 2:n) y[i] <- e[i] + theta * e[i - 1]
tsibble(idx = seq_len(n), y = y, index = idx)
}
arma11_sim <- function(phi = 0.6, theta = 0.6, n = 100) {
e <- rnorm(n)
y <- numeric(n)
for (i in 2:n) y[i] <- phi * y[i - 1] + e[i] + theta * e[i - 1]
tsibble(idx = seq_len(n), y = y, index = idx)
}
ar2_nonstationary_sim <- function(phi1 = -0.8, phi2 = 0.3, n = 100) {
e <- rnorm(n)
y <- numeric(n)
y[1:2] <- 0
for (i in 3:n) y[i] <- phi1 * y[i - 1] + phi2 * y[i - 2] + e[i]
tsibble(idx = seq_len(n), y = y, index = idx)
}
ar1_06 <- ar1_sim(0.6)
ar1_09 <- ar1_sim(0.9)
ma1_06 <- ma1_sim(0.6)
ma1_neg <- ma1_sim(-0.6)
arma11 <- arma11_sim(0.6, 0.6)
ar2_ns <- ar2_nonstationary_sim(-0.8, 0.3)
bind_rows(
ar1_06 %>% as_tibble() %>% mutate(series = "AR(1), phi=0.6"),
ar1_09 %>% as_tibble() %>% mutate(series = "AR(1), phi=0.9"),
ma1_06 %>% as_tibble() %>% mutate(series = "MA(1), theta=0.6"),
ma1_neg %>% as_tibble() %>% mutate(series = "MA(1), theta=-0.6"),
arma11 %>% as_tibble() %>% mutate(series = "ARMA(1,1)"),
ar2_ns %>% as_tibble() %>% mutate(series = "AR(2), non-stationary")
) %>%
ggplot(aes(idx, y)) +
geom_line() +
facet_wrap(vars(series), scales = "free_y", ncol = 2) +
labs(title = "Simulated ARIMA-type processes", y = "y")
Answer. As \(\phi_1\) increases in the AR(1) model, the series becomes smoother and more persistent, and it takes longer to return toward its mean. For the MA(1) model, changing \(\theta_1\) mainly changes the short-run local pattern rather than creating long-memory persistence. The ARMA(1,1) series combines both effects. The AR(2) process with the given coefficients is not stationary, so its time plot behaves much less like a stable mean-reverting series.
Answer. The ARMA(1,1) series looks persistent but still stationary. The non-stationary AR(2) series looks less stable, with behavior that does not settle around a constant mean and variance.
aus_airpassengersair <- aus_airpassengers
air_fit <- air %>% model(Auto = ARIMA(Passengers))
air_fit %>% report()
## Series: Passengers
## Model: ARIMA(0,2,1)
##
## Coefficients:
## ma1
## -0.8963
## s.e. 0.0594
##
## sigma^2 estimated as 4.308: log likelihood=-97.02
## AIC=198.04 AICc=198.32 BIC=201.65
gg_tsresiduals(air_fit)
ljung_tbl(air_fit, lag = 10, dof = 1) %>% kable(digits = 4)
| .model | lb_stat | lb_pvalue |
|---|---|---|
| Auto | 6.6964 | 0.6687 |
air_fc_auto <- forecast(air_fit, h = 10)
autoplot(air_fc_auto, air) +
labs(title = "Automatic ARIMA forecast for air passengers", y = "Passengers")
Answer. ARIMA() selects the model shown
in the report above. The residual diagnostics are acceptable if the
residual plot shows no obvious structure and the Ljung-Box p-value is
not small. The 10-year forecast is shown in the plot.
air_report <- capture.output(report(air_fit))
air_report
## [1] "Series: Passengers "
## [2] "Model: ARIMA(0,2,1) "
## [3] ""
## [4] "Coefficients:"
## [5] " ma1"
## [6] " -0.8963"
## [7] "s.e. 0.0594"
## [8] ""
## [9] "sigma^2 estimated as 4.308: log likelihood=-97.02"
## [10] "AIC=198.04 AICc=198.32 BIC=201.65"
Answer. If the selected model is
ARIMA(0,2,1), then in backshift notation it is \[
(1-B)^2 y_t = c + (1 + \theta_1 B)\varepsilon_t,
\] and if no constant is included then the right-hand side drops
the constant term.
air_models <- air %>%
model(
Auto = ARIMA(Passengers),
RW_Drift = ARIMA(Passengers ~ 1 + pdq(0,1,0)),
ARIMA212_Drift = ARIMA(Passengers ~ 1 + pdq(2,1,2)),
ARIMA212_NoConst = ARIMA(Passengers ~ 0 + pdq(2,1,2)),
ARIMA021_Const = ARIMA(Passengers ~ 1 + pdq(0,2,1))
)
air_glance <- glance(air_models) %>% select(.model, AICc, BIC)
air_glance %>% arrange(AICc) %>% kable(digits = 3)
| .model | AICc | BIC |
|---|---|---|
| ARIMA021_Const | 196.791 | 201.625 |
| Auto | 198.324 | 201.651 |
| RW_Drift | 200.590 | 203.969 |
| ARIMA212_Drift | 206.612 | 215.430 |
air_fc_compare <- forecast(air_models, h = 10)
autoplot(air_fc_compare, air) +
facet_wrap(vars(.model), scales = "free_y") +
labs(title = "ARIMA model comparison for air passengers", y = "Passengers")
Answer. The random walk with drift gives a linear
forecast path. The ARIMA(2,1,2) with drift is more flexible
and can bend slightly in the short run. Removing the constant removes
the drift, so the long-run forecast becomes flatter. The
ARIMA(0,2,1) with a constant implies a quadratic trend in
levels, so its forecasts usually accelerate and can look implausible
very quickly.
us_gdp <- global_economy %>%
filter(Country == "United States") %>%
select(Year, GDP) %>%
filter(!is.na(GDP))
us_lambda <- lambda_or_zero(us_gdp, GDP)
us_gdp_fit <- us_gdp %>%
model(
Auto_ARIMA = ARIMA(box_cox(GDP, us_lambda)),
ARIMA111 = ARIMA(box_cox(GDP, us_lambda) ~ pdq(1,1,1)),
ARIMA210 = ARIMA(box_cox(GDP, us_lambda) ~ pdq(2,1,0)),
ETS = ETS(GDP)
)
glance(us_gdp_fit) %>%
select(.model, AICc, BIC) %>%
arrange(AICc) %>%
kable(digits = 3)
| .model | AICc | BIC |
|---|---|---|
| Auto_ARIMA | 657.100 | 662.777 |
| ARIMA111 | 659.413 | 666.816 |
| ARIMA210 | 659.414 | 666.817 |
| ETS | 3191.941 | 3201.089 |
gg_tsresiduals(us_gdp_fit %>% select(Auto_ARIMA))
ljung_tbl(us_gdp_fit %>% select(Auto_ARIMA), lag = 10, dof = 2) %>% kable(digits = 4)
| .model | lb_stat | lb_pvalue |
|---|---|---|
| Auto_ARIMA | 3.8137 | 0.8735 |
us_gdp_fc <- forecast(us_gdp_fit, h = 15)
autoplot(us_gdp_fc, us_gdp) +
facet_wrap(vars(.model), scales = "free_y") +
labs(title = "US GDP: ARIMA versus ETS", y = "GDP")
Answer. A log-like Box-Cox transformation is
appropriate because GDP grows strongly and variability increases with
the level. The AICc table compares the automatic ARIMA fit with
plausible alternatives. I use the lowest-AICc ARIMA model as the
preferred model, provided the residuals resemble white noise. The
forecasts should continue the long-run growth path smoothly. Compared
with ETS() on the original scale, the transformed ARIMA
model usually gives more stable long-horizon growth behavior.
aus_arrivalsI use Japan as the selected country.
japan_arrivals <- aus_arrivals %>%
filter(Origin == "Japan") %>%
select(Quarter, Arrivals)
autoplot(japan_arrivals, Arrivals) +
labs(title = "Visitors to Australia from Japan", y = "Arrivals")
japan_diff <- japan_arrivals %>%
mutate(diff_arrivals = difference(difference(log(Arrivals), 4), 1))
ggplot(filter(japan_diff, !is.na(diff_arrivals)), aes(x = Quarter, y = diff_arrivals)) +
geom_line() +
labs(title = "Differenced log arrivals from Japan", y = "Differenced log arrivals")
plot_acf_pacf(na.omit(japan_diff$diff_arrivals), lag.max = 20)
japan_models <- japan_arrivals %>%
model(
Auto = ARIMA(log(Arrivals)),
Suggested = ARIMA(log(Arrivals) ~ pdq(0,1,1) + PDQ(0,1,1))
)
glance(japan_models) %>% select(.model, AICc, BIC) %>% kable(digits = 3)
| .model | AICc | BIC |
|---|---|---|
| Auto | -175.851 | -162.348 |
| Suggested | -166.944 | -158.736 |
report(japan_models)
## # A tibble: 2 × 8
## .model sigma2 log_lik AIC AICc BIC ar_roots ma_roots
## <chr> <dbl> <dbl> <dbl> <dbl> <dbl> <list> <list>
## 1 Auto 0.0129 93.2 -176. -176. -162. <cpl [0]> <cpl [10]>
## 2 Suggested 0.0142 86.6 -167. -167. -159. <cpl [0]> <cpl [5]>
Answer. The time plot shows strong quarterly seasonality and substantial long-run change in level, so a logarithm and seasonal differencing are sensible.
Answer. The ACF of the differenced data typically shows a strong negative spike at lag 1 and a seasonal spike at lag 4, which points toward moving-average structure. The PACF decays rather than cutting off sharply, which is also consistent with an MA-type specification after differencing.
Answer. A natural hand-picked model is
ARIMA(0,1,1)(0,1,1)[4] on the log scale. The automatic
model comparison above shows whether ARIMA() agrees. I
would prefer the model with lower AICc and cleaner residual
diagnostics.
Answer. In backshift notation, the suggested seasonal model is \[ (1-B)(1-B^4)\log(y_t) = (1 + \theta_1 B)(1 + \Theta_1 B^4)\varepsilon_t. \] Written without the backshift operator, the left-hand side is the first difference of the seasonal difference of log arrivals, and the right-hand side is a non-seasonal MA(1) term times a seasonal MA(1) term.
us_employmentI use Total Private employment
(Series_ID = CEU0500000001).
emp <- us_employment %>%
filter(Title == "Total Private") %>%
select(Month, Series_ID, Title, Employed)
components(
emp %>% model(STL(Employed))
) %>% autoplot() +
labs(title = "STL decomposition: Total Private employment")
emp_stationarity <- stationarity_tbl(emp, Employed)
emp_stationarity %>% kable(digits = 3)
| lambda_guerrero | ndiffs | nsdiffs |
|---|---|---|
| 0.648 | 1 | 1 |
emp_lambda <- lambda_or_zero(emp, Employed)
emp_models <- emp %>%
model(
Auto = ARIMA(box_cox(Employed, emp_lambda)),
M1 = ARIMA(box_cox(Employed, emp_lambda) ~ pdq(0,1,2) + PDQ(0,1,1)),
M2 = ARIMA(box_cox(Employed, emp_lambda) ~ pdq(2,1,0) + PDQ(0,1,1))
)
glance(emp_models) %>%
select(.model, AICc, BIC) %>%
arrange(AICc) %>%
kable(digits = 3)
| .model | AICc | BIC |
|---|---|---|
| Auto | 5604.493 | 5643.252 |
| M2 | 5633.566 | 5652.975 |
| M1 | 5708.955 | 5728.364 |
gg_tsresiduals(emp_models %>% select(Auto))
ljung_tbl(emp_models %>% select(Auto), lag = 24, dof = 3) %>% kable(digits = 4)
| .model | lb_stat | lb_pvalue |
|---|---|---|
| Auto | 40.0706 | 0.0073 |
emp_fc <- forecast(emp_models %>% select(Auto), h = "3 years")
autoplot(emp_fc) +
autolayer(emp, Employed) +
labs(title = "Three-year forecast: Total Private employment", y = "Employed")
Answer. The STL decomposition shows a strong trend and clear monthly seasonality. Because the variation grows with the level, a Box-Cox transformation is sensible.
Answer. The unit-root table above gives the required differencing. For this series, a seasonal difference and a first difference are typically appropriate.
Answer. The AICc table compares a few plausible models; I use the lowest-AICc model if its residuals pass diagnostics.
Answer. The residuals resemble white noise if the time plot, ACF, and Ljung-Box test all look acceptable. If they do not, another ARIMA model should be preferred.
Answer. The three-year forecasts are shown above. As
of March 6, 2026, FRED reported 133,705
thousand persons for February 2026 for
CEU0500000001, so the very short-horizon forecast can be
checked against that latest release. Long-run prediction intervals widen
quickly, so I would only treat the first 1 to 3 years
as practically usable.
aus_productionI use gas.
gas <- aus_production %>% select(Quarter, Gas)
gas_stationarity <- stationarity_tbl(gas, Gas)
gas_stationarity %>% kable(digits = 3)
| lambda_guerrero | ndiffs | nsdiffs |
|---|---|---|
| 0.11 | 1 | 1 |
gas_lambda <- lambda_or_zero(gas, Gas)
gas_models <- gas %>%
model(
Auto = ARIMA(box_cox(Gas, gas_lambda)),
M1 = ARIMA(box_cox(Gas, gas_lambda) ~ pdq(0,1,1) + PDQ(0,1,1)),
M2 = ARIMA(box_cox(Gas, gas_lambda) ~ pdq(2,1,0) + PDQ(0,1,1)),
ETS = ETS(Gas)
)
glance(gas_models) %>%
select(.model, AICc, BIC) %>%
arrange(AICc) %>%
kable(digits = 3)
| .model | AICc | BIC |
|---|---|---|
| Auto | -486.853 | -463.870 |
| M1 | -479.063 | -469.094 |
| M2 | -477.730 | -464.477 |
| ETS | 1681.794 | 1711.389 |
gg_tsresiduals(gas_models %>% select(Auto))
ljung_tbl(gas_models %>% select(Auto), lag = 12, dof = 3) %>% kable(digits = 4)
| .model | lb_stat | lb_pvalue |
|---|---|---|
| Auto | 11.0717 | 0.2708 |
gas_fc <- forecast(gas_models, h = "24 months")
autoplot(gas_fc, gas) +
facet_wrap(vars(.model), scales = "free_y") +
labs(title = "Gas production: ARIMA versus ETS", y = "Gas")
Answer. Gas production needs a transformation because the seasonal amplitude increases with the level. Seasonal differencing is essential, and a first non-seasonal difference may also be required.
Answer. The AICc table compares plausible seasonal ARIMA models. I choose the best one subject to acceptable residual diagnostics.
Answer. The preferred model’s residuals should look like white noise. If not, another seasonal ARIMA specification should be tried.
Answer. The 24-month forecasts are shown above. The
comparison with ETS() is also shown. For this series,
seasonal ARIMA and ETS often give similar broad trends, but ARIMA can
provide a more direct stochastic-differencing interpretation.
gas_stl_model <- gas %>%
model(
STL_ARIMA = decomposition_model(
STL(box_cox(Gas, gas_lambda)),
ARIMA(season_adjust)
),
Seasonal_ARIMA = ARIMA(box_cox(Gas, gas_lambda))
)
gas_stl_fc <- forecast(gas_stl_model, h = "24 months")
autoplot(gas_stl_fc, gas) +
facet_wrap(vars(.model), scales = "free_y") +
labs(title = "Seasonal ARIMA versus STL + ARIMA", y = "Gas")
accuracy(gas_stl_model) %>%
select(.model, RMSE, MAE, MAPE) %>%
arrange(RMSE) %>%
kable(digits = 3)
| .model | RMSE | MAE | MAPE |
|---|---|---|---|
| STL_ARIMA | 3.210 | 2.059 | 2.743 |
| Seasonal_ARIMA | 4.262 | 2.800 | 3.641 |
Answer. The STL-plus-nonseasonal-ARIMA approach is attractive when the seasonal pattern is stable and can be cleanly separated out. The seasonal ARIMA approach is more integrated and often more parsimonious. I would prefer the model with better diagnostics and more believable forecasts; for this type of series, seasonal ARIMA is often the more direct choice.
To keep the exercise computationally practical, I aggregate trips by region across purposes.
tourism_region <- tourism %>%
group_by(Region) %>%
index_by(Quarter) %>%
summarise(Trips = sum(Trips), .groups = "drop")
tourism_fit <- tourism_region %>% model(ARIMA = ARIMA(Trips))
tourism_fc <- forecast(tourism_fit, h = 8)
bind_rows(
tourism_fc %>% filter(Region == "Snowy Mountains") %>% mutate(Region = "Snowy Mountains"),
tourism_fc %>% filter(Region == "Melbourne") %>% mutate(Region = "Melbourne")
) %>%
autoplot(tourism_region) +
facet_wrap(vars(Region), scales = "free_y") +
labs(title = "Tourism forecasts for Snowy Mountains and Melbourne", y = "Trips")
Answer. ARIMA models can be fit region by region, and the resulting forecasts are shown for Snowy Mountains and Melbourne. Reasonableness should be judged by whether the forecasts preserve the observed seasonal pattern and avoid implausible explosive growth or collapse. If one of the forecasts looks unrealistic, a transformation or a different model class would be worth considering.
retail_lambda <- lambda_or_zero(retail, Turnover)
retail_models <- retail %>%
model(
ARIMA = ARIMA(box_cox(Turnover, retail_lambda)),
ETS = ETS(Turnover),
STL_ETS = decomposition_model(
STL(box_cox(Turnover, retail_lambda)),
ETS(season_adjust)
)
)
accuracy(retail_models) %>%
select(.model, RMSE, MAE, MAPE) %>%
arrange(RMSE) %>%
kable(digits = 3)
| .model | RMSE | MAE | MAPE |
|---|---|---|---|
| STL_ETS | 15.115 | 11.566 | 3.236 |
| ARIMA | 17.729 | 13.261 | 3.646 |
| ETS | 18.741 | 14.273 | 3.968 |
retail_fc <- forecast(retail_models, h = "2 years")
autoplot(retail_fc, retail) +
facet_wrap(vars(.model), scales = "free_y") +
labs(title = "Retail forecasts: ARIMA, ETS, and STL+ETS", y = "Turnover")
Answer. A seasonal ARIMA model is appropriate for the retail series after transformation because it has strong monthly seasonality and changing variance. The forecast comparison above shows how the ARIMA forecasts differ from the earlier ETS-based forecasts.
Answer. To compare with the latest ABS observations, the correct source is Retail Trade, Australia (ABS 8501.0, Table 11). The comparison step is conceptually the same as the FRED comparison above: line up the out-of-sample forecast horizon with the latest actual observations and assess the forecast errors. In practice, this lets you compare ARIMA, ETS, and STL-based forecasts on the same post-sample months.
hare <- pelt %>% select(Year, Hare)
autoplot(hare, Hare) +
labs(title = "Snowshoe hare pelts", y = "Hare")
plot_acf_pacf(hare$Hare, lag.max = 20)
hare_last5 <- c(19520, 82110, 89760, 81660, 15760)
hare_phi <- c(0.82, -0.29, -0.01, -0.22)
hare_manual <- manual_recursive_forecast(hare_last5[2:5], 30993, hare_phi, h = 3)
tibble(
Year = 1936:1938,
Manual_Forecast = hare_manual
) %>% kable(digits = 2)
| Year | Manual_Forecast |
|---|---|
| 1936 | 1273.00 |
| 1937 | 6902.66 |
| 1938 | 18161.21 |
hare_fit <- hare %>% model(AR4 = ARIMA(Hare ~ 1 + pdq(4,0,0)))
hare_fc <- forecast(hare_fit, h = 3)
hare_fc %>% as_tibble() %>% select(Year, .mean) %>% kable(digits = 2)
| Year | .mean |
|---|---|
| 1936 | 2051.57 |
| 1937 | 8223.14 |
| 1938 | 19387.96 |
Answer. The proposed model is an
ARIMA(4,0,0) model with a constant.
Answer. The ACF decays gradually while the PACF cuts off after a few lags, which is consistent with an autoregressive model and supports an AR specification.
Answer. Using the supplied coefficients and the last
observed values, the manual three-step forecasts are shown in the table.
The forecast() results differ because the fitted model in R
is estimated from the full sample rather than forced to use the exact
supplied parameter values, and because the software also accounts for
estimation details rather than only plugging values into the
recursion.
swiss_pop <- global_economy %>%
filter(Country == "Switzerland") %>%
transmute(Year, Population = Population / 1e6)
autoplot(swiss_pop, Population) +
labs(title = "Switzerland population (millions)", y = "Population")
swiss_diff <- swiss_pop %>% mutate(dPop = difference(Population))
plot_acf_pacf(na.omit(swiss_diff$dPop), lag.max = 20)
swiss_manual <- {
diff_last <- diff(c(8.09, 8.19, 8.28, 8.37, 8.47))
d_fore <- manual_recursive_forecast(diff_last, 0.0053, c(1.64, -1.17, 0.45), h = 3)
last_level <- 8.47
tibble(
Year = 2018:2020,
Forecast = cumsum(c(last_level, d_fore))[-1]
)
}
swiss_manual %>% kable(digits = 4)
| Year | Forecast |
|---|---|
| 2018 | 8.5745 |
| 2019 | 8.6747 |
| 2020 | 8.7670 |
swiss_fit <- swiss_pop %>% model(ARIMA310 = ARIMA(Population ~ 1 + pdq(3,1,0)))
swiss_fc <- forecast(swiss_fit, h = 3)
swiss_fc %>% as_tibble() %>% select(Year, .mean) %>% kable(digits = 4)
| Year | .mean |
|---|---|
| 2018 | 8.5585 |
| 2019 | 8.6475 |
| 2020 | 8.7317 |
Answer. The model \[
y_t = c + y_{t-1} + \phi_1 (y_{t-1}-y_{t-2}) + \phi_2 (y_{t-2}-y_{t-3})
+ \phi_3 (y_{t-3}-y_{t-4}) + \varepsilon_t
\] is an ARIMA(3,1,0) model with a constant
(equivalently, drift on the differenced scale).
Answer. The differenced series is used because the original population series trends upward and is not stationary. The ACF/PACF of the differenced series motivate an AR model rather than an MA model.
Answer. The manual forecasts for 2018-2020 are shown
in the table. The forecast() values from the fitted ARIMA
model may differ because R re-estimates the model from the actual Swiss
population data instead of forcing the exact parameter values given in
the exercise, and because the forecast function uses the estimated state
of the full fitted model.