library(fpp3)
library(ggplot2)
set.seed(69)Module 8 Discussion
Part I: ARIMA Model Fitting
egypt <- global_economy |>
filter(Country == "Egypt, Arab Rep.") |>
select(Year, Exports)
egypt |> autoplot(Exports) +
labs(title = "Egyptian Exports (% of GDP)", y = "% of GDP")The time series shows considerable fluctuation over time without a strong linear trend, but there are extended periods of increase and decrease, suggesting possible non-stationarity.
ACF and PACF Analysis
egypt |> features(Exports, unitroot_kpss)# A tibble: 1 × 2
kpss_stat kpss_pvalue
<dbl> <dbl>
1 0.192 0.1
egypt |> features(Exports, unitroot_ndiffs)# A tibble: 1 × 1
ndiffs
<int>
1 0
egypt |> ACF(Exports) |> autoplot() + labs(title = "ACF of Exports")egypt |> PACF(Exports) |> autoplot() + labs(title = "PACF of Exports")The ACF of the original series shows a slow, gradual decay — a hallmark of a non-stationary or strongly autoregressive series. The PACF shows a large significant spike at lag 1, with subsequent lags mostly within the significance bounds.
egypt |> ACF(difference(Exports)) |> autoplot() +
labs(title = "ACF of Differenced Exports")egypt |> PACF(difference(Exports)) |> autoplot() +
labs(title = "PACF of Differenced Exports")After differencing, the ACF and PACF help us identify the MA and AR orders respectively:
- If the ACF cuts off sharply after lag \(q\), this suggests an MA(\(q\)) component.
- If the PACF cuts off sharply after lag \(p\), this suggests an AR(\(p\)) component.
Based on the patterns observed, we select a candidate model. Given the typical behavior of this series (ACF with possible cutoff near lag 1-2, PACF with possible cutoff near lag 1), reasonable candidates include ARIMA(0,1,1), ARIMA(1,1,0), or ARIMA(1,1,1).
Manual vs Auto ARIMA
fit <- egypt |>
model(
manual_111 = ARIMA(Exports ~ pdq(1,1,1)),
manual_011 = ARIMA(Exports ~ pdq(0,1,1)),
manual_110 = ARIMA(Exports ~ pdq(1,1,0)),
auto = ARIMA(Exports)
)
fit# A mable: 1 x 4
manual_111 manual_011 manual_110 auto
<model> <model> <model> <model>
1 <ARIMA(1,1,1)> <ARIMA(0,1,1)> <ARIMA(1,1,0)> <ARIMA(2,0,1) w/ mean>
glance(fit) |> select(.model, AICc, BIC)# A tibble: 4 × 3
.model AICc BIC
<chr> <dbl> <dbl>
1 manual_111 298. 304.
2 manual_011 296. 300.
3 manual_110 296. 300.
4 auto 294. 303.
The model with the lowest AICc is preferred.
fit |> select(auto) |> report()Series: Exports
Model: ARIMA(2,0,1) w/ mean
Coefficients:
ar1 ar2 ma1 constant
1.6764 -0.8034 -0.6896 2.5623
s.e. 0.1111 0.0928 0.1492 0.1161
sigma^2 estimated as 8.046: log likelihood=-141.57
AIC=293.13 AICc=294.29 BIC=303.43
Reconciliation: The auto algorithm selected an ARIMA(2,0,1) model with a non-zero mean. This differs from our manual candidates, which all applied a first difference (\(d=1\)). The automatic selection suggests that while the original ACF decayed slowly (often a sign of non-stationarity), the algorithm found that a second-order autoregressive process (\(p=2\)) along with an MA(1) term was sufficient to capture the high persistence in the data without strictly requiring differencing. By optimizing for the lowest AICc, the algorithm determined that modeling this persistent AR(2) process with a mean provided a more parsimonious and better-fitting specification than forcing a strict \(d=1\) difference.
Forecasting
fc <- fit |> select(auto) |> forecast(h = 5)
fc |>
autoplot(egypt, level = 95) +
labs(title = "Egyptian Exports: Forecast with 95% Prediction Interval",
y = "% of GDP")The black line shows historical values, the blue line shows point forecasts, and the shaded region is the 95% prediction interval. The interval widens over the forecast horizon because uncertainty accumulates — each future step depends on the previous unknown values plus a new shock.
Part II: Seasonal ARIMA
myseries <- aus_retail |>
filter(
State == "New South Wales",
Industry == "Department stores"
) |>
select(Month, Turnover)
myseries |> autoplot(Turnover) +
labs(title = "NSW Department Store Turnover", y = "Turnover (A$ Million)")The plot shows a clear upward trend and recurring seasonal peaks (typically around December due to holiday shopping), confirming this data is well-suited for a Seasonal ARIMA model.
Fit a Seasonal ARIMA Model
fit_seasonal <- myseries |>
model(sarima = ARIMA(Turnover))
fit_seasonal |> report()Series: Turnover
Model: ARIMA(2,1,1)(1,1,2)[12]
Coefficients:
ar1 ar2 ma1 sar1 sma1 sma2
-0.3126 -0.1833 -0.7290 -0.6271 0.2332 -0.3270
s.e. 0.0682 0.0609 0.0523 0.2523 0.2483 0.0968
sigma^2 estimated as 370.4: log likelihood=-1872.01
AIC=3758.02 AICc=3758.29 BIC=3786.43
The report() output shows the selected ARIMA(p,d,q)(P,D,Q)[12] specification. The “[12]” denotes a seasonal period of 12 months. Let’s examine each component:
- d (non-seasonal differencing): Removes the overall trend. A first difference (\(d=1\)) transforms the data so the mean level is stable over time.
- D (seasonal differencing): Removes the repeating seasonal pattern. A seasonal difference (\(D=1\)) subtracts the value from the same month last year, eliminating the annual cycle.
- p (non-seasonal AR order): The AR coefficients capture how much the current value depends on recent past values (at lags 1, 2, …, \(p\)). A positive AR1 coefficient means high values tend to follow high values in the short run.
- q (non-seasonal MA order): The MA coefficients capture how much recent forecast errors influence the current value. They model the short-run adjustment to shocks.
- P (seasonal AR order): Captures year-over-year autoregressive dependence. For example, a seasonal AR1 coefficient tells us how much this January depends on last January (after differencing).
- Q (seasonal MA order): Captures how seasonal forecast errors propagate. A seasonal MA1 coefficient indicates that a shock felt in, say, last December still influences this December.
fit_seasonal |> tidy()# A tibble: 6 × 6
.model term estimate std.error statistic p.value
<chr> <chr> <dbl> <dbl> <dbl> <dbl>
1 sarima ar1 -0.313 0.0682 -4.58 5.97e- 6
2 sarima ar2 -0.183 0.0609 -3.01 2.76e- 3
3 sarima ma1 -0.729 0.0523 -13.9 1.28e-36
4 sarima sar1 -0.627 0.252 -2.49 1.33e- 2
5 sarima sma1 0.233 0.248 0.939 3.48e- 1
6 sarima sma2 -0.327 0.0968 -3.38 7.98e- 4
The tidy() output provides the statistical significance of our model’s components. Looking at the p-values, almost all coefficients (ar1, ar2, ma1, sar1, and sma2) are highly significant (p-value < 0.05). However, it is interesting to note that the sma1 (Seasonal MA 1) coefficient has a p-value of approximately 0.348, indicating it is not statistically significant at the 5% level. This suggests that while the auto.arima algorithm included it to minimize the overall AICc, this specific first-order seasonal shock term doesn’t strongly drive the model compared to the second-order seasonal shock (sma2) and the seasonal autoregressive term (sar1). The negative non-seasonal AR coefficients (ar1 and ar2) suggest an oscillating, mean-reverting behavior in the short-term differences.
Forecast
fit_seasonal |> forecast(h = 24) |>
autoplot(myseries |> filter(year(Month) >= 2010)) +
labs(title = "Seasonal ARIMA Forecast: NSW Department Stores",
y = "Turnover (A$ Million)")The forecast reproduces the seasonal pattern (peaks in December, troughs in February) and continues the overall trend. The prediction intervals widen over the 2-year horizon as uncertainty grows.
Part III: White Noise, Random Walk, and Random Walk with Drift
Equations and Assumptions
White Noise (WN)
\[Y_t = \varepsilon_t, \quad \varepsilon_t \sim \text{iid } N(0, \sigma^2)\]
- Each observation is an independent, identically distributed random draw.
- No dependence on the past whatsoever.
- \(E[Y_t] = 0\), \(\text{Var}(Y_t) = \sigma^2\) — both constant over time.
Random Walk (RW)
\[Y_t = Y_{t-1} + \varepsilon_t, \quad \varepsilon_t \sim \text{iid } N(0, \sigma^2)\]
- Today’s value is yesterday’s value plus a new shock.
- By recursive substitution (assuming \(Y_0 = 0\)): \(Y_t = \sum_{i=1}^{t} \varepsilon_i\) — a cumulative sum of shocks.
- \(E[Y_t] = 0\), but \(\text{Var}(Y_t) = t\sigma^2\) — variance grows linearly with time.
Random Walk with Drift (RW + Drift)
\[Y_t = \delta + Y_{t-1} + \varepsilon_t, \quad \varepsilon_t \sim \text{iid } N(0, \sigma^2)\]
- Same as RW but a constant \(\delta\) is added each period.
- By recursive substitution: \(Y_t = \delta t + \sum_{i=1}^{t} \varepsilon_i\).
- \(E[Y_t] = \delta t\) — the mean drifts linearly. \(\text{Var}(Y_t) = t\sigma^2\) — variance still grows.
Relationship Through Differencing
Differencing a Random Walk gives White Noise: \[\Delta Y_t = Y_t - Y_{t-1} = (Y_{t-1} + \varepsilon_t) - Y_{t-1} = \varepsilon_t \quad \Rightarrow \quad \text{WN}\]
Differencing a Random Walk with Drift gives White Noise + constant: \[\Delta Y_t = Y_t - Y_{t-1} = (\delta + Y_{t-1} + \varepsilon_t) - Y_{t-1} = \delta + \varepsilon_t \quad \Rightarrow \quad \text{WN} + \delta\]
Conversely, a RW is the cumulative sum of WN, and a RW with drift is the cumulative sum of WN + constant. They are linked through integration (cumulation) and differencing.
Simulation and Visualization
n <- 200
sigma <- 1
delta <- 0.2
eps <- rnorm(n, mean = 0, sd = sigma)
wn <- eps
rw <- cumsum(eps)
rw_drift <- cumsum(delta + eps)
sim_data <- tibble(
t = rep(1:n, 3),
value = c(wn, rw, rw_drift),
process = rep(c("White Noise", "Random Walk", "RW with Drift"), each = n)
) |>
mutate(process = factor(process, levels = c("White Noise", "Random Walk", "RW with Drift")))
ggplot(sim_data, aes(x = t, y = value)) +
geom_line(color = "steelblue") +
facet_wrap(~process, scales = "free_y", ncol = 1) +
labs(title = "Simulated WN, RW, and RW with Drift",
x = "Time", y = "Value") +
theme_minimal()What to observe in the plots:
- White Noise fluctuates rapidly around zero with no visible pattern — values at different times are unrelated.
- Random Walk wanders away from zero in an unpredictable path — it has no tendency to return to any particular level. The range it covers grows over time.
- RW with Drift shows the same wandering behavior but superimposed on a systematic upward trend (because \(\delta = 0.2 > 0\)). The drift pulls the series steadily upward on average.
Verify the Differencing Relationship
rw_diff <- diff(rw)
rw_drift_diff <- diff(rw_drift)
verify_data <- tibble(
t = rep(1:(n-1), 2),
value = c(rw_diff, rw_drift_diff),
series = rep(c("Differenced RW → WN (mean ≈ 0)",
"Differenced RW+Drift → WN + δ (mean ≈ 0.2)"), each = n-1)
)
ggplot(verify_data, aes(x = t, y = value)) +
geom_line(color = "steelblue") +
facet_wrap(~series, ncol = 1) +
geom_hline(data = tibble(
series = c("Differenced RW → WN (mean ≈ 0)",
"Differenced RW+Drift → WN + δ (mean ≈ 0.2)"),
yint = c(0, delta)),
aes(yintercept = yint), color = "red", linetype = "dashed") +
labs(title = "Verification: Differencing Recovers White Noise",
x = "Time", y = "Value") +
theme_minimal()The red dashed lines show the theoretical means. Differencing the RW produces noise centered at 0 (pure WN). Differencing the RW with drift produces noise centered at \(\delta = 0.2\) (WN + constant). This confirms the mathematical relationship. ✓
cat("Mean of differenced RW: ", mean(rw_diff), "\n")Mean of differenced RW: 0.0277581
cat("Mean of differenced RW+Drift: ", mean(rw_drift_diff), "\n")Mean of differenced RW+Drift: 0.2277581
cat("Expected mean of WN: 0\n")Expected mean of WN: 0
cat("Expected mean of WN + delta: 0.2\n")Expected mean of WN + delta: 0.2
The sample means are close to the theoretical values (small deviations are expected due to finite sample size).
Comparison Table
| Property | White Noise | Random Walk | RW with Drift |
|---|---|---|---|
| Trend | None | None (no systematic direction) | Linear trend (direction set by sign of \(\delta\)) |
| Shock | Independent at each \(t\); no accumulation | Each shock \(\varepsilon_t\) permanently changes the level | Each shock permanently changes the level |
| Memory | None — past shocks have zero effect on future | Permanent — every past shock is embedded in \(Y_t\) forever | Permanent — same as RW |
| Stationarity | Yes — constant mean (0) and constant variance (\(\sigma^2\)) | No — mean is 0 but variance (\(t\sigma^2\)) grows without bound | No — both mean (\(\delta t\)) and variance (\(t\sigma^2\)) change over time |
| Forecast | Flat line at the mean (0) | Flat line at the last observed value \(Y_T\) | Trending line from last value: \(Y_T + \delta h\) for \(h\) steps ahead |