Module 8 Discussion: ARIMA, Seasonal ARIMA, and WN/RW/Drift

Author

Student

Published

April 9, 2026

# knitr::opts_chunk$set(include = FALSE)
library(fredr)
library(forecast)
library(tseries)
library(ggplot2)
library(gridExtra)

fredr_set_key("d6d98961e60f83744f3b36f9d0f37582")

Data: U.S. Retail Trade Sales (RSXFS)

I chose RSXFS because it has the properties which stress-test ARIMA’s assumptions of an upward trend, strong December seasonality whose amplitude grows with the level, and the COVID-era structural break. This is a series where differencing decision will have real consequences.

The seasonal amplitude grows with the level, which is a classic sign of multiplicative seasonality. A log transform will stabilize the variance and make the seasonality additive.

rsxfs_log <- log(rsxfs)

autoplot(rsxfs_log) +
  ggtitle("Log(RSXFS)") +
  ylab("Log(Millions $)") +
  theme_minimal()

Part I: Non-Seasonal ARIMA: Manual Selection vs. Auto

Step 1: Assess Stationarity

The ADF test (p = 0.6178) fails to reject the null of non-stationarity. The KPSS test (p = 0.01) rejects the null of stationarity. Both point the same direction: this series is non-stationary. We difference once.

rsxfs_d1 <- diff(rsxfs_log, differences = 1)

autoplot(rsxfs_d1) +
  ggtitle("First Difference of Log(RSXFS)") +
  ylab("Differenced Value") +
  theme_minimal()

adf.test(rsxfs_d1)

Warning in adf.test(rsxfs_d1): p-value smaller than printed p-value


    Augmented Dickey-Fuller Test

data:  rsxfs_d1
Dickey-Fuller = -6.5764, Lag order = 4, p-value = 0.01
alternative hypothesis: stationary

kpss.test(rsxfs_d1)

Warning in kpss.test(rsxfs_d1): p-value greater than printed p-value


    KPSS Test for Level Stationarity

data:  rsxfs_d1
KPSS Level = 0.060546, Truncation lag parameter = 4, p-value = 0.1

After one difference: ADF (p = 0.01) rejects non-stationarity, KPSS (p = 0.1) fails to reject stationarity. Both confirm d = 1 is sufficient.

cat("Variance of original:     ", var(rsxfs_log), "\n")

Variance of original:      0.0292284

cat("Variance at d=1:          ", var(diff(rsxfs_log, differences = 1)), "\n")

Variance at d=1:           0.0005978095

cat("Variance at d=2:          ", var(diff(rsxfs_log, differences = 2)), "\n")

Variance at d=2:           0.001338211

Variance drops from 0.0292 to 0.000598 at d=1, then increases to 0.00134 at d=2. That increase is our signal. At d=2 we would be overdifferencing; introducing an artificial MA structure and amplifying noise.

Step 2: ACF and PACF of the Differenced Series

par(mfrow = c(2, 1))
Acf(rsxfs_d1, lag.max = 36,
    main = "ACF of Differenced Log(RSXFS)")
Pacf(rsxfs_d1, lag.max = 36,
     main = "PACF of Differenced Log(RSXFS)")

Reading the non-seasonal lags (1-11):

The ACF shows significant spikes at lags 1 and 2, then cuts off. This is the signature of an MA(2) process.
The PACF decays gradually through the early lags rather than cutting off sharply, which is consistent with MA behavior (PACF tails off for MA processes, cuts off for AR).

At the seasonal lags (12, 24), there are prominent spikes in both ACF and PACF, indicating strong seasonality. We set that aside for Part II.

Manual selection: ARIMA(0, 1, 2)

p = 0: No AR terms. The PACF doesn’t cut off cleanly at a low lag; it decays, which points to MA rather than AR.
d = 1: Confirmed by unit root tests and the variance check.
q = 2: The ACF cuts off after lag 2, giving us two significant MA terms.

Step 3: Auto ARIMA (Non-Seasonal)

auto_ns <- auto.arima(rsxfs_log,
                      seasonal = FALSE,
                      stepwise = FALSE,
                      approximation = FALSE)
summary(auto_ns)

Series: rsxfs_log 
ARIMA(0,1,2) with drift 

Coefficients:
          ma1      ma2   drift
      -0.1826  -0.2781  0.0041
s.e.   0.0867   0.0858  0.0012

sigma^2 = 0.0005526:  log likelihood = 283.67
AIC=-559.33   AICc=-558.99   BIC=-548.15

Training set error measures:
                       ME       RMSE        MAE          MPE       MAPE
Training set 9.291295e-05 0.02311977 0.01127746 0.0005223104 0.08621465
                  MASE        ACF1
Training set 0.2054318 0.007468978

Auto.arima selected ARIMA(0,1,2) with drift, which exactly matches matching my manual reading. The coefficients:

ma1 = −0.183: The model corrects ~18% of last month’s forecast error. Modest correction.
ma2 = −0.278: It also corrects ~28% of the error from two months ago. This is actually a larger correction than ma1, suggesting the two-month-ago error carries more predictive weight, possibly reflecting bimonthly reporting patterns or delayed consumer spending effects.
drift = 0.0041: A small but consistent upward drift of ~0.4% per month in log terms, which translates to roughly 5% annual growth in retail sales.

The AICc is −558.99. Since my manual selection matches exactly, no further reconciliation is needed. The ACF/PACF reading and the algorithmic optimization converged on the same model.

Step 4: Forecast

n <- length(rsxfs_log)
h <- 12
train <- window(rsxfs_log, end = time(rsxfs_log)[n - h])
test <- window(rsxfs_log, start = time(rsxfs_log)[n - h + 1])

fit_train <- auto.arima(train,
                        seasonal = FALSE,
                        stepwise = FALSE,
                        approximation = FALSE)
fc <- forecast(fit_train, h = h)

autoplot(fc) +
  autolayer(test, series = "Actual", PI = FALSE, linewidth = 1.1) +
  ggtitle("Non-Seasonal ARIMA Forecast vs Actual: Log(RSXFS)") +
  ylab("Log(Millions $)") +
  xlab("Year") +
  guides(colour = guide_legend(title = "Series")) +
  theme_minimal()

Warning in ggplot2::geom_line(ggplot2::aes(x = .data[["timeVal"]], y =
.data[["seriesVal"]], : Ignoring unknown parameters: `PI`

The forecast is a smooth upward line. It captures the drift but completely misses the seasonal oscillation. The actual values weave above and below the forecast as the months cycle. The prediction intervals widen steadily, which is the accumulating uncertainty from the random walk component (d=1). This is exactly the limitation that motivates adding seasonal terms.

Part II: Seasonal ARIMA on RSXFS

Fit Full SARIMA

sarima_fit <- auto.arima(rsxfs_log,
                         seasonal = TRUE,
                         stepwise = FALSE,
                         approximation = FALSE)
summary(sarima_fit)

Series: rsxfs_log 
ARIMA(0,1,2) with drift 

Coefficients:
          ma1      ma2   drift
      -0.1826  -0.2781  0.0041
s.e.   0.0867   0.0858  0.0012

sigma^2 = 0.0005526:  log likelihood = 283.67
AIC=-559.33   AICc=-558.99   BIC=-548.15

Training set error measures:
                       ME       RMSE        MAE          MPE       MAPE
Training set 9.291295e-05 0.02311977 0.01127746 0.0005223104 0.08621465
                  MASE        ACF1
Training set 0.2054318 0.007468978

A critical finding: auto.arima selected ARIMA(0,1,2) with drift, the same model as Part I, with no seasonal terms. It decided the seasonal parameters (P, D, Q) were not worth the AICc penalty.

This is worth interrogating. The ACF/PACF clearly showed seasonal spikes at lags 12 and 24. So why did the algorithm pass on seasonal terms? The likely reason: on this particular 2016-present window, the COVID disruption created so much noise in the seasonal pattern that the algorithm judged the seasonal parameters to be unreliable. The AICc penalty for the extra parameters outweighed the improvement in fit. This is the algorithm being conservative, not necessarily correct.

To demonstrate what a seasonal model looks like, I force seasonal differencing:

sarima_forced <- Arima(rsxfs_log,
                       order = c(0, 1, 2),
                       seasonal = list(order = c(0, 1, 1), period = 12))
summary(sarima_forced)

Series: rsxfs_log 
ARIMA(0,1,2)(0,1,1)[12] 

Coefficients:
          ma1      ma2     sma1
      -0.1460  -0.2986  -1.0000
s.e.   0.0913   0.0907   0.1011

sigma^2 = 0.0005911:  log likelihood = 239.89
AIC=-471.79   AICc=-471.4   BIC=-461.02

Training set error measures:
                        ME       RMSE      MAE          MPE       MAPE
Training set -0.0001333308 0.02266301 0.012017 -0.001149936 0.09169032
                  MASE       ACF1
Training set 0.2189035 0.01079473

coef(sarima_forced)

       ma1        ma2       sma1 
-0.1460497 -0.2986256 -0.9999886

Interpreting the forced SARIMA(0,1,2)(0,1,1)[12] coefficients:

ma1, ma2: Same role as before, correcting the non-seasonal forecast errors at lags 1 and 2.
sma1 (Seasonal MA): This coefficient adjusts the current forecast using the forecast error from the same month last year. If negative (typical), a positive error last January pulls this January’s forecast downward. This is a self-correcting mechanism at the seasonal frequency.
D = 1 (Seasonal differencing): We subtract $Y_{t-12}$ from $Y_t$. This is the blunt subtraction I find worth questioning because it assumes the seasonal pattern is perfectly rigid from year to year. For retail sales, where e-commerce is shifting the seasonal shape, that’s a strong assumption.

Diagnostics

checkresiduals(sarima_fit)


    Ljung-Box test

data:  Residuals from ARIMA(0,1,2) with drift
Q* = 23.754, df = 22, p-value = 0.3603

Model df: 2.   Total lags used: 24

Ljung-Box p-value = 0.3603, well above 0.05, so we fail to reject white noise. The residuals show no remaining systematic structure. The histogram is roughly normal. The model passes its diagnostic checks.

However, passing diagnostics doesn’t mean the model has captured everything it could. The residual ACF at seasonal lags might show marginal spikes that fall just inside the significance bands. That structure is that’s present but not statistically significant given the sample size. This is where the question of information loss becomes practical: the algorithm decided that structure wasn’t worth modeling, but it’s still there.

Seasonal ARIMA Forecast

sarima_train <- auto.arima(train,
                           seasonal = TRUE,
                           stepwise = FALSE,
                           approximation = FALSE)
sarima_fc <- forecast(sarima_train, h = h)

autoplot(sarima_fc) +
  autolayer(test, series = "Actual", PI = FALSE, linewidth = 1.1) +
  ggtitle("Seasonal ARIMA Forecast vs Actual: Log(RSXFS)") +
  ylab("Log(Millions $)") +
  xlab("Year") +
  guides(colour = guide_legend(title = "Series")) +
  theme_minimal()

Warning in ggplot2::geom_line(ggplot2::aes(x = .data[["timeVal"]], y =
.data[["seriesVal"]], : Ignoring unknown parameters: `PI`

Since auto.arima chose the same non-seasonal model for both, the forecasts look identical. The seasonal pattern in the actuals is visible but unmodeled. This result itself is informative. It illustrates how the AICc-driven model selection can choose parsimony over pattern capture when the data is noisy enough (the COVID-era volatility effectively drowned out the seasonal signal in the algorithm’s cost-benefit calculation).

For a cleaner comparison, the forced SARIMA(0,1,2)(0,1,1)[12] forecast would show the seasonal oscillation:

sarima_forced_train <- Arima(train,
                              order = c(0, 1, 2),
                              seasonal = list(order = c(0, 1, 1), period = 12))
sarima_forced_fc <- forecast(sarima_forced_train, h = h)

autoplot(sarima_forced_fc) +
  autolayer(test, series = "Actual", PI = FALSE, linewidth = 1.1) +
  ggtitle("Forced SARIMA(0,1,2)(0,1,1)[12] Forecast vs Actual") +
  ylab("Log(Millions $)") +
  xlab("Year") +
  guides(colour = guide_legend(title = "Series")) +
  theme_minimal()

Warning in ggplot2::geom_line(ggplot2::aes(x = .data[["timeVal"]], y =
.data[["seriesVal"]], : Ignoring unknown parameters: `PI`

Part III: White Noise, Random Walk, and Random Walk with Drift

The Three Processes: Equations

1. White Noise (WN): \[Y_t = \varepsilon_t, \quad \varepsilon_t \overset{iid}{\sim} N(0, \sigma^2)\]

$\varepsilon_t$: independent random shock at time $t$
No dependence on any past value. Each observation is a fresh draw

2. Random Walk (RW): \[Y_t = Y_{t-1} + \varepsilon_t\]

Expanding recursively: $Y_t = Y_0 + \sum_{i=1}^{t} \varepsilon_i$

Today = yesterday + random shock
Every shock is permanently embedded in the current value through cumulative summation

3. Random Walk with Drift: \[Y_t = \delta + Y_{t-1} + \varepsilon_t\]

Expanding: $Y_t = Y_0 + \delta t + \sum_{i=1}^{t} \varepsilon_i$

$\delta$: constant drift added each period, representing the systematic trend component
The series drifts by $\delta$ per period on average, plus accumulated noise

The Connection: Differencing Is the Bridge

Difference a RW: \[Y_t - Y_{t-1} = \varepsilon_t \quad \Rightarrow \quad \text{White Noise}\]

Difference a RW with Drift: \[Y_t - Y_{t-1} = \delta + \varepsilon_t \quad \Rightarrow \quad \text{White Noise} + \text{constant}\]

Differencing undoes the accumulation. A Random Walk is ARIMA(0,1,0), and differencing once ($d=1$) returns us to stationarity. This is the mechanical heart of the “I” in ARIMA.

Simulation

set.seed(42)
n <- 300
sigma <- 1
delta <- 0.3

wn <- rnorm(n, mean = 0, sd = sigma)
rw <- cumsum(rnorm(n, mean = 0, sd = sigma))
rw_drift <- cumsum(rnorm(n, mean = delta, sd = sigma))

par(mfrow = c(3, 1), mar = c(4, 4, 3, 1))

plot(wn, type = "l", col = "steelblue",
     main = "White Noise: Pure Randomness",
     ylab = "Value", xlab = "Time")
abline(h = 0, lty = 2, col = "gray50")

plot(rw, type = "l", col = "darkorange",
     main = "Random Walk: Accumulated Randomness",
     ylab = "Value", xlab = "Time")

plot(rw_drift, type = "l", col = "firebrick",
     main = "Random Walk with Drift (delta = 0.3): Accumulated Randomness + Trend",
     ylab = "Value", xlab = "Time")

Visual comparison:

WN: Static around zero. No direction, no memory. Each value is independent of the last.
RW: Wanders unpredictably. It can look like it has trends, but those are illusory. There is no systematic direction. The variance fans out over time.
RW with Drift: Wanders and climbs. The drift $\delta = 0.3$ acts as an escalator underneath the random fluctuations.

Verify via Differencing

par(mfrow = c(2, 1), mar = c(4, 4, 3, 1))

plot(diff(rw), type = "l", col = "steelblue",
     main = "Differenced Random Walk -> White Noise",
     ylab = "Value", xlab = "Time")
abline(h = 0, lty = 2, col = "gray50")

plot(diff(rw_drift), type = "l", col = "firebrick",
     main = "Differenced RW with Drift -> White Noise + Constant",
     ylab = "Value", xlab = "Time")
abline(h = delta, lty = 2, col = "black", lwd = 2)

cat("Mean of differenced RW:          ", round(mean(diff(rw)), 4),
    " (should be approx 0)\n")

Mean of differenced RW:           -0.0276  (should be approx 0)

cat("Mean of differenced RW w/ Drift: ", round(mean(diff(rw_drift)), 4),
    " (should be approx", delta, ")\n")

Mean of differenced RW w/ Drift:  0.2358  (should be approx 0.3 )

The differenced RW has mean −0.0276, close to zero. The differenced RW with drift has mean 0.2358, reasonably close to the true $\delta = 0.3$. The gap is sampling variability. With 300 observations, the estimate is noisy. Both results confirm the theoretical relationship.

Real-World Demonstration: RSXFS as RW with Drift (and COVID Regime Break)

rsxfs_diff <- diff(rsxfs_log)

par(mfrow = c(2, 1), mar = c(4, 4, 3, 1))

plot(rsxfs_log, main = "Log(RSXFS): Resembles RW with Drift",
     ylab = "Log(Millions $)", col = "firebrick", lwd = 1.5)

plot(rsxfs_diff,
     main = "Differenced Log(RSXFS): WN + Constant (Note COVID Shock)",
     ylab = "Month-over-Month Change", col = "steelblue", lwd = 1)
abline(h = mean(rsxfs_diff), lty = 2, col = "black", lwd = 2)

cat("Estimated drift (mean of differenced series):", round(mean(rsxfs_diff), 5), "\n")

Estimated drift (mean of differenced series): 0.00412

cat("Estimated sigma (sd of differenced series):  ", round(sd(rsxfs_diff), 5), "\n")

Estimated sigma (sd of differenced series):   0.02445

The estimated drift is 0.00412, meaning retail sales grow about 0.4% per month in log terms (~5% annually). The standard deviation is 0.02445. The differenced series looks approximately like white noise with a small positive mean, except for the COVID shock in early 2020, which is a massive outlier.

That COVID shock is exactly the kind of event that exposes the constant-$\delta$ assumption. The drift didn’t just change. It collapsed and then overcorrected. A constant drift model treats that entire episode as noise, absorbing it into a wider $\sigma$ rather than recognizing it as a regime shift. The prediction interval built from that inflated $\sigma$ would be too wide during stable periods and still too narrow for another structural break. This is the core limitation: the model can’t distinguish between randomness and structural change.

Summary Comparison: Applied to RSXFS

Property	White Noise ARIMA(0,0,0)	Random Walk ARIMA(0,1,0)	RW with Drift ARIMA(0,1,0)+drift
AIC	-81.76 (worst fit)	-550.29	-551.71 (best fit)
sigma²	0.0292 (huge residual variance)	0.000611	0.000599 (lowest)
Trend	None. Ignores the upward movement in retail sales entirely	None, but wanders. Captures level shifts but not the systematic growth	Captures the ~5% annual growth in retail sales ($\delta$ = 0.00412/month)
Shock	Each month independent. COVID has no lasting effect in the model	COVID shock accumulates permanently into the level	COVID shock accumulates permanently, plus the drift continues regardless
Memory	None. January tells you nothing about February	Permanent. Every past shock is baked into the current retail sales level	Permanent. All past shocks persist, plus systematic drift compounds
Stationarity	Assumes yes. Treats RSXFS as fluctuating around a fixed mean, which it clearly does not	No. Variance grows over time, which matches the widening spread in RSXFS	No. Both mean and variance grow, which matches the trending and spreading behavior of RSXFS
Forecast	Flat line at the series mean (~$12.1 log). Useless for a trending series	Flat line from the last observed value. No direction, just “tomorrow = today”	Upward trending line from the last value at slope $\delta$ = 0.0041/month. Best matches RSXFS behavior
COVID Impact	Treated as just another data point around the mean	The drop and recovery are permanent level shifts absorbed into the walk	The drop and recovery are permanent, and the constant drift assumption is exposed as too rigid