ADEC7406 Module 3 Discussion

Author

Fabian Yang

Setup

Motivation

I chose three monthly FRED series: CPIAUCSL (Consumer Price Index for All Urban Consumers: All Items in U.S. City Average), INDPRO (Industrial Production: Total Index), and TCU (Capacity Utilization: Total Index).

This mix should highlight the contrast: CPI and INDPRO are trending levels and are usually non-stationary, while TCU is a bounded rate that is often stationary or close to it (hopefully).

First, let us load the essential packages.

library(fpp3)
library(fredr)
library(tseries)
library(tsibble)
library(ggplot2)
library(lubridate)

fredr_set_key(Sys.getenv("FRED_API_KEY"))

start_date <- as.Date("1990-01-01")
end_date   <- as.Date("2025-01-01")

Load Data

cpi_raw <- fredr(
  series_id = "CPIAUCSL",
  observation_start = start_date,
  observation_end   = end_date
)

ind_raw <- fredr(
  series_id = "INDPRO",
  observation_start = start_date,
  observation_end   = end_date
)

tcu_raw <- fredr(
  series_id = "TCU",
  observation_start = start_date,
  observation_end   = end_date
)

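# Keep only date and value. With a Date index, tsibble infers a daily interval,
# so the conversion to a monthly index (yearmonth) is done later, before ACF/PACF/STL.
cpi_ts <- cpi_raw |> select(date, value) |> as_tsibble(index = date)
ind_ts <- ind_raw |> select(date, value) |> as_tsibble(index = date)
tcu_ts <- tcu_raw |> select(date, value) |> as_tsibble(index = date)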

1. Stationarity Tests

CPIAUCSL

CPIAUCSL (basically CPI) is a price level and clearly trends upward. A trend means the mean changes over time, which violates stationarity. I expect the level to be non-stationary and the first difference of the log to be stationary.

cpi_ts <- cpi_ts |>
  mutate(lcpi  = log(value),
         dlcpi = difference(lcpi))
adf.test(cpi_ts$lcpi)

    Augmented Dickey-Fuller Test

data:  cpi_ts$lcpi
Dickey-Fuller = -1.5615, Lag order = 7, p-value = 0.7628
alternative hypothesis: stationary
adf.test(na.omit(cpi_ts$dlcpi))
Warning in adf.test(na.omit(cpi_ts$dlcpi)): p-value smaller than printed
p-value

    Augmented Dickey-Fuller Test

data:  na.omit(cpi_ts$dlcpi)
Dickey-Fuller = -6.1006, Lag order = 7, p-value = 0.01
alternative hypothesis: stationary

We first test the log level of CPIAUCSL using the ADF test. The null hypothesis is that the series contains a unit root (non-stationary). For \(\log(\text{CPIAUCSL})\), the p-value is 0.7628, which is greater than 0.05. Therefore, we fail to reject the null hypothesis and treat \(\log(\text{CPIAUCSL})\) as non-stationary. This result is expected: CPI exhibits a clear upward trend over time, which violates the constant-mean assumption.
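For reference, the regression behind this test can be written out. This is a sketch of the standard ADF specification with a constant and a linear trend, which is the form the tseries documentation describes for adf.test; the symbols \(\alpha, \beta, \gamma, \delta_i, k\) are my own notation, not the package's.

```latex
\[
\Delta y_t = \alpha + \beta t + \gamma\, y_{t-1}
           + \sum_{i=1}^{k} \delta_i\, \Delta y_{t-i} + \varepsilon_t,
\qquad H_0:\ \gamma = 0 \quad \text{vs.} \quad H_1:\ \gamma < 0
\]
```

The default lag order in adf.test is \(k = \lfloor (n-1)^{1/3} \rfloor\), which equals 7 for any sample of 344 to 512 observations. That matches the "Lag order = 7" in the output above.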

Next (just for fun), we difference the log series. For \(\text{diff}(\log(\text{CPIAUCSL}))\), the ADF test returns a p-value smaller than 0.01 (the warning says the true p-value is below the printed 0.01). Since this is below 0.05, we reject the null hypothesis and conclude that the differenced series is stationary, which is again expected. Economically, the log difference approximates the monthly inflation rate, which fluctuates around a stable mean rather than trending upward.
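To make the link between the log difference and inflation concrete, here is a minimal sketch of the arithmetic. The two index values are hypothetical numbers chosen for illustration, not FRED observations.

```r
# Hypothetical CPI index values for two consecutive months (not FRED data)
cpi <- c(300.0, 300.9)

log_diff   <- diff(log(cpi))        # what difference(log(value)) computes
pct_change <- cpi[2] / cpi[1] - 1   # exact month-over-month growth rate

print(c(log_diff = log_diff, pct_change = pct_change))

# The gap between the two is roughly half the squared growth rate,
# so it is negligible at monthly inflation magnitudes
gap <- abs(log_diff - pct_change)
print(gap)
```

For small monthly changes the two measures are nearly identical, which is why dlcpi can be read as monthly inflation.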

INDPRO

Industrial production has long-run growth and cycles. Again, this violates the constant-mean condition, so I expect the level to be non-stationary.

ind_ts <- ind_ts |>
  mutate(lind  = log(value))

adf.test(ind_ts$lind)

    Augmented Dickey-Fuller Test

data:  ind_ts$lind
Dickey-Fuller = -1.6304, Lag order = 7, p-value = 0.7337
alternative hypothesis: stationary

For \(\log(\text{INDPRO})\), the p-value is 0.7337, which is greater than 0.05. Therefore, we fail to reject the null hypothesis (unit root) and treat this series as non-stationary.

TCU

Capacity utilization is a percentage between 0 and 100 and usually fluctuates around a stable mean, so it may be stationary. For this series, we test directly using KPSS, whose null hypothesis is level stationarity (the reverse of ADF).

kpss.test(tcu_ts$value, null = "Level")
Warning in kpss.test(tcu_ts$value, null = "Level"): p-value smaller than
printed p-value

    KPSS Test for Level Stationarity

data:  tcu_ts$value
KPSS Level = 2.8368, Truncation lag parameter = 5, p-value = 0.01

The p-value is smaller than 0.01 (the warning says the true p-value is below the printed 0.01), which is well below 0.05. We therefore reject the null hypothesis of level stationarity and conclude that TCU is non-stationary in levels, contrary to my initial expectation.

Again, just for fun, we difference it once:

tcu_ts <- tcu_ts |> mutate(dtcu = difference(value))
adf.test(na.omit(tcu_ts$dtcu))
Warning in adf.test(na.omit(tcu_ts$dtcu)): p-value smaller than printed p-value

    Augmented Dickey-Fuller Test

data:  na.omit(tcu_ts$dtcu)
Dickey-Fuller = -6.7612, Lag order = 7, p-value = 0.01
alternative hypothesis: stationary

After first differencing, we apply the ADF test, where the null hypothesis is a unit root (as we have seen previously). The p-value is smaller than 0.01, so we reject the null hypothesis and conclude that the differenced series is stationary. Possible reasons for the non-stationarity of TCU in levels include business cycles and structural breaks (COVID is one example).
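Because ADF and KPSS have opposite null hypotheses, running both on the same series is a useful cross-check. The sketch below does this on simulated data (a random walk and white noise), not on the FRED series; the helper name check is hypothetical.

```r
library(tseries)

set.seed(123)
n <- 400
random_walk <- cumsum(rnorm(n))   # simulated unit-root series (not FRED data)
white_noise <- rnorm(n)           # simulated stationary series

# ADF: null = unit root. KPSS: null = level stationarity.
# suppressWarnings() hides the "p-value smaller/greater than printed" notes.
check <- function(x) {
  suppressWarnings(c(
    adf_p  = adf.test(x)$p.value,
    kpss_p = kpss.test(x, null = "Level")$p.value
  ))
}

results <- rbind(random_walk = check(random_walk),
                 white_noise = check(white_noise))
print(results)
```

A large ADF p-value together with a small KPSS p-value points toward a unit root, while the reverse pattern points toward stationarity. Note that tseries reports p-values only within the range of its lookup tables, so values at the edges (such as 0.01) are bounds rather than exact figures.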

2. ACF vs. PACF

Stationary Series

ACF and PACF patterns are only interpretable for stationary series, so we must use the stationary versions. Luckily, we already have them for CPIAUCSL (dlcpi) and TCU (dtcu). We need to make INDPRO stationary as well:

ind_ts <- ind_ts |>
  mutate(dlind = difference(lind))

adf.test(na.omit(ind_ts$dlind))
Warning in adf.test(na.omit(ind_ts$dlind)): p-value smaller than printed
p-value

    Augmented Dickey-Fuller Test

data:  na.omit(ind_ts$dlind)
Dickey-Fuller = -6.9441, Lag order = 7, p-value = 0.01
alternative hypothesis: stationary

After first differencing, we have dlind, which is stationary: the ADF p-value is below 0.01, so we reject the unit-root null, by the same reasoning as in the previous section.

CPIAUCSL

cpi_m <- cpi_ts |>
  mutate(date = yearmonth(date)) |>
  as_tsibble(index = date) |>
  fill_gaps()

cpi_m |>
  filter(!is.na(dlcpi)) |>
  ACF(dlcpi) |>
  autoplot()

cpi_m |>
  filter(!is.na(dlcpi)) |>
  PACF(dlcpi) |>
  autoplot()

The ACF of \(\text{diff}(\log(\text{CPI}))\) shows a strong positive spike at lag 1 and then decays gradually, while the PACF displays a sharp cutoff after lag 1, with most higher lags statistically insignificant. A decaying ACF combined with a PACF cutoff at lag 1 is broadly consistent with an AR(1) process. Because the ACF decay is not perfectly clean, I would not be surprised if a mixed ARMA model also fits well (for instance, ARMA(1,2) looks plausible).
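As a reference point for reading these plots, the theoretical AR(1) signature can be computed with stats::ARMAacf. This is a sketch with a hypothetical coefficient (phi = 0.6), not an estimate from the CPI data.

```r
phi <- 0.6  # hypothetical AR(1) coefficient, chosen only for illustration

# Theoretical ACF at lags 0..5 and PACF at lags 1..5 for an AR(1) process
acf_ar1  <- ARMAacf(ar = phi, lag.max = 5)
pacf_ar1 <- ARMAacf(ar = phi, lag.max = 5, pacf = TRUE)

print(round(acf_ar1, 3))   # geometric decay: phi^k
print(round(pacf_ar1, 3))  # single spike at lag 1, then zero
```

The sample plots will never be this clean, but the shape to look for is the same: a tapering ACF and a PACF with one dominant spike.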

INDPRO

ind_m <- ind_ts |>
  mutate(date = yearmonth(date)) |>
  as_tsibble(index = date) |>
  fill_gaps()

ind_m |>
  filter(!is.na(dlind)) |>
  ACF(dlind) |>
  autoplot()

ind_m |>
  filter(!is.na(dlind)) |>
  PACF(dlind) |>
  autoplot()

The ACF shows significant spikes at the first two lags followed by a rapid cutoff, while the PACF decays gradually. This suggests that industrial production growth is primarily driven by short-run shocks with limited persistence. An ACF cutoff after lag 2 combined with a decaying PACF is the textbook signature of an MA(2) process, and this series is close to it.
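The moving-average counterpart can be checked the same way with stats::ARMAacf. The coefficients below are hypothetical values for illustration, not estimates from INDPRO.

```r
theta <- c(0.5, 0.3)  # hypothetical MA(2) coefficients, for illustration only

acf_ma2  <- ARMAacf(ma = theta, lag.max = 6)               # lags 0..6
pacf_ma2 <- ARMAacf(ma = theta, lag.max = 6, pacf = TRUE)  # lags 1..6

print(round(acf_ma2, 3))   # nonzero at lags 1 and 2, zero afterwards
print(round(pacf_ma2, 3))  # nonzero values persist past lag 2 and tail off
```

For an MA(q) process the ACF is zero beyond lag q, while the PACF has no such cutoff, which is the mirror image of the AR case.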

TCU

tcu_m <- tcu_ts |>
  mutate(date = yearmonth(date)) |>
  as_tsibble(index = date) |>
  fill_gaps()

tcu_m |>
  filter(!is.na(dtcu)) |>
  ACF(dtcu) |>
  autoplot()

tcu_m |>
  filter(!is.na(dtcu)) |>
  PACF(dtcu) |>
  autoplot()

Looking at the ACF graph, we can see a clear cutoff after the first lag. The PACF graph, by contrast, does not cut off sharply and instead decays after lag 2. An ACF cutoff at lag 1 combined with a decaying PACF points to a low-order moving-average specification. Overall, I would say that the differenced TCU series exhibits MA(1) behavior.
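Reading orders off ACF and PACF plots is a heuristic, so one way to sanity-check a visual guess is to fit a few candidate orders and compare an information criterion. The sketch below does this with base R's arima() on a simulated MA(1) series. The data, the coefficient 0.5, and the candidate set are all assumptions for illustration, and no such fit was run on the FRED series here.

```r
set.seed(1)
# Simulated MA(1) series (not FRED data), theta = 0.5 chosen arbitrarily
x <- arima.sim(model = list(ma = 0.5), n = 500)

# Candidate low-order models, all fitted to the same series
fits <- list(
  ma1    = arima(x, order = c(0, 0, 1)),
  ar1    = arima(x, order = c(1, 0, 0)),
  arma11 = arima(x, order = c(1, 0, 1))
)

# Lower AIC indicates a better trade-off between fit and complexity
aic <- sapply(fits, AIC)
print(round(aic, 1))
```

The same comparison could be run on dtcu, dlind, or dlcpi by substituting the series for x.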

3. Decomposition Methods

cpi_m |>
  model(stl = STL(value)) |>
  components() |>
  autoplot() +
  labs(title = "CPIAUCSL Decomposition (STL)")

ind_m |>
  model(stl = STL(value)) |>
  components() |>
  autoplot() +
  labs(title = "INDPRO Decomposition (STL)")

tcu_m |>
  model(stl = STL(value)) |>
  components() |>
  autoplot() +
  labs(title = "TCU Decomposition (STL)")

The STL decomposition of CPIAUCSL reveals a strong upward trend over the entire sample period, consistent with non-stationarity in the level series. Seasonal fluctuations are present but small relative to the trend component (a scale of about 0.8 versus 300), which is unsurprising because CPIAUCSL is a seasonally adjusted series. The non-stationarity is therefore driven primarily by the trend (deterministic or stochastic) rather than by seasonality. First differencing the log series was sufficient to achieve stationarity, as the ADF test in Section 1 showed.

INDPRO exhibits a clear long-run trend interspersed with business cycles. Unlike CPI, the estimated seasonal component here is larger relative to the trend. Because INDPRO is also published seasonally adjusted, I would read this as residual seasonality picked up by STL rather than as firm evidence that production is highly sensitive to recurring yearly patterns. The decomposition also highlights the severity of exogenous shocks: while the 2008 crisis is smoothed into the trend component, the 2020 pandemic appears as a massive outlier in the remainder component, indicating a shock too abrupt for the trend-cycle to capture fully.

The decomposition of TCU differs from the others by showing mean-reverting cyclical behavior around a slowly declining long-term average. Rather than an upward trend, the trend component drifts downward (lower peaks in recent decades). Like INDPRO, the remainder component is dominated by a sharp negative spike in 2020, suggesting that the non-stationarity is driven by this drift and by distinct structural breaks rather than by seasonality.
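The scale comparisons used in this section rest on the fact that STL is additive: trend, seasonal, and remainder sum back to the original series. Here is a minimal sketch of that check using base R's stl() on a simulated monthly series, so it needs no extra packages. The data and the s.window = "periodic" setting are assumptions for illustration and differ from the STL(value) defaults used above.

```r
set.seed(42)
n  <- 240
tt <- seq_len(n)
# Simulated monthly series (not FRED data): linear trend + yearly cycle + noise
y <- ts(0.05 * tt + 2 * sin(2 * pi * tt / 12) + rnorm(n, sd = 0.5),
        start = c(2000, 1), frequency = 12)

fit   <- stl(y, s.window = "periodic")
parts <- fit$time.series   # columns: seasonal, trend, remainder

# Additivity check: the three components sum back to the data
recon_error <- max(abs(rowSums(parts) - as.numeric(y)))
print(recon_error)

# Range of each component, the same kind of scale comparison used in the text
spread <- apply(parts, 2, function(z) diff(range(z)))
print(round(spread, 2))
```

Comparing the range of each component in this way gives a quick numeric counterpart to eyeballing the panel scales in the decomposition plots.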