Stationarity in Time Series

Author: AS

1. What is Stationarity?

A stationary time series is one whose statistical properties do not change over time.

Mean is constant
Variance is constant
Autocovariance depends only on lag, not on time

https://towardsdatascience.com/stationarity-in-time-series-a-comprehensive-guide-8beabe20d68/

2. Types of Stationarity

Strict Stationarity
- The joint distribution of $(Y_{t_1}, …, Y_{t_n})$ is the same as that of $(Y_{t_1+k}, …, Y_{t_n+k})$ for every shift $k$ and every choice of time points. This is a very strong condition: not just the mean and variance but all higher-order moments (skewness, kurtosis, etc.), where they exist, are invariant over time.
- Rarely used directly in applications.

Weak (Covariance) Stationarity

Only the first two moments must be finite and time-invariant, which is much easier to satisfy:
- $E[Y_t] = \mu$ (constant mean)
- $Var(Y_t) = \sigma^2$ (constant variance)
- $Cov(Y_t, Y_{t+k})$ depends only on $k$ (lag)
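
The three conditions can be checked informally by simulation. The sketch below is only an illustration under assumed settings (the number of paths, the series length and the two time points are arbitrary choices, not taken from the text): it compares the variance across many simulated paths at two dates, for white noise and for a random walk built from the same shocks.

```r
set.seed(42)
n_paths <- 2000   # number of simulated series (assumed, for illustration)
n_obs   <- 200    # length of each series (assumed)

# Each column is one simulated path
wn <- matrix(rnorm(n_obs * n_paths), nrow = n_obs)  # white noise
rw <- apply(wn, 2, cumsum)                          # random walks from the same shocks

# Variance across paths at two time points
var_wn <- c(t50 = var(wn[50, ]), t200 = var(wn[200, ]))
var_rw <- c(t50 = var(rw[50, ]), t200 = var(rw[200, ]))

round(var_wn, 2)  # should be similar at both dates (constant variance)
round(var_rw, 2)  # should grow with t (variance depends on time)
```

White noise satisfies the constant-variance condition, while the random walk violates it because its variance grows with the time index.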

Key Difference

Strict Stationarity → requires invariance of the entire probability distribution.
Covariance Stationarity → only requires invariance of the mean, variance, and autocovariance function.

In practice:

Econometricians and data scientists usually assume covariance stationarity, because it’s sufficient for most linear time series methods.
Strict stationarity is more theoretical and rarely testable with real data.

3. Why It Matters

Many econometric models (ARMA, VAR, and GARCH, as well as ARIMA once the series has been differenced) assume a stationary series.
Without it:
- Risk of spurious regressions (high $R^2$ but meaningless results).
- Standard errors and tests become unreliable.
Stationarity, together with ergodicity, ensures that sample moments converge to population moments.
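
The spurious-regression risk can be made concrete with a small Monte Carlo experiment. This is a minimal sketch, assuming 500 replications, series of length 200 and a nominal 5% two-sided test; the helper `reject_rate` is a hypothetical name introduced only for this illustration.

```r
set.seed(1)
n_sims <- 500   # number of replications (assumed)
n_obs  <- 200   # length of each series (assumed)

# Share of regressions whose slope has |t| > 1.96, for a given simulator
reject_rate <- function(simulate) {
  mean(replicate(n_sims, {
    x <- simulate(n_obs)
    y <- simulate(n_obs)  # generated independently of x
    t_stat <- summary(lm(y ~ x))$coefficients["x", "t value"]
    abs(t_stat) > 1.96
  }))
}

rate_rw <- reject_rate(function(n) cumsum(rnorm(n)))  # independent random walks
rate_wn <- reject_rate(function(n) rnorm(n))          # independent white noise

c(random_walks = rate_rw, white_noise = rate_wn)
```

Although `x` and `y` are unrelated by construction, the regression on random walks should reject far more often than the nominal 5%, whereas the white-noise regression should stay close to it.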

4. Sources of Non-Stationarity

Trends

Deterministic: linear or polynomial trend.


# --- Packages ---------------------------------------------------------------
suppressPackageStartupMessages({
  library(tseries)       # ADF, KPSS tests
  library(forecast)      # autoplot.ts, ggtsdisplay (optional)
})

set.seed(123)

# --- Deterministic trend: linear & quadratic --------------------------------
n          <- 300
time       <- 1:n
eps        <- rnorm(n, 0, 1)

y_lin      <- 0.5 + 0.02*time + eps                      # linear trend
y_poly     <- 0.5 + 0.02*time - 0.00005*time^2 + eps     # polynomial (quadratic)

# --- Stochastic trend: random walk ------------------------------------------
u          <- rnorm(n, 0, 1)
y_rw       <- cumsum(u)                                   # random walk (unit root)

# --- Quick plots -------------------------------------------------------------
par(mfrow = c(1,3))
plot.ts(y_lin,  main = "Deterministic: Linear Trend", ylab="y")
plot.ts(y_poly, main = "Deterministic: Quadratic Trend", ylab="y")
plot.ts(y_rw,   main = "Stochastic: Random Walk", ylab="y")

Stochastic: random walk (unit root), simulated as y_rw in the code above.
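
A short derivation shows why a random walk is not covariance stationary. It assumes, for illustration only, a starting value $Y_0 = 0$ and i.i.d. shocks $u_t$ with mean zero and variance $\sigma_u^2$ (the symbol $\sigma_u^2$ is introduced here and does not appear in the text above):

$$
\begin{aligned}
Y_t &= Y_{t-1} + u_t = \sum_{s=1}^{t} u_s, \\
E[Y_t] &= 0, \\
Var(Y_t) &= t\,\sigma_u^2, \\
Cov(Y_t, Y_{t+k}) &= t\,\sigma_u^2 \quad (k \ge 0).
\end{aligned}
$$

The mean is constant, but the variance and autocovariance depend on $t$ rather than only on the lag $k$, so the second and third conditions of weak stationarity fail.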

Changing Variance

E.g., volatility clustering in finance (ARCH/GARCH).


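library(fpp3)  # attaches tsibble, dplyr, tidyr, lubridate, ggplot2, feasts and the example data sets

# amazon closing price clearly looks like a stochastic trend (unit-root-ish)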
amzn <-
  gafa_stock |>
  filter(Symbol == "AMZN") |>
  as_tsibble(index = Date)

#autoplot(amzn, Close) +
#  labs(title = "gafa_stock (AMZN): price with stochastic trend", y = "Close")

# difference the log price to remove the stochastic trend
amzn_ret <-
  amzn |>
  mutate(ret = difference(log(Close))) |>
  drop_na()

#autoplot(amzn_ret, ret) +
#  labs(title = "AMZN: differenced log price (≈ stationary)", y = "log return")

# rolling volatility visualization
amzn_ret |>
  mutate(roll_sd = slider::slide_dbl(ret, sd, .before = 20, .complete = TRUE)) |>
  ggplot(aes(Date)) +
  geom_line(aes(y = ret)) +
  geom_line(aes(y = roll_sd)) +
  labs(title = "AMZN: returns and rolling SD (≈ volatility)",
       y = "return / rolling sd", x = NULL)


# squared returns often show autocorrelation when volatility clusters
amzn_ret |>
  mutate(r2 = ret^2) |>
  ACF(r2) |>
  autoplot() +
  labs(title = "AMZN: ACF of squared returns → volatility persistence")

Structural Breaks

Policy change, financial crisis, regime shifts.


# aggregate half-hourly demand to daily; visualize possible shifts
elec_daily <-
  vic_elec |>
  index_by(Date = as_date(Time)) |>
  summarise(Demand = sum(Demand))

autoplot(elec_daily, Demand) +
  labs(title = "vic_elec: daily demand (visual check for breaks)", x = NULL, y = "MWh")


# real GDP growth for united states; look for break around the GFC
library(strucchange)

us_growth <-
  global_economy |>
  filter(Country == "United States") |>
  mutate(g = difference(log(GDP))) |>
  drop_na()

autoplot(us_growth, g) +
  labs(title = "United States: GDP growth (log diff of GDP)", y = "growth")


# estimate possible break(s) in the mean of growth
bp <- breakpoints(g ~ 1, data = as.data.frame(us_growth))
bp


     Optimal 4-segment partition: 

Call:
breakpoints.formula(formula = g ~ 1, data = as.data.frame(us_growth))

Breakpoints at observation number:
11 21 47 

Corresponding to breakdates:
0.1964286 0.375 0.8392857


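# breakdates() returns sample fractions here (no time index), so look up the break year by row
bp1 <- breakpoints(bp, breaks = 1)$breakpoints
autoplot(us_growth, g) +
  geom_vline(xintercept = us_growth$Year[bp1], linetype = 2, color = "red") +
  labs(title = "Break in mean growth (Bai–Perron 1-break)")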

5. How to Achieve Stationarity

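Detrending: Remove a deterministic trend, e.g. by regressing on time and keeping the residuals.
Differencing: $\Delta Y_t = Y_t - Y_{t-1}$; removes a stochastic trend (unit root).
Log Transformations: Stabilize a variance that grows with the level of the series.
Seasonal Differencing: $Y_t - Y_{t-s}$ for seasonal period $s$; handles stable seasonal effects.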
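
Each of the four transformations can be sketched in base R. The series below are simulated purely for illustration; the coefficients, the monthly seasonal pattern and the names `t_idx`, `level` and `y_seas` are assumptions, not data from this document.

```r
set.seed(123)
n     <- 300
t_idx <- 1:n

y_trend <- 0.5 + 0.02 * t_idx + rnorm(n)  # deterministic linear trend + noise
y_rw    <- cumsum(rnorm(n))               # stochastic trend (random walk)

# Detrending: regress on time and keep the residuals
y_detrended <- residuals(lm(y_trend ~ t_idx))

# Differencing: Delta Y_t = Y_t - Y_{t-1}
dy_rw <- diff(y_rw)

# Log transformation, then differencing: approximate growth rates
level    <- 100 * exp(cumsum(rnorm(n, mean = 0.01, sd = 0.02)))
log_diff <- diff(log(level))

# Seasonal differencing: Y_t - Y_{t-12} for monthly data
pattern <- c(5, 3, 1, 0, 1, 3, 6, 8, 6, 4, 3, 4)  # assumed monthly pattern
y_seas  <- ts(rep(pattern, 10) + rnorm(120, sd = 0.5), frequency = 12)
dy_seas <- diff(y_seas, lag = 12)
```

Detrending suits a deterministic trend, while differencing suits a stochastic one; applying the wrong remedy (for example, differencing a trend-stationary series) introduces unnecessary serial correlation in the errors.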

6. Testing for Stationarity

Visual Inspection: Time plot.
Correlogram (ACF/PACF): Slowly decaying ACF $\Rightarrow$ likely non-stationary.
Formal Tests:
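- Augmented Dickey-Fuller (ADF): null hypothesis of a unit root
- Phillips-Perron (PP): null hypothesis of a unit root
- KPSS: null hypothesis of stationarity, so a rejection signals non-stationarity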


library(forecast) 
library(tseries)

set.seed(123) 
y1 <- arima.sim(model = list(ar = 0.6),
                n = 200) # stationary AR(1) 
y2 <- cumsum(rnorm(200)) # non-stationary random walk

#par(mfrow = c(2,1)) 
plot.ts(y1, main = "Stationary AR(1) Process")


plot.ts(y2, main = "Non-Stationary Random Walk")
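
The formal tests listed above can be run on the same two kinds of series. This is a sketch rather than a definitive workflow: it uses `adf.test`, `pp.test` and `kpss.test` from the tseries package with their default lag choices, re-simulates the series under the assumed names `y_ar` and `y_rw`, and suppresses the warnings tseries emits when a p-value falls outside its tabulated range.

```r
suppressPackageStartupMessages(library(tseries))

set.seed(123)
y_ar <- arima.sim(model = list(ar = 0.6), n = 200)  # stationary AR(1)
y_rw <- cumsum(rnorm(200))                          # random walk

# ADF and PP: null hypothesis = unit root (non-stationary)
adf_ar <- suppressWarnings(adf.test(y_ar))
adf_rw <- suppressWarnings(adf.test(y_rw))
pp_ar  <- suppressWarnings(pp.test(y_ar))
pp_rw  <- suppressWarnings(pp.test(y_rw))

# KPSS: null hypothesis = level stationarity
kpss_ar <- suppressWarnings(kpss.test(y_ar, null = "Level"))
kpss_rw <- suppressWarnings(kpss.test(y_rw, null = "Level"))

results <- data.frame(
  series = c("AR(1)", "random walk"),
  adf_p  = c(adf_ar$p.value,  adf_rw$p.value),
  pp_p   = c(pp_ar$p.value,   pp_rw$p.value),
  kpss_p = c(kpss_ar$p.value, kpss_rw$p.value)
)
results
```

Because the null hypotheses point in opposite directions, the tests are best read together: a small ADF or PP p-value with a large KPSS p-value supports stationarity, and the reverse pattern supports a unit root. Note that tseries reports p-values only within the range of its tables, so values at the boundary are bounds rather than exact figures.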
