library(tidyverse)
library(forecast)
library(tseries)

Stationarity

Strictly Stationary

A strictly stationary time series is one for which the probabilistic behavior of every collection of values: \[\{x_{t_1}, x_{t_2},\dots, x_{t_k}\}\] is identical to that of the time shifted set \[\{x_{t_1+h}, x_{t_2+h},\dots, x_{t_k+h}\}\] That is,

\[P\{x_{t_1}\leq c_1,\dots,x_{t_k}\leq c_k\} = P\{x_{t_1+h}\leq c_1,\dots,x_{t_k+h}\leq c_k\}\] for all \(k = 1, 2,\dots\), all time points \(t_1, t_2,\dots, t_k\), all numbers \(c_1, c_2,\dots, c_k\), and all time shifts \(h = 0, \pm 1, \pm 2, \dots\).

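In simpler terms, if \(k=1\) then for any two time points \(s\) and \(t\), \(P\{x_s\leq c\} = P\{x_t \leq c\}\); that is, every observation has the same marginal distribution.

Taking \(k=2\), if the variance function of the process exists, strict stationarity implies that the autocovariance function of the series \(x_t\) satisfies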

\[\gamma(s, t) =\gamma(s + h, t + h)\]

for all \(s\) and \(t\) and \(h\). We may interpret this result by saying the autocovariance function of the process depends only on the time difference between \(s\) and \(t\), and not on the actual times.
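
As a short worked step (using only the identity above and the symmetry of covariance), choosing the shift \(h = -t\) gives

\[\gamma(s, t) = \gamma(s - t,\, 0),\]

so the autocovariance is a function of \(s - t\) alone. Since \(\gamma(s, t) = \gamma(t, s)\), it depends on the two time points only through \(|s - t|\).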

Weakly Stationary

A weakly stationary time series, \(x_t\), is a finite variance process such that:

  1. the mean value function, \(\mu_t\), is constant and does not depend on time \(t\), and
  2. the autocovariance function, \(\gamma(s, t)\), depends on \(s\) and \(t\) only through their difference \(|s-t|\).

The above conditions can be written as:

  • \(E[X_t] = \mu\) (constant mean)
  • \(Var(X_t) = \sigma^2\) (constant variance)
  • \(Cov(X_t, X_{t+h})\) depends only on the lag \(h\), not on \(t\)
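
The first two conditions can be checked informally on data by comparing summary statistics across different stretches of a series. The sketch below is only an illustration, not a formal test: `half_summary()` is a hypothetical helper introduced here, and the seed and series length are arbitrary choices.

```r
# Hypothetical helper: mean and variance of the first and second half of a series.
half_summary <- function(x) {
  x <- as.numeric(x)
  n <- length(x)
  mid <- n %/% 2
  halves <- list(first = x[1:mid], second = x[(mid + 1):n])
  # One column per half, rows "mean" and "var"
  sapply(halves, function(h) c(mean = mean(h), var = var(h)))
}

set.seed(2024)            # arbitrary seed, for reproducibility only
z <- rnorm(400)           # Gaussian white noise, stationary by construction
res <- half_summary(z)
round(res, 3)
```

For a stationary series the two columns should be close to each other; large differences in the mean or variance rows suggest that one of the conditions fails.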

From now on we will use the term stationary to mean weakly stationary; if a process is stationary in the strict sense, we will use the term strictly stationary.

It should be clear from the discussion of strict stationarity following the definition that a strictly stationary, finite variance, time series is also stationary.

The converse is not true unless there are further conditions. One important case where stationarity implies strict stationarity is if the time series is Gaussian [meaning all finite distributions of the series are Gaussian].

Visual Examples of Stationarity

set.seed(1234)
# Stationary: white noise
wn <- ts(rnorm(200, mean = 0, sd = 1))


# as.numeric() drops the ts class so ggplot2 can pick a continuous scale without a message
ggplot(data.frame(time = 1:200, value = as.numeric(wn)), aes(x = time, y = value)) +
  geom_line(color = "steelblue") +
  labs(title = "White Noise: A Stationary Time Series", x = "Time", y = "Value")

Question 1

Simulate another white noise series with a different seed and compare visually. Do they both look stationary?

Example: Trend

trend <- ts(1:200 + rnorm(200, mean = 0, sd = 10))


# as.numeric() drops the ts class so ggplot2 can pick a continuous scale without a message
ggplot(data.frame(time = 1:200, value = as.numeric(trend)), aes(x = time, y = value)) +
  geom_line(color = "darkred") +
  labs(title = "Non-Stationary Time Series", x = "Time", y = "Value")

Question 2

What rule of stationarity does the graph above break?

Example

set.seed(456)
var_series <- c(rnorm(100, 0, 1), rnorm(100, 0, 5))


ggplot(data.frame(time = 1:200, value = var_series), aes(x = time, y = value)) +
geom_line(color = "purple") +
labs(title = "Non-Stationary Series", x = "Time", y = "Value")

Question 3

What rule of stationarity does the graph above break?

Testing Stationarity

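One common statistical test is the Augmented Dickey-Fuller (ADF) test. Its null hypothesis is that the series has a unit root (and is therefore non-stationary), so a small p-value is evidence in favor of stationarity. Note that adf.test() from the tseries package includes a constant and a linear trend in its test regression, so a small p-value for a trending series, such as trend below, indicates stationarity around a deterministic trend rather than around a constant mean.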

adf.test(wn)
## Warning in adf.test(wn): p-value smaller than printed p-value
## 
##  Augmented Dickey-Fuller Test
## 
## data:  wn
## Dickey-Fuller = -5.8727, Lag order = 5, p-value = 0.01
## alternative hypothesis: stationary
adf.test(trend)
## Warning in adf.test(trend): p-value smaller than printed p-value
## 
##  Augmented Dickey-Fuller Test
## 
## data:  trend
## Dickey-Fuller = -5.7636, Lag order = 5, p-value = 0.01
## alternative hypothesis: stationary
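
The ADF result for a trending series can look surprising, so here is a minimal sketch of what it does and does not say. It assumes a freshly simulated series (`trend_demo`, a name introduced here, with an arbitrary seed) and removes the trend with an ordinary least-squares line; this is one possible way to detrend, not the only one.

```r
library(tseries)

set.seed(11)                                  # arbitrary seed for illustration
time_idx <- 1:200
trend_demo <- time_idx + rnorm(200, mean = 0, sd = 10)

# ADF p-value; the warning only says the true p-value is below the printed 0.01
adf_p <- suppressWarnings(adf.test(trend_demo))$p.value

# Remove the fitted linear trend and keep the residuals
fit <- lm(trend_demo ~ time_idx)
detrended <- residuals(fit)

# Half-sample means: very different for the raw series, similar after detrending
raw_means <- c(first = mean(trend_demo[1:100]), second = mean(trend_demo[101:200]))
det_means <- c(first = mean(detrended[1:100]),  second = mean(detrended[101:200]))
adf_p
round(raw_means, 1)
round(det_means, 1)
```

The small p-value coexists with a mean that clearly changes over time, which is why the ADF result should be read together with a plot of the series.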

Question 4

Run the ADF test on the var_series. What is the conclusion?

Autocorrelation Function (ACF)

The Autocorrelation Function (ACF) is a tool used to measure the relationship between observations in a time series and their past values at different time lags. In other words, it quantifies how strongly the series is correlated with itself over time. For a stationary time series, the autocorrelations depend only on the lag between points, not on the actual position in time. The ACF is especially helpful for identifying whether patterns such as persistence, seasonality, or trends are present in the data. For example, a white noise series (completely random) will show very small autocorrelations near zero at all lags, while a strongly autocorrelated series will display significant correlations at early lags that taper off gradually.

When interpreting an ACF plot, the x-axis represents the lag (how many time steps apart observations are), and the y-axis represents the correlation strength at each lag. The horizontal dashed lines usually indicate a confidence interval (often 95%), and any spike that extends beyond these lines suggests statistically significant correlation at that lag. A slow decay of correlations across many lags often signals non-stationarity, such as in a random walk. Sharp spikes at seasonal intervals can indicate periodic behavior. By examining the pattern of autocorrelations, analysts can better understand the structure of the time series and decide on appropriate transformations or models, such as choosing ARIMA parameters.

acf(wn, main = "ACF of White Noise")
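
To make the definition concrete, the sketch below computes the sample autocorrelations by hand and compares them with acf(). It assumes the usual estimator with divisor \(n\) (the one acf() uses by default) and an arbitrary simulated series; `sample_acov()` is a helper name introduced here.

```r
set.seed(42)                      # arbitrary seed for illustration
x <- rnorm(200)
n <- length(x)
xbar <- mean(x)

# Sample autocovariance at lag k, with divisor n
sample_acov <- function(k) {
  sum((x[(k + 1):n] - xbar) * (x[1:(n - k)] - xbar)) / n
}

lags <- 0:10
manual_acf  <- sapply(lags, sample_acov) / sample_acov(0)
builtin_acf <- as.numeric(acf(x, lag.max = 10, plot = FALSE)$acf)

# The two versions should agree up to floating-point error
all.equal(manual_acf, builtin_acf)

# Approximate 95% bounds drawn as dashed lines on the ACF plot
bound <- qnorm(0.975) / sqrt(n)
bound
```

With 200 observations the bounds are roughly \(\pm 0.14\), so for white noise about 1 in 20 lags can be expected to cross them by chance alone.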

Question 5

Plot the ACF of the trend series. What do you observe compared to white noise?

Random Walk (Non-Stationary)

A random walk:

\[X_t = X_{t-1} + \epsilon_t\]

is not stationary, since the variance grows with time.

set.seed(101)
random_walk <- cumsum(rnorm(200))


ggplot(data.frame(time = 1:200, value = random_walk), aes(x = time, y = value)) +
geom_line(color = "brown") +
labs(title = "Random Walk: Non-Stationary", x = "Time", y = "Value")
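
If \(X_0 = 0\) and the \(\epsilon_t\) are independent with variance \(\sigma^2\), then \(X_t\) is a sum of \(t\) independent terms and \(Var(X_t) = t\sigma^2\). A single path cannot show this directly, so the sketch below simulates many independent random walks and estimates the variance across paths at each time point. The number of paths and the seed are arbitrary choices.

```r
set.seed(2025)                    # arbitrary seed for illustration
n_paths <- 2000
n_steps <- 200

# Each column holds the increments of one random walk (sigma = 1)
steps <- matrix(rnorm(n_steps * n_paths), nrow = n_steps)

# Cumulative sums down each column give the walks themselves
walks <- apply(steps, 2, cumsum)

# Variance across paths at each time point
var_by_time <- apply(walks, 1, var)
round(var_by_time[c(10, 50, 200)], 1)
```

The estimated variances should be close to 10, 50, and 200, growing roughly linearly in \(t\).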

Question 6

Apply the ADF test to the random walk. What do you conclude?

Seasonality and Stationarity

Series with seasonality are not stationary unless the seasonal pattern is removed.

seasonal <- ts(sin(2 * pi * (1:200)/12) + rnorm(200, 0, 0.5))


# as.numeric() drops the ts class so ggplot2 can pick a continuous scale without a message
ggplot(data.frame(time = 1:200, value = as.numeric(seasonal)), aes(x = time, y = value)) +
  geom_line(color = "blue") +
  labs(title = "Seasonal Time Series", x = "Time", y = "Value")
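
One common way to remove a seasonal pattern is seasonal differencing, \(x_t - x_{t-12}\). The sketch below assumes a known period of 12, matching the simulated series above, and uses freshly simulated data with an arbitrary seed; other approaches, such as subtracting estimated seasonal means, are equally valid.

```r
set.seed(7)                                   # arbitrary seed for illustration
t_idx  <- 1:240
signal <- sin(2 * pi * t_idx / 12)            # deterministic seasonal component
noise  <- rnorm(240, 0, 0.5)
s      <- signal + noise

# The periodic component cancels under lag-12 differencing
max_signal_diff <- max(abs(diff(signal, lag = 12)))
max_signal_diff                               # numerically zero

# So the differenced series contains only differenced noise
s_diff <- diff(s, lag = 12)
all.equal(s_diff, diff(noise, lag = 12))
```

The differenced series is 12 observations shorter than the original, and its noise term is \(\epsilon_t - \epsilon_{t-12}\) rather than \(\epsilon_t\).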

Question 7

Why is time series data with seasonal trends not stationary?

Expectation, Variance, Covariance, and Correlation Properties

You may consult the properties below to help with the following questions:

  • \(E(aX+bY)=aE(X)+bE(Y)\)
  • \(Var(X)\geq0\)
  • \(Var(X+Y)=Var(X)+Var(Y)\) if \(X\) and \(Y\) are independent
  • \(Cov(a+bX,c+dY)=bdCov(X,Y)\)
  • \(Var(X+Y)=Var(X)+Var(Y)+2Cov(X,Y)\)
  • \(Cov(X+Y,Z)=Cov(X,Z)+Cov(Y,Z)\)
  • \(Cov(X,X)=Var(X)\)
  • \(\rho=Corr(X,Y)=\frac{Cov(X,Y)}{\sqrt{Var(X)Var(Y)}}\)
  • \(Corr(a+bX,c+dY)=sign(bd)Corr(X,Y)\)
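
Several of these identities also hold exactly for sample variances, covariances, and correlations, which makes them easy to sanity-check numerically. The sketch below uses arbitrary simulated data and arbitrary constants (`a`, `b`, `c0`, `d` are names chosen here for illustration).

```r
set.seed(99)                      # arbitrary seed for illustration
X <- rnorm(1000)
Y <- 0.6 * X + rnorm(1000)        # correlated with X by construction
a <- 2; b <- -3; c0 <- 5; d <- 4

# Cov(a + bX, c + dY) = bd Cov(X, Y)
check_cov <- all.equal(cov(a + b * X, c0 + d * Y), b * d * cov(X, Y))

# Var(X + Y) = Var(X) + Var(Y) + 2 Cov(X, Y)
check_var <- all.equal(var(X + Y), var(X) + var(Y) + 2 * cov(X, Y))

# Corr(a + bX, c + dY) = sign(bd) Corr(X, Y)
check_cor <- all.equal(cor(a + b * X, c0 + d * Y), sign(b * d) * cor(X, Y))

c(cov = check_cov, var = check_var, cor = check_cor)
```

All three checks should return TRUE; here \(bd < 0\), so the correlation flips sign.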

Question 8

Suppose \(Y_t = 5 + 2t + X_t\), where \(\{X_t\}\) is a zero-mean stationary series with autocovariance function \(\gamma_k\).

Part a

Find the mean function for \(\{Y_t\}\).

Part b

Find the autocovariance function for \(\{Y_t\}\).

Part c

Is \(\{Y_t\}\) stationary? Why or why not?

Question 9

Let \(\{X_t\}\) be a stationary time series, and define \[Y_t=\begin{cases}X_t, & \text{if } t \text{ is even}\\ X_t+3, & \text{if } t \text{ is odd}\end{cases}\]

Part a

Show that \(Cov(Y_t,Y_{t-k})\) is free of \(t\) for all lags \(k\).

Part b

Is \(\{Y_t\}\) stationary?

Question 10

Suppose \(Cov(X_t,X_{t-k}) = \gamma_k\) is free of \(t\) but that \(E(X_t) = 3t\).

Part a

Is \(\{X_t\}\) stationary?

Part b

Let \(Y_t = 7 - 3t + X_t\). Is \(\{Y_t\}\) stationary?

Question 11

Let \(Y_t = \epsilon_t - \theta(\epsilon_{t-1})^2\). For this exercise, assume that the white noise series \(\{\epsilon_t\}\) is normally distributed.

Part a

Find the autocorrelation function for \(\{Y_t\}\).

Part b

Is \(\{Y_t\}\) stationary?