Time Series: Stochastic process

We have seen a number of examples of time series in the last two chapters. We can now say that a time series is a collection of observations made sequentially in time.
Our interest is not in series that are deterministic but in those whose values behave according to the laws of probability.
As such, each observation \(x_t\) at time \(t\) of a time series is a realization of a random variable \(X_t\). In this chapter, we discuss the fundamentals of the statistical analysis of time series. To begin, we must be more careful in our definition of a time series: a time series is a special type of stochastic process.

Definition

A time series is a stochastic process \(\{X_t|t\in T\}\), that is, a collection of random variables \(X_t\) ordered sequentially over a time index set \(T\). If \(T=\{0, 1, 2,...\}\) or \(T=\{0,\pm 1,\pm 2,...\}\), we refer to it as a discrete parameter time series. If \(T=(-\infty ,\infty )\) or \(T=(0, \infty)\), the series is a continuous parameter process.

An important part of the analysis of a time series is the selection of a suitable probability model (or class of models) for the data. To allow for the possibly unpredictable nature of future observations, it is natural to suppose that each observation \(x_t\) is a realized value of a certain random variable \(X_t\).

Simplification

A time series model for the observed data \(\{x_t\}\) is a specification of the joint distributions (or possibly only the means and covariances) of a sequence of random variables \(\{X_t\}\) of which \(\{x_t\}\) is postulated to be a realization.

Second-order or Weak Stationarity

Specifying a complete probabilistic time series model for the sequence of random variables \(\{X_1,X_2, . . .\}\) is usually avoided.
Instead we specify only the first- and second-order moments of the joint distributions, i.e., the expected values \(E(X_t)\) and the expected products \(E(X_{t+h}X_t ), t = 1, 2, . . ., h = 0, 1, 2, . . .\), focusing on properties of the sequence \(\{X_t\}\) that depend only on these. Such properties of \(\{X_t\}\) are referred to as second-order properties.

Measuring dependence

We now discuss various measures that describe the general behavior of a time series process as it evolves over time. While defining these measures, we shall be restricting our attention only to the second-order properties as stated before.

Let \(\{X_t\}\) be a time series with \(E(X_t^2)<\infty\).

The mean function of \(\{X_t\}\) is \(\mu_X(t)= E(X_t)\).

The covariance function of \(\{X_t\}\) is

\(\gamma_X(r, s) = Cov(X_r,X_s)\)

\(= E[(X_r-\mu_X(r))(X_s-\mu_X(s))]\)

for all integers r and s.
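
As a rough illustration, the mean and covariance functions can be approximated by averaging over many independent realizations of a process. The sketch below assumes Gaussian white noise with variance 1 observed at four time points; the names `n_rep`, `n_time`, `X`, `mu_hat`, and `gamma_hat` are illustrative and not part of these notes.

```r
set.seed(1)                      # arbitrary seed, for reproducibility only
n_rep  <- 5000                   # number of independent realizations (assumed)
n_time <- 4                      # time points t = 1, ..., 4

# Each row is one realization (x_1, ..., x_4) of iid N(0, 1) noise.
X <- matrix(rnorm(n_rep * n_time), nrow = n_rep, ncol = n_time)

mu_hat    <- colMeans(X)         # column t estimates mu_X(t)
gamma_hat <- cov(X)              # entry (r, s) estimates gamma_X(r, s)

round(mu_hat, 2)
round(gamma_hat, 2)
```

For this process the estimated means should be close to 0 and the estimated covariance matrix close to the identity, since \(\gamma_X(r,s)=0\) whenever \(r\neq s\).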

Weakly or Second-order stationary time series

A weakly (second-order) stationary time series is a finite-variance process for which

(i) the mean value function, \(\mu_X(t)\), is constant, i.e., does not depend on \(t\), and
(ii) the autocovariance function, \(\gamma_X(r,s)\), depends on the times \(r\) and \(s\) only through their time difference or lag \(|s-r|=h\).

Covariance Function

In view of the condition (ii), whenever we use the term covariance function with reference to a stationary time series \(\{X_t\}\), we shall mean the function \(\gamma_X\) of one variable, defined by

\(\gamma_X(h) = \gamma_X(t+h,t)\).

The function \(\gamma_X(.)\) will be referred to as the autocovariance function and \(\gamma_X(h)\) as its value at lag h.

The covariance function of a stationary process is not a function of time, but a function of the time lag.

ACVF and ACF

Let \(\{X_t\}\) be a stationary time series. The autocovariance function (ACVF) of \(\{X_t\}\) at lag h is defined as \[\begin{equation} \gamma_X(h) =\gamma_X(t + h, t)= Cov(X_{t+h},X_t). \end{equation}\] The autocorrelation function (ACF) of \(\{X_t\}\) at lag h is \[\begin{equation} \rho_X(h) = \frac{\gamma_X(h)}{\gamma_X(0)}= Cor(X_{t+h},X_t). \end{equation}\]

Why Autocorrelation from Correlation?

Correlation measures the linear association between a pair of variables. It is obtained by standardising the covariance, that is, by dividing the covariance by the product of the standard deviations of the two variables.
A value of \(+1\) or \(-1\) indicates an exact linear association, with the pairs falling on a straight line of positive or negative slope respectively.

Why?

In time series, observations tend to be serially correlated; the corresponding measure of linear dependence is called autocorrelation.
The adjective "auto", which means self, refers to the relation between the same variable at different time points.
Because it is a correlation, we have \(-1\le \rho(h)\le 1\) for all \(h\), enabling one to assess the relative importance of a given autocorrelation value by comparing it with the extreme values \(-1\) and \(1\).

ACVF and ACF of stationary time series

In this section, we will learn the properties of the autocovariance and autocorrelation functions for stationary time series. If a time series is weakly stationary, then the autocovariance function only depends on h. Thus, for stationary processes, we denote this autocovariance function by \(\gamma(h)\). Similarly, the autocorrelation function for a stationary process is given by \(\rho(h)=\frac{\gamma(h)}{\gamma(0)}\) . The autocovariance function of a stationary time series satisfies the following properties:

Theorem

\(\gamma(0)\ge 0\).
\(|\gamma(h)|\le\gamma(0)\) for all \(h\).
\(\gamma(.)\) is even, i.e., \(\gamma(h)=\gamma(-h)\) for all \(h\).
The function \(\gamma(.)\) is positive semidefinite. That is, for any set of time points \(t_1, t_2,...,t_k \in T\) and all real \(a_1, a_2,...,a_k\), we have \[\begin{equation} \sum_{i=1}^{k}\sum_{j=1}^{k}a_i\gamma(t_i-t_j)a_j\ge0. \end{equation}\]

Theorem

Proof. The first property is simply the statement that \(\gamma(0)=Var(X_t)\ge0\). The second is an immediate consequence of the fact that correlations are less than or equal to \(1\) in absolute value (equivalently, of the Cauchy–Schwarz inequality). The third is established by observing that \[\begin{equation*} \gamma(h)=Cov(X_{t+h},X_t)=Cov(X_t,X_{t+h})=\gamma(-h). \end{equation*}\] To prove the fourth, let \(a=(a_1,\dots,a_k)\), let \(X=(X_{t_1},\dots,X_{t_k})^T\) with covariance (dispersion) matrix \(D(X)=[\gamma(t_i-t_j)]_{i,j=1}^{k}\), and let \(W=\sum_{i=1}^{k}a_iX_{t_i}\). Since a variance is never negative, \[\begin{align*} 0 \le Var(W) &= a\,D(X)\,a^T\\ &= \sum_{i=1}^{k}\sum_{j=1}^{k}a_i\gamma(t_i-t_j)a_j, \end{align*}\] and the result follows.

The inequality (3.3) is equivalent to the statement that the autocovariance matrix \[\begin{align} \Gamma_k & = \begin{pmatrix} \gamma_0 & \gamma_1 & \cdots & \gamma_k \\ \gamma_1 & \gamma_0 & \cdots & \gamma_{k-1} \\ \vdots & \vdots & \ddots & \vdots \\ \gamma_k & \gamma_{k-1} & \cdots & \gamma_0 \\ \end{pmatrix}, \end{align}\] where \(\gamma_h=\gamma(h)\), is positive semidefinite for each \(k\).

Autocorrelation

The autocorrelation function satisfies the following analogous properties:

Theorem

\(\rho(0) = 1\).
\(|\rho(h)| \le 1\) for all \(h\).
\(\rho(.)\) is even, i.e., \(\rho(h)=\rho(-h)\) for all \(h\).
The function \(\rho(.)\) is positive semidefinite, and the matrix \[\begin{align} \mathrm{P}_k & = \begin{pmatrix} 1 & \rho_1 & \cdots & \rho_k \\ \rho_1 & 1 & \cdots & \rho_{k-1} \\ \vdots & \vdots & \ddots & \vdots \\ \rho_k & \rho_{k-1} & \cdots & 1 \\ \end{pmatrix}, \end{align}\] where \(\rho_h=\rho(h)\), is positive semidefinite for each \(k\).
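
The positive semidefiniteness of the autocorrelation matrix can be checked numerically for a specific model. The sketch below is only an illustration: it assumes a stationary AR(1) model with coefficient 0.96 and \(k = 10\), uses `ARMAacf()` from the `stats` package for the theoretical autocorrelations, and the names `rho`, `P_k`, `ev`, `a`, and `q` are illustrative.

```r
k   <- 10
rho <- ARMAacf(ar = 0.96, lag.max = k)   # theoretical rho(0), ..., rho(k)

# (k + 1) x (k + 1) matrix whose (i, j) entry is rho(|i - j|)
P_k <- toeplitz(as.numeric(rho))

# A symmetric matrix is positive semidefinite iff all eigenvalues are >= 0.
ev <- eigen(P_k, symmetric = TRUE, only.values = TRUE)$values
min(ev) >= 0

# Equivalently, the quadratic form a P_k a^T is nonnegative for any real a.
set.seed(2)                              # arbitrary seed
a <- rnorm(k + 1)
q <- drop(t(a) %*% P_k %*% a)
q >= 0
```

A single random vector `a` does not prove the property, of course; the eigenvalue check is the conclusive one for the chosen \(k\).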

Estimation of Autocorrelation

For data analysis, only the sample values, \(x_1, x_2, . . . , x_n,\) are available for estimating the mean, autocovariance, and autocorrelation functions. In this case, the assumption of stationarity becomes critical and allows the use of averaging to estimate the population mean and covariance functions. Accordingly, if a time series is stationary, the mean function \(\mu_t = \mu\) is constant so we can estimate it by the sample mean, \[\begin{equation} \overline{x} = \frac{1}{n} \sum_{i=1}^{n} x_i \end{equation}\] The sample autocovariance function is defined as

\[\begin{equation} \hat{\gamma}(h) = \frac{1}{n} \sum_{t=1}^{n-h} (x_{t+h} - \bar{x})(x_t - \bar{x}) \end{equation}\] for \(h=0,1,2,...,n-1\). The sample variance function is given by \[\begin{equation} \hat{\gamma}(0) = \frac{1}{n} \sum_{t=1}^{n} (x_{t} - \bar{x})^2 \end{equation}\]

Sample Autocorrelation

The sample autocorrelation function is defined as \[\begin{equation} \hat{\rho}(h) = \frac{\hat{\gamma}(h)}{\hat{\gamma}(0)} = \frac{\sum_{t=1}^{n-h} (x_{t+h} - \bar{x})(x_t - \bar{x})}{\sum_{t=1}^{n} (x_t - \bar{x})^2} \end{equation}\] for \(h=0,1,...n-1\). The sum in the numerator above runs over a restricted range because \(x_{t+h}\) is not available for \(t + h > n\). Note that we are in fact estimating the autocovariance function by \(\hat{\gamma}(h)\), with \(\hat{\gamma}(-h)=\hat{\gamma}(h)\) for \(h=0,1,...,n-1\). That is, we divide by \(n\) even though there are only \(n-h\) pairs of observations at lag h, \(\{(x_{t+h},x_t);t=1,...,n-h\}\).

This ensures that the sample autocovariance function is itself nonnegative definite, like a true autocovariance function, so that, for example, estimating \(Var(\bar{x})\) by replacing \(\gamma_x(h)\) with \(\hat{\gamma}_x(h)\) cannot give a negative value.
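
The sample formulas above can be computed directly and compared with R's built-in `acf()`, which also divides by \(n\). This is a minimal sketch on simulated data; the helper `sample_acvf()` and the other variable names are hypothetical and introduced only for illustration.

```r
# Sample autocovariance at lag h, with divisor n (not n - h).
sample_acvf <- function(x, h) {
  n    <- length(x)
  xbar <- mean(x)
  sum((x[(1 + h):n] - xbar) * (x[1:(n - h)] - xbar)) / n
}

set.seed(3)                       # arbitrary seed
x       <- rnorm(200)             # any observed series could be used here
max_lag <- 10

gamma_hat <- sapply(0:max_lag, function(h) sample_acvf(x, h))
rho_hat   <- gamma_hat / gamma_hat[1]    # gamma_hat[1] is lag 0

# Built-in estimates for comparison
gamma_r <- drop(acf(x, lag.max = max_lag, type = "covariance", plot = FALSE)$acf)
rho_r   <- drop(acf(x, lag.max = max_lag, plot = FALSE)$acf)

all.equal(gamma_hat, gamma_r)
all.equal(rho_hat, rho_r)
```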

Large-Sample Distribution of the ACF

If \(x_t\) is white noise, then for large \(n\) and under mild conditions, the sample ACF, \(\hat{\rho}_x(h)\), for \(h = 1, 2, . . . , H\), where \(H\) is fixed but arbitrary, is approximately normally distributed with zero mean and standard deviation \(\frac{1}{\sqrt{n}}\).

Based on this property, we obtain a rough method for assessing whether a series is white noise by determining how many values of \(\hat{\rho}(h)\) are outside the interval \(\pm 1.96\frac{1}{\sqrt{n}}\) (roughly two standard errors); for white noise, approximately 95% of the sample autocorrelations should lie within these limits.
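
The rough white-noise check described above can be sketched as follows. The series is simulated here purely for illustration (\(n = 200\), \(H = 40\)); with real data, `x` would be the observed series. Variable names are illustrative.

```r
set.seed(4)                       # arbitrary seed
n <- 200
H <- 40
x <- rnorm(n)                     # simulated Gaussian white noise

# Sample ACF at lags 1, ..., H (lag 0 is dropped because it is always 1)
rho_hat <- drop(acf(x, lag.max = H, plot = FALSE)$acf)[-1]

bound   <- 1.96 / sqrt(n)         # approximate 95% limits under white noise
outside <- abs(rho_hat) > bound

sum(outside)                      # number of lags outside the limits
mean(!outside)                    # proportion inside; about 0.95 is expected
```

Because roughly 5% of the sample autocorrelations are expected to fall outside the limits even for white noise, one or two exceedances among 40 lags are not, by themselves, evidence against the white-noise hypothesis.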

Correlogram Analysis

A correlogram is a graphical tool used in time series analysis to study the dependence structure across time.

Core Idea: In a time series, observations are often correlated with their past values. A correlogram displays how this correlation changes as the time lag increases.
What is plotted? A plot of \(\rho (k)\) against \(k\) is called the correlogram.

It is important to understand the extent to which a plot of the autocorrelations for a given model describes the behavior in time series realizations from that model. We shall now consider a few basic stationary models to illustrate the behaviour of their correlation function.

How to read?

The horizontal axis shows the lag \(h\).
The vertical axis shows autocorrelation values (between \(−1\) and \(+1\))
Bars (or spikes) represent correlation at each lag
Confidence bands help identify statistically significant correlations

Interpretation

Slow decay of autocorrelation → indicates trend / non-stationarity
Sharp cutoff after a few lags → suggests a short-memory process
Oscillating pattern → indicates seasonality or cyclic behavior
No significant spikes → resembles white noise

Usefulness

Helps detect serial dependence
Guides identification of models like AR, MA, or ARMA
Assists in checking stationarity and model adequacy
Provides a compact visual summary of how the past influences the present in a time series.

Sample Correlogram Analysis

In practice, correlogram analysis relies on sample (empirical) counterparts of theoretical quantities, because the true distribution of the time series is unknown.
Why are sample quantities used? Because the population quantities are unknown.
The theoretical ACF depends on:
- true mean
- true variance
- true covariance structure

These are unobservable, so we estimate them using sample moments.

Consistency and large-sample justification
Approximate sampling distribution

Example: White Noise Process

Let \(\{X_t\}\) be a sequence of uncorrelated random variables, each with zero mean and variance \(\sigma^2 < \infty\). Such a sequence is referred to as white noise (with mean 0 and variance \(\sigma^2\)). This is indicated by the notation \(X_t \sim WN(0,\sigma^2)\). Since the random variables are uncorrelated, the autocovariance function of the process can be written as \[ \gamma_X(t+h, t) = \begin{cases} \sigma^2, & \text{if } h=0, \\ 0, & \text{if } h\neq 0, \end{cases} \] which does not depend on \(t\). Since the mean is also constant, white noise with finite second moment is stationary.

Example…

The following figure shows 200 simulated values of independent and identically distributed normal noise with mean 0 and variance 1, denoted iid \(N(0, 1)\) noise. It also shows the corresponding sample autocorrelation function and partial autocorrelation function at lags up to 40. Since \(\rho(h)=0\) for \(h > 0\), one would expect the corresponding sample autocorrelations to be near 0.

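# Simulate 200 iid N(0, 1) values and store them as a time series
y <- ts(rnorm(200))
# One wide panel on top (time plot), two panels below (ACF and PACF)
layout(matrix(c(1, 1, 2, 3), 2, 2, byrow = TRUE))
plot(y)
abline(h = 0)            # reference line at the mean 0
acf(y, lag.max = 40)     # sample ACF up to lag 40
pacf(y, lag.max = 40)    # sample PACF up to lag 40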

Plot

200 simulated values of iid \(N(0,1)\) noise

What does the plot say?

The time plot suggests that the process is stationary around the mean line through 0, with a more or less constant variance.
There is no evidence of any significant sample autocorrelation.
Similarly, there is no evidence of any significant partial autocorrelation.
Hence one can reasonably conclude that the data are consistent with an iid (white noise) process.

Example 2

Our second example consists of a realization from a stationary AR(1) process. The series is plotted in the upper panel of the accompanying figure, which shows a wandering or piecewise trending behavior. Note that it is typical for \(x_t\) and \(x_{t+1}\) to be relatively close to each other; that is, the value of \(X_{t+1}\) is usually not very far from the value of \(X_t\), and, as a consequence, there is a rather strong positive correlation between the random variables \(X_t\) and \(X_{t+1}\). Note also that, for large lags \(k\), there is less correlation between \(X_t\) and \(X_{t+k}\), to the extent that there is only weak correlation between \(X_t\) and, say, \(X_{t+40}\). We see this behavior reflected in the lower panels of the figure, which display the sample autocorrelations and partial autocorrelations of the realization. For the AR(1) model used to generate the data, with coefficient \(0.96\), the true autocorrelations are \(\rho_k = 0.96^k\), so \(\rho_1 = 0.96\), and the autocorrelations decrease as the lag increases, reaching \(\rho_{40} = 0.96^{40} \approx 0.2\) by lag 40.

Codes

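set.seed(86)   # fix the seed so the simulation is reproducible
# Simulate 200 values from a stationary AR(1) process with coefficient 0.96
ts.sim <- arima.sim(list(order = c(1, 0, 0), ar = 0.96), n = 200)

# One wide panel on top (time plot), two panels below (ACF and PACF)
layout(matrix(c(1, 1, 2, 3), 2, 2, byrow = TRUE))
ts.plot(ts.sim)
acf(ts.sim, lag.max = 40)    # sample ACF up to lag 40
pacf(ts.sim, lag.max = 40)   # sample PACF up to lag 40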

Plot

Simulated data from an AR(1) process and corresponding sample ACF and PACF

Observations

\(\hat{\rho}(h)\) is a very slowly decreasing function of \(h\).
For moderately high values of \(h\), \(\hat{\rho}(h)\) is still considerably large, indicating an \(AR(1)\) model with an AR coefficient close to \(1\).
This is visible in the plot of the sample ACF: the higher-order autocorrelations are of considerable magnitude, which can make the series look non-stationary even though the generating AR(1) process, with coefficient \(0.96 < 1\), is stationary.
The plot of the sample partial autocorrelations conveys little beyond this.
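
To see how closely the sample ACF tracks the model, the theoretical autocorrelations of an AR(1) process, \(\rho(h)=\phi^h\), can be tabulated next to the sample values. The sketch below re-simulates the series with the same seed and coefficient as the code above; `phi`, `ts_sim`, `rho_true`, `rho_samp`, and `comparison` are illustrative names.

```r
phi <- 0.96

# Theoretical ACF of the AR(1) model at lags 0, ..., 40 (equals phi^h)
rho_true <- ARMAacf(ar = phi, lag.max = 40)

# Re-simulate the series and compute its sample ACF without plotting
set.seed(86)
ts_sim   <- arima.sim(list(order = c(1, 0, 0), ar = phi), n = 200)
rho_samp <- drop(acf(ts_sim, lag.max = 40, plot = FALSE)$acf)

# Compare at a few selected lags (rows 1, 2, 11, 21, 41 are lags 0, 1, 10, 20, 40)
comparison <- cbind(lag = 0:40, true = rho_true, sample = rho_samp)
round(comparison[c(1, 2, 11, 21, 41), ], 3)
```

With only 200 observations and a coefficient this close to 1, the sample values at larger lags can differ noticeably from the theoretical ones, so the comparison is best read qualitatively.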
