2018/10/01

Outline

Last lecture:

  • Motivation and main theme

  • Your first glance at time series

This lecture:

  • Deterministic and random time series

  • Some basic features

Random or Not

  • A random variable is often given by definition: its randomness is assumed rather than verified.

  • Rolling a die is assumed to be a random action.

  • Possible states for \(X\): \(\Omega=\{1, 2, 3, 4, 5, 6\}\)

  • On every roll, each state has probability \(1/6\).

  • All of these hold only if the die is “fair”.

Random or Not

  • Consider a time series for \(X\).

  • Your rolls (in a computer game) \(X_1 = 3\), \(X_2 = 3\), \(X_3 = 3\), …

  • After three identical rolls, you should question the randomness assumption. (The chance of three identical rolls is \((1/6)^2 = 1/36 \approx 0.03\).)

  • The information for possible states is \(\mathcal{I}= \{3\}\subset\Omega\).

  • How about these rolls: \(X_1 = 2, X_2 =4, X_3 = 3, X_4 = 1\)?

  • It may be difficult to detect whether they are random or not (at least for \(t<8\)).

Return to the Loops

# fixed point: x = 2 solves x = (2x + 3) mod 5, so the state never changes
x = rep(0, 13); x[1] = 2
for(t in 1:12){x[t+1]=(2*x[t]+3) %% 5}
x+1
##  [1] 3 3 3 3 3 3 3 3 3 3 3 3 3
# different constants: the recursion now cycles through the states 1, 3, 2, 0
x[1] = 1
for(t in 1:12){x[t+1]=(2*x[t]+1) %% 5}
x+1
##  [1] 2 4 3 1 2 4 3 1 2 4 3 1 2

A present-day computer is a deterministic device: it cannot generate truly random numbers, only pseudo-random sequences like the ones above.
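This is easy to see in R: re-seeding the pseudo-random number generator reproduces exactly the same “random” draws. A minimal sketch (the seed value 123 is arbitrary):

set.seed(123); a = rnorm(5)  # five "random" draws
set.seed(123); b = rnorm(5)  # reset the seed and draw again
identical(a, b)              # the two sequences match exactly
## [1] TRUE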

Information Set for Deterministic Series

  • \(\mathcal{I}_t\): the collection of possible states observed up to time \(t\).

  • The whole information for a time series: \(\lim_{t\rightarrow \infty }\mathcal{I}_t = \mathcal{I}\).

  • Information set may be invariant, e.g. \(\mathcal{I}= \{3\}\subset\Omega\) or \(\mathcal{I}= \{1, 2, 3, 4\}\subset\Omega\)

  • Equilibrium: an invariant information set. \(\{3\}\): unique equilibrium. \(\{1, 2, 3, 4\}\): multiple equilibria.
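As a quick check of the fixed-point claim: \(x=2\) solves \(x = (2x+3) \bmod 5\), which is why the first loop above never leaves a single state.

(2*2 + 3) %% 5  # returns 2: the state x = 2 is invariant under the recursion
## [1] 2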

Information Set for Deterministic Series

  • The information set may also change, e.g. the information set for the Fibonacci sequence \(\mathcal{I} = \{ 0, 1, 2, 3, 5, 8, 13, 21, 34, \dots\}\) keeps growing. No equilibrium exists. (See the R sketch after this list.)

  • Information can be extracted from a deterministic time series, although the underlying law may be quite complicated.

  • How about a real random \(X\)?
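A minimal R sketch of the growing Fibonacci information set mentioned above:

fib = c(0, 1)
for(t in 3:10){fib[t] = fib[t-1] + fib[t-2]}
unique(fib)  # the distinct states observed so far: the set keeps growing with t
## [1]  0  1  2  3  5  8 13 21 34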

Recall Some Properties of Randomness

  • \(X\), \(Y\) are random.

  • \(X\) and \(Y\) have means \(\mathbb{E}[X]=\mu_X\) and \(\mathbb{E}[Y]=\mu_Y\).

  • \(X\) and \(Y\) have variances \(\mathbb{V}[X]=\sigma^2_X\) and \(\mathbb{V}[Y]=\sigma^2_Y\) where \[\mathbb{V}[X] = \mathbb{E} \left[ \left( X - \mathbb{E}[X]\right)^2 \right]. \]

  • \(k\)-th moments: \(\mathbb{E}[X^k]\) and \(\mathbb{E}[Y^k]\) (the standardized central moments for \(k=3\) and \(k=4\) give skewness and kurtosis).

  • Probability distributions: \(X \sim \mathcal{N}(\mu_X,\sigma^2_X)\), \(Y \sim \mathcal{N}(\mu_Y,\sigma^2_Y)\).

Two Orders of Information: Mean and Variance

  • A random variable (r.v.) \(Y\); denote the realization of \(Y\) by \(y\).

  • All information of \(Y\) is stored in the distribution \(P_Y(y)\): \(Y \sim P_Y(y)\) where \(P_Y(y) =\Pr(Y < y)\).

  • Means, variances, and moments are measurements (statistics) of this information, e.g. the mean: \[\mathbb{E}[Y] = \int y \, d P_Y(y) \mbox{ (continuous r.v.)}\] \[ \mathbb{E}[Y] = \sum_{i=1}^{N} y_i \Pr(Y = y_i) \mbox{ (discrete r.v.)}.\]
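For instance, the discrete formula applied to the fair die from earlier gives \(\mathbb{E}[X]=\sum_{i=1}^{6} i\cdot\frac{1}{6}=3.5\). A minimal R check (the sample size 1e5 is arbitrary):

omega = 1:6
sum(omega * 1/6)                          # exact mean of a fair die
## [1] 3.5
mean(sample(omega, 1e5, replace = TRUE))  # Monte Carlo approximation, close to 3.5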

Information between Two Variables

  • Covariance: \[\mbox{Cov}(Y,X)=\mathbb{E}\left[(Y-\mu_Y)(X-\mu_X)\right]=\gamma_{YX}.\]

  • The “cross” variance of the information of \(Y\) and \(X\) \[\mbox{Cov}(Y,X)=\int\left[(y-\mu_Y)(x-\mu_X)\right] dP_{YX}(y,x)\] where \(P_{YX}(y,x)\) is the joint distribution.

Information between Two Variables

  • Independence implies \(\mathbb{E}[Y X] = \mathbb{E}[Y] \, \mathbb{E}[X]\).

  • If \(Y\) and \(X\) are independent, then \(\mbox{Cov}(Y,X) = 0\). (The converse is not true: check \(X\sim U[-1,1]\) and \(Y = X^2\), as in the sketch after this list.)

  • Covariance (\(\mbox{Cov}(Y,X) \neq 0\)) measures the linear dependence between two variables.
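A minimal Monte Carlo sketch of the counterexample: \(Y = X^2\) is completely determined by \(X\), yet the sample covariance is close to zero because the dependence is nonlinear (the seed and sample size are arbitrary).

set.seed(1)
x = runif(1e5, -1, 1)  # X ~ U[-1, 1]
y = x^2                # Y is a deterministic (nonlinear) function of X
cov(x, y)              # near 0: E[X^3] = 0 by symmetry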

Two Dimensional Joint Distribution

Consider two dependent normal variables \(X,Y\). Their joint distribution is: \[(X, Y) \sim \mathcal{N}\left(\left(\begin{array}{c} 0\\ 0 \end{array}\right),\left(\begin{array}{cc} 1 & 0.6\\ 0.6 & 1 \end{array}\right)\right)\]
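Draws from this joint distribution can be simulated with mvrnorm() from the MASS package (included with standard R installations); the sample correlation should be close to 0.6. A minimal sketch:

library(MASS)
Sigma = matrix(c(1, 0.6, 0.6, 1), 2, 2)              # covariance matrix from above
xy = mvrnorm(n = 1000, mu = c(0, 0), Sigma = Sigma)  # 1000 draws of (X, Y)
cor(xy[, 1], xy[, 2])                                # close to 0.6
plot(xy[, 1], xy[, 2])                               # scatter of the joint draws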

Information in Random Time Series

  • A (random) time series is a sequence of random variables.

  • Current information depends on past information: the distribution of \(Y_t\) depends on that of \(Y_{t-1}\). (If not, we are back to the i.i.d. case.)

  • For example, if \(Y_{t-1} \sim \mathcal{N}(0,1)\), then \[Y_{t}= Y_{t-1} + 0.5 \sim \mathcal{N}(0.5,1).\] \(Y_{t}\) and \(Y_{t-1}\) are dependent. (See the sketch after this list.)

  • How do you measure the information for the whole sequence \(\{Y_t\}_{t=1}^{T}=\{Y_1, Y_2, Y_3, \dots, Y_T \}\)?
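A minimal sketch of the shifted-normal example above: the shift moves the mean to 0.5 while the variance stays at 1, and the two variables are perfectly dependent (seed and sample size arbitrary).

set.seed(1)
y1 = rnorm(1e5)       # Y_{t-1} ~ N(0, 1)
y2 = y1 + 0.5         # Y_t = Y_{t-1} + 0.5 ~ N(0.5, 1)
c(mean(y2), var(y2))  # approximately 0.5 and 1
cor(y1, y2)           # equals 1: perfect linear dependence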

Invariant for the Distribution

  • Joint distribution \(P_{Y_1, Y_2, Y_3, \dots}(y_1,y_2,y_3,\dots)\) such that \[(Y_1, Y_2, Y_3, \dots) \sim P_{Y_1, Y_2, Y_3, \dots}(y_1,y_2,y_3,\dots).\]

  • Invariant (equilibrium-type) property for the whole sequence: some “changes” leave the whole sequence unaffected.

  • (Strong) stationarity: the choice of a specific time \(t\) does not matter.

Strong Stationarity

  • Probability distribution of \(\{ Y_1, Y_2, Y_3 \}\) is invariant under a shift in time \(k\): \[ P_{Y_1, Y_2, Y_3} (y_1, y_2, y_3)= \Pr\{ Y_1<y_1, Y_2<y_2, Y_3<y_3 \} \\ =\Pr\{ Y_{1+k}<y_1, Y_{2+k}<y_2, Y_{3+k}<y_3 \} \\ = P_{Y_{1+k}, Y_{2+k}, Y_{3+k}}(y_1, y_2, y_3).\]

Strong Stationarity

  • Definition of strong stationarity: invariance under any time shift \(k>0\): \[ P_{Y_t, Y_{t+1}, Y_{t+2},\dots} (y_t, y_{t+1}, y_{t+2},\dots) \\ = P_{Y_{t+k}, Y_{t+k+1}, Y_{t+k+2}, \dots}(y_t, y_{t+1}, y_{t+2},\dots) \] for any \(t>0\).

  • In other words, the joint distribution function is unaltered by the time shift.

  • This is an ideal situation: you rarely find it in economic and financial applications. (The market tends to resist equilibrium.)

Covariance (Weak) Stationarity

  • Instead of considering the whole distribution of the sequence, only the first two moments of the distribution are considered.

  • The distribution may change but the essential information is unchanged.

  • Constant mean.

  • Covariance is invariant under a time shift: the specific time \(t\) does not matter; only the lag between observations matters.

Covariance (Weak) Stationarity

A weakly stationary \(Y_t\) satisfies the following conditions on its first two moments:

  1. \(\mathbb{E}[Y_{t}]=\mu\) for all \(t\).

  2. Given \(t\), for any \(j\geq 0\) and any shift \(k\), \[\mbox{Cov}(Y_{t},Y_{t+j})=\mathbb{E}\left[(Y_{t}-\mu)(Y_{t+j}-\mu)\right]=\gamma_{j},\] \[\mbox{Cov}(Y_{t+k},Y_{t+k+j})=\mathbb{E}\left[(Y_{t+k}-\mu)(Y_{t+k+j}-\mu)\right]=\gamma_{j}.\]

Remarks:

  • \(\gamma_{j}\): the \(j\)-th lag autocovariance, \(\gamma_{0}=\mathbb{V}(Y_{t})\).

  • \(\rho_{j}=\frac{\gamma_{j}}{\gamma_{0}}\) is the \(j\)-th lag autocorrelation.

  • The lag-\(j\) sample autocovariance from \(n\) observations is \[\frac{1}{n} \sum_{t=1}^{n-j} (y_t - \bar{y})(y_{t+j}- \bar{y})\] where \(\bar{y} = \frac{1}{n} \sum_{t=1}^{n}y_t\) is the sample mean. (See the R check after this list.)

  • Although weak stationarity relaxes the informational requirement, it is still demanding for most economic and financial data.
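As a check on the formula above, the sample autocovariance can be coded directly and compared with R's acf() using type = "covariance", which applies the same \(1/n\) scaling. The helper sample_acov() is hypothetical, written only for this check:

sample_acov = function(y, j){
  n = length(y); ybar = mean(y)
  sum((y[1:(n-j)] - ybar) * (y[(1+j):n] - ybar)) / n
}
set.seed(2018); y = rnorm(100)
sample_acov(y, 1)                                 # lag-1 autocovariance by hand
acf(y, type = "covariance", plot = FALSE)$acf[2]  # the same value from acf()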

Trivial Example of Stationarity (i.i.d.)

(Standardized) White Noise \(WN(0,\sigma^{2})\)

\[Y_{t}= \varepsilon_{t}\] \[\mathbb{E}[\varepsilon_{t}]= 0,\;\mathbb{V}(\varepsilon_{t})=\sigma^{2},\:\mbox{Cov}(\varepsilon_{t},\varepsilon_{t+j})=0 \mbox{ for } j\neq 0\]

Gaussian White Noise \(GWN(0,\sigma^{2})\)

\[Y_{t}=\varepsilon_{t},\qquad\varepsilon_{t}\sim \: \mathcal{N}(0,\sigma^{2})\] where \(\varepsilon_{1}, \varepsilon_{2}, \varepsilon_{3},\dots\) are i.i.d. random variables.

(Standardized) Gaussian White Noise

set.seed(2018); e = as.ts(rnorm(250)); plot(e); abline(h=0)

(Standardized) Gaussian White Noise

mean(e[1:50])
## [1] -0.08035105
mean(e[51:150])
## [1] 0.09075301

(Standardized) Gaussian White Noise

acf(e)

(Standardized) Gaussian White Noise

acf(e, plot=FALSE)$acf[1]  # lag 0: always equals 1
## [1] 1
acf(e, plot=FALSE)$acf[2]  # lag 1: close to 0 for white noise
## [1] -0.1304663

Non-stationary Random Walk

# random walk: y[t+1] = y[t] + e[t], built from the white noise e above
y = rep(0, 251)
for(t in 1:250){y[t+1]=y[t]+e[t]}
plot(y); abline(h=0)

Non-stationary Random Walk

mean(y[1:50])
## [1] -1.45842
mean(y[101:150])
## [1] 1.38388

Non-stationary Random Walk

acf(y)

Non-stationary Random Walk

acf(y, plot=FALSE)$acf[1]  # lag 0: always equals 1
## [1] 1
acf(y, plot=FALSE)$acf[2]  # lag 1: near 1, the ACF decays very slowly
## [1] 0.9846287

Summary

  • We interpreted the information in both deterministic and random time series.

  • An invariant information pattern can exist in both types of series.

  • The invariant (equilibrium) property is demanding and rarely found on the surface of observable economic data.

  • But it is a baseline for modeling.