Last lecture:
Basic features: pseudo randomness, information set, stationarity (invariance).
This lecture:
Autoregressive Model (AR)
10/8/2018
Stationarity is about the unchanged structure of information (the first two moments).
But we need more specific ideas about how these moments behave.
A parametric structure is one way of specifying the series.
For example, the Fibonacci numbers follow \(y_t = y_{t-1} + y_{t-2}\), a recurrence relation (also called a difference equation).
The sequence \(y_t = y_{t-1} + y_{t-2}\) keeps growing, as the sketch below shows.
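A minimal sketch of this (the starting pair 1, 1 and the horizon of 15 terms are arbitrary choices) iterates the recurrence directly in R:
y = c(1, 1)                                  # arbitrary starting values
for (t in 3:15){ y[t] = y[t-1] + y[t-2] }    # Fibonacci recurrence
ts.plot(y)                                   # the sequence grows without bound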
A simple parametric form that can return/converge to a stable value: \[y_{t+1} = \phi y_{t}, \mbox{ for } |\phi|<1. \]
Given any time \(t=T\), no matter how large the value \(y_T\) is, after a sufficiently long time period (\(t\rightarrow\infty\)) the series returns to zero: \(y_{t} \rightarrow 0\).
y = rep(0, 50); y[25] = 1 # At t=25, give y_t a value 1 (a shock)
for (t in 25:50){ y[t+1] = 0.8 * y[t]}
ts.plot(y)
y = rep(0, 50); y[25] = 1
for (t in 25:50){ y[t+1] = -0.8 * y[t]}
ts.plot(y)
The previous examples trace out the so-called impulse response function (IRF).
IRF: The reaction of the system following a shock.
Here the system is \[Y_{t+1} = \phi Y_t, \mbox{ for } |\phi|<1.\]
If \(|\phi|=1\) or \(|\phi|>1\), the story would be quite different.
y = rep(0, 50); y[25] = 1
for (t in 25:50){ y[t+1] = 1.2 * y[t]}
ts.plot(y)
y = rep(0, 50); y[25] = 1
for (t in 25:50){ y[t+1] = -1.2 * y[t]}
ts.plot(y)
Consider a stochastic version of the previous recurrence equation: \[Y_{t+1} = \phi Y_t + \varepsilon_t, \mbox{ for } |\phi|<1,\] where \(\varepsilon_t\) is a white noise process.
AutoRegressive Model with one lagged variable: AR(1). A regression on itself.
We can extend to many lagged variables: AR(p) \[Y_{t+1} = \phi_1 Y_t + \phi_2 Y_{t-1} + \cdots + \phi_p Y_{t-p+1} + \varepsilon_t.\]
It has a recursive representation \[Y_{t+1}= \phi Y_{t} + \varepsilon_t = \phi (\phi Y_{t-1} + \varepsilon_{t-1}) + \varepsilon_t \\ = \phi^2 Y_{t-1} + \phi \varepsilon_{t-1} + \varepsilon_t = \phi^2 (\phi Y_{t-2} + \varepsilon_{t-2}) + \phi \varepsilon_{t-1} + \varepsilon_t \\ = \phi^{t}Y_{1}+\phi^{t-1}\varepsilon_{1} +\cdots+\phi\varepsilon_{t-1}+\varepsilon_{t}.\]
The series \(Y_{t+1}\) is therefore constructed from its initial condition \(Y_1\) and a sequence of weighted noises \(\{\phi^k \varepsilon_{t-k}\}_{k=0}^{t-1}\).
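As a small numerical check (a sketch; the value \(\phi=0.8\), the horizon \(n=10\), and the initial value 1 are arbitrary choices), the recursion and its expanded representation can be compared directly:
set.seed(1)
phi = 0.8; n = 10
e = rnorm(n)
Y = rep(0, n + 1); Y[1] = 1                             # initial condition Y_1
for (t in 1:n){ Y[t+1] = phi * Y[t] + e[t] }            # recursion Y_{t+1} = phi * Y_t + e_t
direct = phi^n * Y[1] + sum(phi^(0:(n-1)) * e[n:1])     # phi^t * Y_1 + weighted sum of noises
c(Y[n+1], direct)                                       # the two values coincide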
set.seed(2018)
Y = rep(0, 250); e = rnorm(250);
for (t in 1:250){ Y[t+1] = 0.8 * Y[t] + e[t]}
ts.plot(Y)
Y = rep(0, 250); e = rnorm(250);
for (t in 1:250){ Y[t+1] = -0.8 * Y[t] + e[t]}
ts.plot(Y)
Let the white noise sequence be \(\varepsilon_t \sim \mathcal{WN}(0,\sigma^2)\) and run the recursion around the level \(\mu\) in mean-adjusted form, \(Y_t - \mu = \phi (Y_{t-1}-\mu) + \varepsilon_t\), with initial condition (the shock) \(Y_0 = \mu\), so that \(Y_{t} = \mu + \phi^{t-1}\varepsilon_{1} +\cdots+\phi\varepsilon_{t-1}+\varepsilon_{t}\).
The mean is \(\mu\): \[\mathbb{E}[Y_t] = \mu + \mathbb{E}[\phi^{t-1}\varepsilon_{1} +\cdots+\phi\varepsilon_{t-1}+\varepsilon_{t}] \\ = \mu + \mathbb{E}[\phi^{t-1}\varepsilon_{1}] +\cdots+\mathbb{E}[\phi\varepsilon_{t-1}]+ \mathbb{E}[\varepsilon_{t}]=\mu.\]
The variance is: \[\mathbb{V}[Y_t] = \mathbb{V}(\mu) + \mathbb{V}[\phi^{t-1}\varepsilon_{1} +\cdots+\phi\varepsilon_{t-1}+\varepsilon_{t}] \\ = 0 + \mathbb{V}[\phi^{t-1}\varepsilon_{1}] +\cdots+\mathbb{V}[\phi\varepsilon_{t-1}]+ \mathbb{V}[\varepsilon_{t}]\\ =\sigma^2 (\phi^{2(t-1)} + \cdots + \phi^2 + 1) = \frac{\sigma^2 (1 - \phi^{2t}) }{1-\phi^2}.\] When \(t\rightarrow \infty\), \(\phi^{2t}\rightarrow 0\), so \(\mathbb{V}[Y_\infty]= \sigma^2/(1-\phi^2)\).
Note that \(\sum_{j=0}^{\infty}\phi^{2j}=\frac{1}{1-\phi^2}<\infty\) for \(|\phi|<1\).
Autocovariance: \[ \mbox{Cov}(Y_t, Y_{t-1})=\mathbb{E}[(Y_{t}-\mu)(Y_{t-1}-\mu)]\\ =\mathbb{E}[\phi(Y_{t-1}-\mu)(Y_{t-1}-\mu)]+\mathbb{E}[\varepsilon_{t}(Y_{t-1}-\mu)]= \phi \mathbb{V}(Y_{t-1}),\] where the second term vanishes because \(\varepsilon_{t}\) is uncorrelated with the past value \(Y_{t-1}\).
Recall that \(\mathbb{V}[Y_\infty]= \sigma^2/(1-\phi^2)\) for large time \(t\).
Suppose the system is far from the initial shock (large \(t\)); then the autocovariance becomes \(\mbox{Cov}(Y_t, Y_{t-1})= \phi \sigma^2/(1-\phi^2).\)
The mean of \(Y_t\) is a constant.
Suppose \(t\) is large, the covariance of AR(1) only depends on the number of lags \(j\): \[\gamma_{j}=\mathbb{E}[(Y_{t}-\mu)(Y_{t-j}-\mu)]= \mathbb{E}[\phi(Y_{t-1}-\mu)(Y_{t-j}-\mu)] \\ +\mathbb{E}[\varepsilon_{t}(Y_{t-j}-\mu)] =\phi\gamma_{j-1}\\ \Longrightarrow\gamma_{j}=\phi^{j}\gamma_{0} =\phi^{j}\frac{\sigma^{2}}{1-\phi^{2}}.\]
Y = rep(0, 2000); e = rnorm(2000);
for (t in 1:2000){ Y[t+1] = 0.8 * Y[t] + e[t]}
mean(Y[1:1000])
## [1] 0.006715935
mean(Y[1001:2000])
## [1] -0.1709464
var(Y[1:500])
## [1] 2.693767
var(Y[1501:2000])
## [1] 2.887367
acf(Y)
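As a rough check (reusing the series \(Y\) simulated just above, where \(\phi=0.8\) and \(\sigma^2=1\), so the theoretical variance is \(1/(1-0.8^2)\approx 2.78\) and the theoretical autocorrelations are \(\rho_j=\phi^j\)), the sample and theoretical quantities can be placed side by side:
phi = 0.8
c(var(Y), 1 / (1 - phi^2))                               # sample vs. theoretical variance
sample_acf = acf(Y, lag.max = 5, plot = FALSE)$acf[-1]   # sample ACF at lags 1..5
round(rbind(sample_acf, theoretical = phi^(1:5)), 3)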
A concise notation: \(Y_{t-1}=\mathbb{L}Y_{t}\), \(\mathbb{L}\) is called the lag operator.
Written at time \(t\), the AR(p) \(Y_{t} = \phi_1 Y_{t-1} + \phi_2 Y_{t-2} + \cdots + \phi_p Y_{t-p} + \varepsilon_t\) becomes: \[\phi(\mathbb{L})Y_{t}= \varepsilon_{t}\] \[\phi(\mathbb{L})= 1-\phi_{1}\mathbb{L}-\cdots-\phi_{p}\mathbb{L}^{p}\]
Mean adjusted form: \[Y_{t}-\mu=\phi_{1}(Y_{t-1}-\mu)+\cdots+\phi_{p}(Y_{t-p}-\mu)+\varepsilon_{t}.\]
Lag operator notation: \[\phi(\mathbb{L})(Y_{t}-\mu)= \varepsilon_{t}\] where \(\phi(\mathbb{L})= 1-\phi_{1}\mathbb{L}-\cdots-\phi_{p}\mathbb{L}^{p}\).
If the model is written instead with an intercept \(c\), \(Y_{t}= c + \phi Y_{t-1}+\varepsilon_{t}\), the mean becomes (using stationarity, \(\mathbb{E}[Y_{t-1}]=\mathbb{E}[Y_{t}]\)) \[\mathbb{E}[Y_{t}]= c +\phi\mathbb{E}[Y_{t-1}]+\mathbb{E}[\varepsilon_{t}]\\ = c +\phi\mathbb{E}[Y_{t}]\\ \Rightarrow \mathbb{E}[Y_{t}]=\frac{c}{1-\phi}.\]
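A minimal simulation sketch of this intercept form (the values \(c=2\), \(\phi=0.8\), the sample size, and the burn-in are arbitrary choices; the variable is named c0 because c is a base R function):
set.seed(1)
c0 = 2; phi = 0.8                                    # intercept and AR coefficient
Y = rep(0, 5000); e = rnorm(5000)
for (t in 1:4999){ Y[t+1] = c0 + phi * Y[t] + e[t] }
mean(Y[1001:5000])                                   # close to c0 / (1 - phi) = 10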
library(forecast)
y = arima.sim(n=250, list(ar=c(1,-0.25), ma=0))
plot(y); abline(h=0)
\(Y_{t+1} = Y_{t} + \varepsilon_t\) (a random walk, \(\phi=1\)) is not stationary.
In the previous example, the AR(2) looks stationary, but it also has \(\phi_1=1\) in its expression \[ Y_{t} = Y_{t-1} - 0.25 Y_{t-2} +\varepsilon_t.\]
How do we check stationarity for a general AR(p)?
AR(p): \(\phi(\mathbb{L})Y_t = \varepsilon_t\).
Characteristic equation: \(\phi(z) = 0\) where \(z\) is formally treated as a number (real or complex).
The roots of the characteristic equation (i.e., the roots of the polynomial \(\phi(z)\)) must all exceed unity in absolute value, that is, lie outside the unit circle, for the process to be stationary.
\(AR(1): Y_t = 0.5Y_{t-1} +\varepsilon_t\) is stationary. \(1-0.5z=0\) gives \(z=2\).
\(AR(2): Y_t = -0.25Y_{t-2} +\varepsilon_t\) is stationary. \((4+z^2)/4=0\) gives \(z=\pm 2i\). The absolute value of \(\pm 2i\) is \(2\).
\(AR(2): Y_t = Y_{t-1} - 0.25Y_{t-2} +\varepsilon_t\) is stationary. \((z^2-4z+4)/4=(z-2)^2/4=0\) gives \(z=2\).
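These roots can also be obtained numerically (a small sketch using polyroot on the same three characteristic polynomials; Mod returns the absolute value):
Mod(polyroot(c(1, -0.5)))        # AR(1): 1 - 0.5 z        -> root 2
Mod(polyroot(c(1, 0, 0.25)))     # AR(2): 1 + 0.25 z^2     -> roots +/- 2i, modulus 2
Mod(polyroot(c(1, -1, 0.25)))    # AR(2): 1 - z + 0.25 z^2 -> double root 2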
y1 = arima.sim(n=1000, list(ar=c(0.6,-0.28), ma=0))
plot(y1)
y2 = arima.sim(n=1000, list(ar=c(0.8,-0.98), ma=0))
plot(y2)
ar20 = arima(y, c(2, 0, 0), include.mean=FALSE, method="ML")
ar21 = arima(y1, c(2, 0, 0), include.mean=FALSE, method="ML")
ar22 = arima(y2, c(2, 0, 0), include.mean=FALSE, method="ML")
polyroot(c(1, -ar20$coef))
## [1] 1.409780+0i 2.978273-0i
polyroot(c(1, -ar21$coef))
## [1] 1.487931+1.617687i 1.487931-1.617687i
polyroot(c(1, -ar22$coef))
## [1] 0.4063493+0.9239186i 0.4063493-0.9239186i
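As a follow-up sketch, taking the modulus of the fitted roots makes the comparison with the unit circle explicit. For y2 the true polynomial is \(1-0.8z+0.98z^2\), whose roots have modulus \(\sqrt{1/0.98}\approx 1.01\), so that process sits very close to the stationarity boundary:
Mod(polyroot(c(1, -ar20$coef)))   # moduli of the fitted roots for y
Mod(polyroot(c(1, -ar21$coef)))   # for y1
Mod(polyroot(c(1, -ar22$coef)))   # for y2: only slightly above 1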
AR models: a regression of a series on its own past; motivation and specification.
Stationarity of AR(1): mean, variance, autocovariance.
Stationarity of AR(p): depends on the roots of its characteristic equation.