11/12/2017

Outline

Last lecture:

  • ARIMA: removal of non-stationary patterns (unit roots and seasonality) recovers stationarity.

This lecture:

  • Extracting trends, checking unit root.

  • Study a system of two time series.

Linear Filter Models

  • We knew: a time series \(Y_t\) is generated from a series of independent noises \(\varepsilon_t\).

  • In other words, a noise process \(\varepsilon_t\) is transformed to the process \(Y_t\) by a so-called linear filter \[Y_t = \Psi(\mathbb{L}) \varepsilon_t\] where \(\Psi (\cdot)\) is called the transfer function of the filter.

Linear Filter Models

  • ARIMA(p,1,q) model \[\phi(\mathbb{L})(1-\mathbb{L}) Y_t =\theta(\mathbb{L})\varepsilon_t \\ \mbox{or: } (1-\mathbb{L}) Y_t = \phi(\mathbb{L})^{-1}\theta(\mathbb{L})\varepsilon_t \] where \((1-\mathbb{L})\) is the difference filter, \(\theta(\mathbb{L})\) is the MA filter, and \(\phi(\mathbb{L})\) is the AR filter.

  • Similar, for seasonal ARIMA\((p,1,q)(P, 1, Q)_{d}\): \[ \phi (\mathbb{L}) \Phi (\mathbb{L}^d) (1 - \mathbb{L}) (1 - \mathbb{L}^d) Y_t = \theta (\mathbb{L}) \Theta (\mathbb{L}^d) \varepsilon_t\] where \(\Phi (\mathbb{L}^d)\), \((1 - \mathbb{L}^d)\), \(\Theta (\mathbb{L}^d)\) are seasonal filters.

Linear Filter Models

  • The linear filter is an additive linear function of previous (or future) variables.

  • If the filter only uses previous (lagged) variables, then it is a one side filter. AR, MA filters are one side filter.

  • One naive two sides filter (\(S(\mathbb{L})\)): \[S(\mathbb{L})Y_t = \sum_{j=-k}^{k} a_j Y_{t-j},\] \(k\) is the periodicity of the seasonality. When \(k=6\) is to remove the annual cycle in monthly data \(S(\mathbb{L})Y_t = \sum_{j=−6}^{6} a_j Y_{t−j}\) where \(a_j = 1/12\) for \(j = 0, \pm 1,..., \pm 5\) and \(a_{\pm 6} = 1/24\).

One Naive Filter

library(Quandl); library(xts); library(lubridate); library(forecast)
Bel =  Quandl("ECB/STS_M_BE_N_UNEH_LTT000_4_000", type = "xts", 
              collapse = "monthly", start_date="2000-01-31")
a = rep(1,11)
a = c(0.5, a, 0.5)
a = a/sum(a)   
trend = filter(Bel, sides=2, a) 

One Naive Filter

ts.plot(Bel); lines(trend, col="red") 

Detrended Series

ts.plot(Bel - trend)

Unit Root Test

  • Unit root test is to determine if a stochastic trend is present. The most popular test is the augmented Dickey-Fuller (ADF) test.

  • Suppose that \(Y_{t}\) is difference stationary: \[\nabla Y_{t} = (1- \mathbb{L})Y_t = \mbox{Stationary Components}. \]

  • The intuition of ADF test is to express the original process with \(p\) number of stationary components \(\nabla Y_{t-j}\), \(j=1,\dots, p\).

Augmented Dickey-Fuller Test

  • ADF tests use a parametric autoregressive structure to capture serial correlation

  • Test regression \[Y_{t}= \beta D_{t}+\phi Y_{t-1}+\sum_{j=1}^{p}\psi_{j}\nabla Y_{t-j}+\varepsilon_{t}\] \[D_{t}= \mbox{Deterministic terms}\] \[\nabla Y_{t-j}\: \mbox{captures serial correlation}\]

  • \(\nabla Y_{t-j}\) are stationary. If \(Y_t\) is I(1), then it is the case only if \(\phi=1\).

Augmented Dickey-Fuller Test

  • ADF t-statistic is \[\mbox{ADF}_{t}= t_{\phi=1}=\frac{\hat{\phi}-1}{\mbox{s.e.}(\hat{\phi})}\]

  • ADF test tests the null hypothesis that a time series \(Y_t\) is I(1) against the alternative that it is I(0): \[H_{0}: \quad\phi=1\quad(\phi(z)=0\mbox{ has a unit root})\] \[H_{1}: \quad|\phi|<1\quad(\phi(z)=0\mbox{ has roots outside the unit circle})\]

Simulation

library(tseries)
y = arima.sim(n=1000,list(order = c(0, 1, 0)))
adf.test(y)
## 
##  Augmented Dickey-Fuller Test
## 
## data:  y
## Dickey-Fuller = -2.2119, Lag order = 9, p-value = 0.4886
## alternative hypothesis: stationary

Belgian Data

adf.test(Bel)
## 
##  Augmented Dickey-Fuller Test
## 
## data:  Bel
## Dickey-Fuller = -2.2102, Lag order = 6, p-value = 0.4877
## alternative hypothesis: stationary
adf.test(na.omit(diff(Bel))) # difference filter
## 
##  Augmented Dickey-Fuller Test
## 
## data:  na.omit(diff(Bel))
## Dickey-Fuller = -6.7769, Lag order = 6, p-value = 0.01
## alternative hypothesis: stationary

System of Equations

  • ARIMA: a single variable model.

  • A system: joint behavior of multiple variables.

  • Multiple time series.

  • It is only when the dynamic characteristics of a system are understood that intelligent direction, manipulation, and control of the system become possible.

A Simple System

  • In static cases, regression explores the relationship between two variables. Predictor variables either directly cause the response or provide a plausible explanation of it.

  • We can represent a dynamic relationship connecting two series by a linear filter.

  • Pairs of time series \((X_t, Y_t)\), e.g. a production system with one input resource \(X\) and one output product \(Y\). A steady-state relationship \[ Y_t = \Psi (\mathbb{L}) X_t \] \(Y_t\) is a function of \(X_t\). The filter \(\Psi (\mathbb{L})\) is called the steady-state gain.

A Simple System

  • Consider a linear filter of the form: \[ \begin{align} Y_t =& \psi_0 X_t + \psi_1 X_{t-1} +\cdots \\ =& (\psi_0 + \psi_1 \mathbb{L} + \psi_2 \mathbb{L}^2 +\cdots) X_t = \Psi (\mathbb{L}) X_t \end{align} \]

  • \(\Psi (\cdot )\) is the transfer function of the filter.

  • \(\psi_0,\psi_1,\psi_2, \dots\) are the impulse response function of the system. (Recall the case in AR models.)

  • \(\psi_j\) may be regarded as the \(Y\)’s response at times \(j\) to a unit pulse input of \(X_{t-j}\) at time \(t-j\).

System with Two Series

  • A more general form \[ \begin{align} \delta_0 Y_t + \delta_1 Y_{t-1} + \cdots =& \omega_0 X_t + \omega_1 X_{t-1} +\cdots \\ \delta(\mathbb{L}) Y_t= & \omega(\mathbb{L}) X_t \\ Y_t =& \delta^{-1}(\mathbb{L}) \omega(\mathbb{L}) X_t = \Psi(\mathbb{L}) X_t \end{align} \]

  • A more more general form \[ \begin{align} Y_t= & \delta^{-1}(\mathbb{L}) \omega(\mathbb{L}) X_t + n_0 \varepsilon_t + n_1 \varepsilon_{t-1} + \cdots \\ Y_t =& \Psi (\mathbb{L}) X_t + N(\mathbb{L})\varepsilon_t. \end{align} \]

Cross-Correlation Function (CCF)

  • Cross-correlation function between the input and output. (Autocorrelation function for univariate time series.)

  • If \((X_t, Y_t)\) are stationary, then cross-covariance \[ \gamma_{XY} (k) =\mathbb{E} \left[(X_t - \mathbb{E}[X_t]) (Y_{t + k} - \mathbb{E}[Y_{t + k}]) \right]\]

  • CCF is \[ \rho_{XY} (k) = \frac{\gamma_{XY} (k)}{\sigma_X \sigma_Y}. \]

Note: In general, \(\rho_{XY} (k)\) is not equal to \(\rho_{XY}(-k)\). However \(\gamma_X (k)\), as the covariance between \(X_t\) and \(X_{t-k}\), is symmetric \(\gamma_X (k)=\gamma_X (-k)\).

Example

set.seed(2018); X = arima.sim(n=1000, list(ar=0.2,ma=0))
Y=rep(0,1000); Y[1] = X[1];
for (t in 2:1000){Y[t] = 2 + 0.5*Y[t-1] + 5*X[t] + 2*X[t-1] + rnorm(1)}
ccf(Y,X)

Example

acf(X)

Example

acf(Y)

Spurious Regression

  • In static case, one regression of \(Y\) on \(X\) often implies a causal effect. (Changes of \(X\) causes the changes of \(Y\).)

  • Does high correlation imply causal effects between two time series?

  • For time series variables, an apparent causality could be caused by some underlying common trends, especially the unit roots.

  • In this case, the standard regression analysis doesn’t work.

Spurious Regression

  • Suppose \(Y_t = C(\mathbb{L})\varepsilon_t\) and \(X_t = C(\mathbb{L})e_t\) where \(e_t\) and \(\varepsilon_t\) are two independent white noises.

  • The common term \(C(\mathbb{L})\) in \(Y_t\) and \(X_t\) can induce a significant cross-correlation.

  • Let \(C(\mathbb{L})= 1- \mathbb{L}\), then a unit root exists for both \(Y_t\) and \(X_t\).

  • In the other words, two indpendent processes \(X_t\) and \(Y_t\) may have high cross-correlation value when there exists some transfer functions.

Example

x = rnorm(100); y = rnorm(100); cor(x,y)
## [1] 0.07911594
for(i in 2:100) {
      x[i] = x[i-1] + rnorm(1)
      y[i] = y[i-1] + rnorm(1)}; cor(x,y)
## [1] -0.7669698
summary(lm(y~ x))$coef
##              Estimate Std. Error   t value     Pr(>|t|)
## (Intercept) 12.473138 1.41098050   8.84005 3.968860e-14
## x           -1.031262 0.08715624 -11.83234 1.364193e-20

Example

plot.ts(x, ylim=c(-30,30)); lines(y)

Example

Example

X = arima.sim(n=1000, list(order = c(1,1,0), ar = 0.7))
Y = arima.sim(n=1000, list(order = c(0,1,1), ma = 0.3))
cor(X,Y)
## [1] -0.4342586
summary(lm(Y~ X))$coef
##               Estimate Std. Error    t value     Pr(>|t|)
## (Intercept) -5.2096738 2.20921501  -2.358156 1.855786e-02
## X           -0.3055914 0.02005546 -15.237317 2.694838e-47

Example

ccf(Y,X)

Identify the Transfer

  • When \(X_t\) and \(Y_t\) are two independent processes, the correlation is induced by the transfer.

  • Linear regression is not available. One needs to identify the transfer function \[Y_t= \Psi(\mathbb{L})X_t + N_t = \delta^{-1}(\mathbb{L}) \omega(\mathbb{L}) X_t + N_t\] where the system is corrupted by noise \(N_t\), \(Y_t\) and \(X_t\) are ARIMA processes.

  • Note \(\delta(\mathbb{L}) = 1 - \delta_1 \mathbb{L} -\cdots - \delta_r \mathbb{L}^r\) and \(\omega(\mathbb{L}) = 1 - \omega_1 \mathbb{L} -\cdots - \omega_s \mathbb{L}^s\)

  • Our target: identify \(\delta_1,\dots \delta_r\) and \(\omega_1,\dots, \omega_s\).

Example

\((1 - 0.5\mathbb{L}) Y_t = 2 + 5 X_t + 2X_{t-1} + \varepsilon_t\) or
\(Y_t = 4 + \frac{(5 + 2\mathbb{L})}{(1 - 0.5\mathbb{L})} X_t + N_t\)

set.seed(2018); X = arima.sim(n=1000, list(ar=0.2,ma=0))
Y=rep(0,1000); Y[1] = X[1];
for (t in 2:1000){Y[t] = 2 + 0.5*Y[t-1] + 5*X[t] + 2*X[t-1] + rnorm(1)}
summary(lm(Y~ 1 + X + lag(X)))$coef
##             Estimate Std. Error  t value      Pr(>|t|)
## (Intercept) 3.911712  0.1870019 20.91803  7.432406e-81
## X           5.866740  0.1801060 32.57381 3.906933e-159

Example

\(Y_t = 4 + \frac{(5 + 2\mathbb{L})}{(1 - 0.5\mathbb{L})} X_t + N_t\)

library(MTS) # MTS: multivariate time series
transf = tfm1(Y,X, orderX= c(1,1,0), orderN=c(0,0,0))
## Delay:  0 
## Transfer function coefficients & s.e.: 
## in the order: constant, omega, and delta: 1 2 1 
##        [,1]   [,2]   [,3]    [,4]
## v    3.9142 5.0061 1.9735 0.49791
## se.v 0.0368 0.0361 0.0569 0.00482
# orderX (r,s,b) r and s are the degrees of denominator 
# and numerator polynomials and b is the delay

Example

acf(transf$residuals)

Example

ccf(transf$residuals, X)

Summary

  • Detrend a trending process.

  • Test the existence of unit root.

  • Relation between two time series.

  • Identify the transfer function of this relation.