Last lecture:
- ARIMA: removal of non-stationary patterns (unit roots and seasonality) recovers stationarity.
This lecture:
- Extracting trends, checking for unit roots.
- Studying a system of two time series.
11/12/2017
We have seen that a time series \(Y_t\) is generated from a series of independent noise terms \(\varepsilon_t\).
In other words, a noise process \(\varepsilon_t\) is transformed to the process \(Y_t\) by a so-called linear filter \[Y_t = \Psi(\mathbb{L}) \varepsilon_t\] where \(\Psi (\cdot)\) is called the transfer function of the filter.
ARIMA(p,1,q) model \[\phi(\mathbb{L})(1-\mathbb{L}) Y_t =\theta(\mathbb{L})\varepsilon_t \\ \mbox{or: } (1-\mathbb{L}) Y_t = \phi(\mathbb{L})^{-1}\theta(\mathbb{L})\varepsilon_t \] where \((1-\mathbb{L})\) is the difference filter, \(\theta(\mathbb{L})\) is the MA filter, and \(\phi(\mathbb{L})\) is the AR filter.
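For illustration, a minimal sketch (simulated data, with arbitrarily chosen coefficients 0.6 and 0.4): applying the difference filter to an ARIMA(1,1,1) series leaves a stationary ARMA(1,1) series.
set.seed(1)
y  = arima.sim(n = 500, list(order = c(1, 1, 1), ar = 0.6, ma = 0.4))
dy = diff(y)                   # the difference filter (1 - L)
arima(dy, order = c(1, 0, 1))  # AR and MA estimates should be near 0.6 and 0.4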
Similarly, for a seasonal ARIMA\((p,1,q)(P, 1, Q)_{d}\) model (with seasonal period \(d\)): \[ \phi (\mathbb{L}) \Phi (\mathbb{L}^d) (1 - \mathbb{L}) (1 - \mathbb{L}^d) Y_t = \theta (\mathbb{L}) \Theta (\mathbb{L}^d) \varepsilon_t\] where \(\Phi (\mathbb{L}^d)\), \((1 - \mathbb{L}^d)\), \(\Theta (\mathbb{L}^d)\) are the seasonal filters.
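For example, with monthly data the two difference filters \((1-\mathbb{L})\) and \((1-\mathbb{L}^{12})\) can be applied with diff(); a minimal sketch on a built-in monthly series (AirPassengers, used here only for illustration):
y  = log(AirPassengers)        # built-in monthly series with trend and seasonality
dy = diff(diff(y), lag = 12)   # apply (1 - L), then the seasonal filter (1 - L^12)
ts.plot(dy)                    # trend and annual cycle are removed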
The linear filter is an additive linear function of previous (or future) variables.
If the filter uses only previous (lagged) variables, it is a one-sided filter. The AR and MA filters are one-sided filters.
A simple two-sided filter \(S(\mathbb{L})\): \[S(\mathbb{L})Y_t = \sum_{j=-k}^{k} a_j Y_{t-j},\] where \(k\) is chosen according to the periodicity of the seasonality. For example, to remove the annual cycle in monthly data we take \(k=6\): \[S(\mathbb{L})Y_t = \sum_{j=-6}^{6} a_j Y_{t-j}\] where \(a_j = 1/12\) for \(j = 0, \pm 1,\dots, \pm 5\) and \(a_{\pm 6} = 1/24\).
library(Quandl); library(xts); library(lubridate); library(forecast)
# Monthly Belgian unemployment series from the ECB, retrieved via Quandl
Bel = Quandl("ECB/STS_M_BE_N_UNEH_LTT000_4_000", type = "xts",
             collapse = "monthly", start_date="2000-01-31")
# Two-sided moving-average weights: 1/24, 1/12 (eleven times), 1/24
a = rep(1,11)
a = c(0.5, a, 0.5)
a = a/sum(a)
trend = filter(Bel, sides=2, a)   # two-sided filter extracts the trend
ts.plot(Bel); lines(trend, col="red")
ts.plot(Bel - trend)              # detrended series
A unit root test determines whether a stochastic trend is present. The most popular test is the augmented Dickey-Fuller (ADF) test.
Suppose that \(Y_{t}\) is difference stationary: \[\nabla Y_{t} = (1- \mathbb{L})Y_t = \mbox{Stationary Components}. \]
The intuition of the ADF test is to express the original process using \(p\) stationary components \(\nabla Y_{t-j}\), \(j=1,\dots, p\). The ADF test uses this parametric autoregressive structure to capture serial correlation.
Test regression \[Y_{t}= \beta D_{t}+\phi Y_{t-1}+\sum_{j=1}^{p}\psi_{j}\nabla Y_{t-j}+\varepsilon_{t}\] \[D_{t}= \mbox{Deterministic terms}\] \[\nabla Y_{t-j}\: \mbox{captures serial correlation}\]
The terms \(\nabla Y_{t-j}\) are stationary. If \(Y_t\) is I(1), then \(\nabla Y_t\) is stationary, which is the case only if \(\phi=1\).
ADF t-statistic is \[\mbox{ADF}_{t}= t_{\phi=1}=\frac{\hat{\phi}-1}{\mbox{s.e.}(\hat{\phi})}\]
The ADF test tests the null hypothesis that a time series \(Y_t\) is I(1) against the alternative that it is I(0): \[H_{0}: \quad\phi=1\quad(\phi(z)=0\mbox{ has a unit root})\] \[H_{1}: \quad|\phi|<1\quad(\phi(z)=0\mbox{ has roots outside the unit circle})\]
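Before using a packaged test, the test regression above can also be fitted directly by OLS. A minimal sketch with only a constant as the deterministic term and \(p = 2\) lagged differences (a simulated random walk, so the null of a unit root should typically not be rejected):
set.seed(1)
y  = cumsum(rnorm(500))        # a random walk, i.e. an I(1) series
dy = c(NA, diff(y))            # dy[t] = y[t] - y[t-1]
t0 = 4:length(y)               # indices with p = 2 lagged differences available
ylag1 = y[t0 - 1]; dlag1 = dy[t0 - 1]; dlag2 = dy[t0 - 2]
fit = lm(y[t0] ~ ylag1 + dlag1 + dlag2)
phi.hat = coef(fit)["ylag1"]
se.phi  = coef(summary(fit))["ylag1", "Std. Error"]
(phi.hat - 1)/se.phi           # ADF t-statistic; compare with Dickey-Fuller critical values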
library(tseries)
y = arima.sim(n=1000, list(order = c(0, 1, 0)))   # simulate a pure random walk (one unit root)
adf.test(y)
## 
##  Augmented Dickey-Fuller Test
## 
## data:  y
## Dickey-Fuller = -2.2119, Lag order = 9, p-value = 0.4886
## alternative hypothesis: stationary
adf.test(Bel)
## 
##  Augmented Dickey-Fuller Test
## 
## data:  Bel
## Dickey-Fuller = -2.2102, Lag order = 6, p-value = 0.4877
## alternative hypothesis: stationary
adf.test(na.omit(diff(Bel))) # difference filter
## 
##  Augmented Dickey-Fuller Test
## 
## data:  na.omit(diff(Bel))
## Dickey-Fuller = -6.7769, Lag order = 6, p-value = 0.01
## alternative hypothesis: stationary
ARIMA: a single variable model.
A system: joint behavior of multiple variables.
Multiple time series.
It is only when the dynamic characteristics of a system are understood that intelligent direction, manipulation, and control of the system become possible.
In static cases, regression explores the relationship between two variables. Predictor variables either directly cause the response or provide a plausible explanation of it.
We can represent a dynamic relationship connecting two series by a linear filter.
Pairs of time series \((X_t, Y_t)\), e.g. a production system with one input resource \(X\) and one output product \(Y\). A steady-state relationship \[ Y_t = \Psi (\mathbb{L}) X_t \] \(Y_t\) is a function of current and past values of \(X_t\). The sum of the filter weights, \(\Psi(1)\), is called the steady-state gain of the filter \(\Psi (\mathbb{L})\).
Consider a linear filter of the form: \[ \begin{align} Y_t =& \psi_0 X_t + \psi_1 X_{t-1} +\cdots \\ =& (\psi_0 + \psi_1 \mathbb{L} + \psi_2 \mathbb{L}^2 +\cdots) X_t = \Psi (\mathbb{L}) X_t \end{align} \]
\(\Psi (\cdot )\) is the transfer function of the filter.
\(\psi_0,\psi_1,\psi_2, \dots\) form the impulse response function of the system. (Recall the analogous expansion in AR models.)
\(\psi_j\) may be regarded as \(Y\)'s response at time \(t\) to a unit pulse input in \(X\) at time \(t-j\).
A more general form \[ \begin{align} \delta_0 Y_t + \delta_1 Y_{t-1} + \cdots =& \omega_0 X_t + \omega_1 X_{t-1} +\cdots \\ \delta(\mathbb{L}) Y_t= & \omega(\mathbb{L}) X_t \\ Y_t =& \delta^{-1}(\mathbb{L}) \omega(\mathbb{L}) X_t = \Psi(\mathbb{L}) X_t \end{align} \]
An even more general form, where the system is also corrupted by noise: \[ \begin{align} Y_t= & \delta^{-1}(\mathbb{L}) \omega(\mathbb{L}) X_t + n_0 \varepsilon_t + n_1 \varepsilon_{t-1} + \cdots \\ Y_t =& \Psi (\mathbb{L}) X_t + N(\mathbb{L})\varepsilon_t. \end{align} \]
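To see the impulse response weights concretely, one can feed a unit pulse through a rational filter. A minimal sketch for the filter \((1 - 0.5\mathbb{L})Y_t = (5 + 2\mathbb{L})X_t\) (the same coefficients as the worked example later in this lecture, with the constant and the noise omitted):
x = c(1, rep(0, 9))            # unit pulse in X at time 1
y = numeric(10)
y[1] = 5*x[1]
for (t in 2:10) y[t] = 0.5*y[t-1] + 5*x[t] + 2*x[t-1]
round(y, 4)                    # psi_0, psi_1, psi_2, ... = 5, 4.5, 2.25, 1.125, ...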
The cross-correlation function between the input and the output plays the role of the autocorrelation function for a univariate time series.
If \((X_t, Y_t)\) are stationary, then cross-covariance \[ \gamma_{XY} (k) =\mathbb{E} \left[(X_t - \mathbb{E}[X_t]) (Y_{t + k} - \mathbb{E}[Y_{t + k}]) \right]\]
CCF is \[ \rho_{XY} (k) = \frac{\gamma_{XY} (k)}{\sigma_X \sigma_Y}. \]
Note: In general, \(\rho_{XY} (k)\) is not equal to \(\rho_{XY}(-k)\). However \(\gamma_X (k)\), as the covariance between \(X_t\) and \(X_{t-k}\), is symmetric \(\gamma_X (k)=\gamma_X (-k)\).
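A minimal sketch of this asymmetry: if \(Y\) responds to \(X\) with a one-period delay, the correlation is strong in one lag direction only. (R's ccf(x, y) reports, at lag \(k\), the correlation between x[t+k] and y[t], so the sign convention should be checked against the definition above.)
set.seed(1)
x = rnorm(500)
y = c(0, 0.9*x[-500]) + rnorm(500, sd = 0.1)   # Y_t responds to X_{t-1}
cor(x[-500], y[-1])    # cor(X_t, Y_{t+1}): strong
cor(x[-1],  y[-500])   # cor(X_t, Y_{t-1}): approximately zero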
set.seed(2018); X = arima.sim(n=1000, list(ar=0.2, ma=0))   # input: an AR(1) process
Y = rep(0,1000); Y[1] = X[1]
# output: Y_t = 2 + 0.5 Y_{t-1} + 5 X_t + 2 X_{t-1} + noise
for (t in 2:1000){Y[t] = 2 + 0.5*Y[t-1] + 5*X[t] + 2*X[t-1] + rnorm(1)}
ccf(Y,X)
acf(X)
acf(Y)
In the static case, a regression of \(Y\) on \(X\) often implies a causal effect (changes in \(X\) cause changes in \(Y\)).
Does high correlation imply causal effects between two time series?
For time series variables, an apparent causal relationship can be induced by underlying common trends, especially unit roots.
In this case, standard regression analysis does not work.
Suppose \(Y_t = C(\mathbb{L})\varepsilon_t\) and \(X_t = C(\mathbb{L})e_t\) where \(e_t\) and \(\varepsilon_t\) are two independent white noises.
The common term \(C(\mathbb{L})\) in \(Y_t\) and \(X_t\) can induce a significant cross-correlation.
Let \(C(\mathbb{L})= (1- \mathbb{L})^{-1}\), i.e. \((1-\mathbb{L})Y_t = \varepsilon_t\) and \((1-\mathbb{L})X_t = e_t\); then a unit root exists for both \(Y_t\) and \(X_t\).
In other words, two independent processes \(X_t\) and \(Y_t\) may show a high cross-correlation when such transfer functions are present.
x = rnorm(100); y = rnorm(100); cor(x,y)
## [1] 0.07911594
for(i in 2:100) {
  x[i] = x[i-1] + rnorm(1)   # turn x and y into two independent random walks
  y[i] = y[i-1] + rnorm(1)}; cor(x,y)
## [1] -0.7669698
summary(lm(y~ x))$coef
##              Estimate Std. Error   t value     Pr(>|t|)
## (Intercept) 12.473138 1.41098050   8.84005 3.968860e-14
## x           -1.031262 0.08715624 -11.83234 1.364193e-20
plot.ts(x, ylim=c(-30,30)); lines(y)
X = arima.sim(n=1000, list(order = c(1,1,0), ar = 0.7))   # an I(1) process
Y = arima.sim(n=1000, list(order = c(0,1,1), ma = 0.3))   # an independent I(1) process
cor(X,Y)
## [1] -0.4342586
summary(lm(Y~ X))$coef
##               Estimate Std. Error    t value     Pr(>|t|)
## (Intercept) -5.2096738 2.20921501  -2.358156 1.855786e-02
## X           -0.3055914 0.02005546 -15.237317 2.694838e-47
ccf(Y,X)
When \(X_t\) and \(Y_t\) are two independent processes, the observed correlation is induced by the transfer functions, not by a genuine relationship.
Standard linear regression is therefore not appropriate. One needs to identify the transfer function \[Y_t= \Psi(\mathbb{L})X_t + N_t = \delta^{-1}(\mathbb{L}) \omega(\mathbb{L}) X_t + N_t\] where the system is corrupted by noise \(N_t\), and \(Y_t\) and \(X_t\) are ARIMA processes.
Note \(\delta(\mathbb{L}) = 1 - \delta_1 \mathbb{L} -\cdots - \delta_r \mathbb{L}^r\) and \(\omega(\mathbb{L}) = \omega_0 + \omega_1 \mathbb{L} +\cdots + \omega_s \mathbb{L}^s\).
Our target: identify \(\delta_1,\dots \delta_r\) and \(\omega_1,\dots, \omega_s\).
\((1 - 0.5\mathbb{L}) Y_t = 2 + 5 X_t + 2X_{t-1} + \varepsilon_t\) or, equivalently,
\(Y_t = 4 + \frac{(5 + 2\mathbb{L})}{(1 - 0.5\mathbb{L})} X_t + N_t\) (the constant becomes \(2/(1-0.5) = 4\)).
set.seed(2018); X = arima.sim(n=1000, list(ar=0.2,ma=0))
Y=rep(0,1000); Y[1] = X[1];
for (t in 2:1000){Y[t] = 2 + 0.5*Y[t-1] + 5*X[t] + 2*X[t-1] + rnorm(1)}
# naive static regression; note lag(X) on a ts is not aligned inside lm(),
# so it is collinear with X and dropped: the true dynamics are not recovered
summary(lm(Y~ 1 + X + lag(X)))$coef
##             Estimate Std. Error  t value      Pr(>|t|)
## (Intercept) 3.911712  0.1870019 20.91803  7.432406e-81
## X           5.866740  0.1801060 32.57381 3.906933e-159
\(Y_t = 4 + \frac{(5 + 2\mathbb{L})}{(1 - 0.5\mathbb{L})} X_t + N_t\)
library(MTS)   # MTS: multivariate time series
transf = tfm1(Y, X, orderX = c(1,1,0), orderN = c(0,0,0))   # fit the transfer function model
## Delay:  0
## Transfer function coefficients & s.e.:
## in the order: constant, omega, and delta:  1 2 1
##        [,1]   [,2]   [,3]    [,4]
## v    3.9142 5.0061 1.9735 0.49791
## se.v 0.0368 0.0361 0.0569 0.00482
# orderX = c(r, s, b): r and s are the degrees of the denominator (delta)
# and numerator (omega) polynomials, and b is the delay
acf(transf$residuals)
ccf(transf$residuals, X)
Detrend a trending process.
Test for the existence of a unit root.
Relation between two time series.
Identify the transfer function of this relation.