Applied Time Series

2018/11/19

Outline

Last lecture:

Test and removal of non-stationarities.
Identify the transfer function for two time series

This lecture:

Cointegration
Vector AR

Cointegration

We have seen: two independent processes with a significant correlation: \((1- \mathbb{L}) Y_t = \varepsilon_t\) and \((1- \mathbb{L}) X_t = e_t\) where \(e_t\) and \(\varepsilon_t\) are independent.
However, there is also the case that two series containing unit roots are related. Such series are said to be cointegrated.
Two non-stationary time series \(X_t\) and \(Y_t\) are cointegrated if some linear combination \(a X_t + b Y_t\), with constant \(a\) and \(b\), is a stationary series.
Cointegrated system is a stable but non-stationary system: \[Y_t = -\frac{a}{b}X_t + \mbox{Stationary components}.\]

Cointegration

Phillips-Ouliaris test is a test of the null hypothesis that the two series are not cointegrated. (\(H_0\): Two series are not cointegrated)

library(tseries); x = rep(0,100); y = x; set.seed(2018)
for(i in 2:100) {x[i] = x[i-1] + rnorm(1); y[i] = y[i-1] + rnorm(1)}
cor(x,y); po.test(cbind(x,-y))

## [1] 0.3856345

## 
##  Phillips-Ouliaris Cointegration Test
## 
## data:  cbind(x, -y)
## Phillips-Ouliaris demeaned = -5.8011, Truncation lag parameter =
## 0, p-value = 0.15

Cointegration

Warning: PO test is not always reliable. For example, see the obvious problematic report given below:

x = rep(0,100); y = x; set.seed(2019)
for(i in 2:100) {x[i] = x[i-1] + rnorm(1); y[i] = y[i-1] + rnorm(1)}
cor(x,y); po.test(cbind(x,-y))

## [1] 0.8121847

## 
##  Phillips-Ouliaris Cointegration Test
## 
## data:  cbind(x, -y)
## Phillips-Ouliaris demeaned = -22.704, Truncation lag parameter =
## 0, p-value = 0.03333

Cointegration

\(H_0\): Two series are not cointegrated

X = arima.sim(list(order = c(1,1,0), ar = 0.5), n = 250)
Y = 2 + 5 *X + 4*rnorm(250)
po.test(cbind(5*X,Y))

## 
##  Phillips-Ouliaris Cointegration Test
## 
## data:  cbind(5 * X, Y)
## Phillips-Ouliaris demeaned = -263.66, Truncation lag parameter =
## 2, p-value = 0.01

Cointegration

\(H_0\): Two series are not cointegrated

X = Y = m = rep(0,250)
for (i in 2:250){ m[i] = m[i-1] + rnorm(1) }
X = m + rnorm(250); Y = m + rnorm(250)
po.test(cbind(X,Y))

## 
##  Phillips-Ouliaris Cointegration Test
## 
## data:  cbind(X, Y)
## Phillips-Ouliaris demeaned = -234.89, Truncation lag parameter =
## 2, p-value = 0.01

Commodities

Gold and Oil are deemed to have a complementary correlation. (When crude oil prices increase, inflation also rises. Gold is known to be a good hedge against inflation.)
Is this co-movement caused by a co-integrated system?

library(Quandl); library(xts); library(lubridate); library(forecast); 
library(lmtest); library(MTS)
gold =  Quandl("NASDAQOMX/NQCIGCER", type = "xts", 
     collapse = "daily", start_date="2016-01-01", end_date="2018-11-18")
oil =  Quandl("NASDAQOMX/NQCICLER", type = "xts", 
     collapse = "daily", start_date="2016-01-01", end_date="2018-11-18")
gold = xts(gold$"Index Value"); oil = xts(oil$"Index Value")

Commodities

plot(cbind(oil,gold-500), col = c("blue","black") )
legend(x = "topleft", legend = c("Oil", "Gold (minus 500USD)"), 
       lty = 1,col = c("blue","black"))

Commodities

acf(gold)

Commodities

diff.gold = diff(gold); diff.gold=na.omit(diff.gold); acf(diff.gold)

Commodities

acf(oil)

Commodities

diff.oil = diff(oil); diff.oil=na.omit(diff.oil); acf(diff.oil)

Commodities

adf.test(diff.gold)

## 
##  Augmented Dickey-Fuller Test
## 
## data:  diff.gold
## Dickey-Fuller = -9.2131, Lag order = 8, p-value = 0.01
## alternative hypothesis: stationary

adf.test(diff.oil)

## 
##  Augmented Dickey-Fuller Test
## 
## data:  diff.oil
## Dickey-Fuller = -8.7708, Lag order = 8, p-value = 0.01
## alternative hypothesis: stationary

Commodities

cor(gold,oil)

##             Index Value
## Index Value  -0.1945119

po.test(cbind(oil,-2*gold))

## 
##  Phillips-Ouliaris Cointegration Test
## 
## data:  cbind(oil, -2 * gold)
## Phillips-Ouliaris demeaned = -7.3387, Truncation lag parameter =
## 7, p-value = 0.15

Commodities

coeftest(lm(oil~1+ gold))

## 
## t test of coefficients:
## 
##               Estimate Std. Error t value  Pr(>|t|)    
## (Intercept) 508.734331  36.959304 13.7647 < 2.2e-16 ***
## gold         -0.261057   0.048927 -5.3357 1.274e-07 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Commodities

coeftest(lm(diff.oil~1+ diff.gold))

## 
## t test of coefficients:
## 
##             Estimate Std. Error t value Pr(>|t|)
## (Intercept) 0.049795   0.227881  0.2185   0.8271
## diff.gold   0.011545   0.037593  0.3071   0.7588

Commodities

Get a closer look by using ARIMA models and transfer functions.

auto.arima(gold) # automatically select the suitable ARIMA model

## Series: gold 
## ARIMA(0,1,0)                    
## 
## sigma^2 estimated as 36.75:  log likelihood=-2335.19
## AIC=4672.39   AICc=4672.39   BIC=4676.97

Commodities

auto.arima(oil)

## Series: oil 
## ARIMA(2,1,1)                    
## 
## Coefficients:
##          ar1     ar2      ma1
##       0.6619  0.0527  -0.7024
## s.e.  0.3114  0.0374   0.3101
## 
## sigma^2 estimated as 37.6:  log likelihood=-2342
## AIC=4692   AICc=4692.06   BIC=4710.35

Commodities

Given the series of gold (\(G_t\)) and oil (\(O_t\)), ARIMA models are: \[(1-\mathbb{L})G_t = v_{t} \\ \phi(\mathbb{L})(1-\mathbb{L})O_t = \theta(\mathbb{L}) \varepsilon_{t}\] while \(v_{t}\) and \(\varepsilon_{t}\) may be correlated.
Replace \(v_{t}\) with \(\varepsilon_{t}= \theta(\mathbb{L})^{-1} \phi(\mathbb{L})(1-\mathbb{L})O_t\), then the transfer function representation is \[G_t = \theta(\mathbb{L})^{-1} \phi(\mathbb{L}) O_t + N_t\] where \(r\)-order polynomial \(\theta(z)\) is \(1\) and \(s\)-order polynomial \(\phi(z)\) is \(2\).

Commodities

tfm1(gold, oil, orderX= c(1,2,0), orderN=c(0,0,0))

## Delay:  0 
## Transfer function coefficients & s.e.: 
## in the order: constant, omega, and delta: 1 3 1 
##        [,1]  [,2]   [,3]   [,4]  [,5]
## v    803.17 0.204 0.0679 -0.397 0.193
## se.v   8.48 0.208 0.3194  0.220 0.164

\(G_t = 803 + \frac{(0.2 + 0.06\mathbb{L} - 0.4\mathbb{L}^2)}{(1 - 0.2\mathbb{L})} O_t + N_t\)
t-statistic for \(|\frac{-0.39 -0}{0.22}| \approx 1.78\) for a sample larger than \(600\).

Beyond Two Series

In a system of many series, a simultaneously movement of two related series may be driven by some factors from the other series.
It is important to describe the joint movement patterns for essential series in this system.
For example, J. Tinbergen (1936 Dutch economy, 1939 US economy): 24 - 48 equations
Tinbergen thinks about capturing the joint behavior quantitatively through the specification of a system of equations, followed by the estimation of their parameters.

Beyond Two Series

Macroeconomic system：describes the joint behavior of a number of aggregate variables. Moreover, these varaibles are correlated and may have simultaneous dynamics.
Tinbergen’s equations: groups (labor and others), commodities (consumer and investment), income (consumption and saving), time dummies (earning and receiving), stages (finished goods and raw materials), economies (NL/US and the others)
C. Sims chose 9 variables suggested by Litterman (A Nine Variable Probabilistic Macroeconomic Forecasting Model): M1, GDP Deflator, GDP, fixed investment, unemployment rate, dollar index, S&P 500 index, CPI, Treasury Bill.

Nine Variables in the U.S. Economy

3-Month Treasury Bill: Secondary Market Rate (FRED/TB3MS)
M1: Currency Component of M1 (FRED/CURRNS)
GDP Implicit Price Deflator: inflation (FRED/GDPDEF)
GDP: economic output (FRED/GDP)
Real business fixed investment: Gross Private Domestic Investment (FRED/GPDI)
Unemployment: Civilian Unemployment Rate (FRED/UNRATE)
Dollar index: Trade weighted value of dollar (FRED/TWEXBMTH)
S&P 500 index: equity market valuation
Commodity price index: CPI Index All Urban Customers All Items (FRED/CPIAUCSL)

Data

library(Quandl); library(xts); library(lubridate); library(forecast)

econIndicators = c("Unemployment Rate" = "FRED/UNRATE", 
"Funds" = "FRED/FEDFUNDS", "Treasury Bill" = "FRED/TB3MS", 
"CPI" = "FRED/CPIAUCSL", "M1" = "FRED/CURRNS", "Deflator" = "FRED/GDPDEF", 
"GDP" = "FRED/GDP", "Investment" = "FRED/GPDI", 
"Dollar Index" = "FRED/TWEXBMTH")

US_ALL = Quandl(econIndicators, type="xts", collapse = "quarterly",
                start_date="1948-01-01", end_date="2018-09-01")
colnames(US_ALL)= names(econIndicators)

Data

A Simple Vector AR

AR(p) : \[Y_{t}=\phi_{1}Y_{t-1}+\cdots+\phi_{p}Y_{t-p}+\varepsilon_{t},\quad\varepsilon_{t}\sim WN(0,\sigma^{2})\]
Write AR(p) as 1st order VAR \[\left[\begin{array}{c} Y_{t}\\ Y_{t-1}\\ Y_{t-2}\\ \vdots\\ Y_{t-p+1} \end{array}\right]=\left[\begin{array}{ccccc} \phi_{1} & \phi_{2} & \cdots & \cdots & \phi_{p}\\ 1 & 0 & \cdots & \cdots & 0\\ 0 & 1 & \ddots & \cdots & 0\\ \vdots & \vdots & \ddots & \cdots & \vdots\\ 0 & 0 & \cdots & 1 & 0 \end{array}\right]\left[\begin{array}{c} Y_{t-1}\\ Y_{t-2}\\ Y_{t-3}\\ \vdots\\ Y_{t-p} \end{array}\right]+\left[\begin{array}{c} \varepsilon_{t}\\ 0\\ 0\\ \vdots\\ 0 \end{array}\right]\]

or \(\underset{(p\times1)}{\mathbf{Y}_{t}}=\underset{(p\times p)}{\mathbf{\Phi}}\underset{(p\times1)}{\mathbf{Y}_{t-1}}+\underset{(p\times1)}{\mathbf{e}_t}\)

ARMA(2,1) as VAR(1)

Consider : \[Y_{t}=\phi_{1}Y_{t-1}+ \phi_{2}Y_{t-2}+ \varepsilon_{t} + \theta_1 \varepsilon_{t-1}.\]
The VAR representation is \[\left[\begin{array}{c} Y_{t}\\ Y_{t-1}\\ \varepsilon_{t}\\ \end{array}\right]=\left[\begin{array}{ccc} \phi_{1} & \phi_{2} & \theta_{1}\\ 1 & 0 & 0\\ 0 & 0 & 0\\ \end{array}\right]\left[\begin{array}{c} Y_{t-1}\\ Y_{t-2}\\ \varepsilon_{t-1}\\ \end{array}\right]+\left[\begin{array}{c} 1\\ 0\\ 1 \end{array} \right] \varepsilon_{t}.\]

Sometimes it more convenient to use the following form \[\underset{(3 \times1)}{\mathbf{Y}_{t}}=\underset{(3\times 3)}{\mathbf{\Phi}}\underset{(3\times1)}{\mathbf{Y}_{t-1}}+\underset{(3\times 3)}{\mathbf{C}} \underset{(3\times1)}{\mathbf{w}_t}\] where \(\mathbf{C} = \mbox{diag}[\sigma, 0, \sigma],\, \mathbb{E}[\mathbf{w}_t \mathbf{w}_t^{T}]= \mathbf{I}\) instead of using \(\mathbf{e}_t\).

Simulation of VAR

Multivariate white noise: The pair \(\{e_{1,t}, e_{2,t} \}\) are multivariate white noise if each individual series is white noise and the CCF for all \(k\neq 0\) \[\gamma_{e_1, e_2}(k)=\mbox{Cov}(e_{1,t}, e_{2,t+k}) = 0.\]
This argument can be extended to \(N\)-number of series \(\{e_{1,t}, \dots, e_{N,t}\}\).
It means the covariance of any two series in \(\{e_{1,t}, \dots, e_{N,t}\}\) at the same time \(t\) can be non-zero \(\mbox{Cov}(e_{i,t}, e_{j,t})\neq 0\).
An \(N\times N\) covariance matrix for the vector \(\mathbf{e}_t = [e_{1,t}, \dots, e_{N,t}]^T\).

Simulation of VAR

library(mvtnorm)
covmat = matrix(c(1,0.5,0.5,1) , nr=2)
e = rmvnorm(1000, sigma = covmat)
cov(e)

##           [,1]      [,2]
## [1,] 1.0451912 0.5690965
## [2,] 0.5690965 1.0639452

Simulation of VAR

e1 = e[,1]; e2 = e[,2]; ccf(e1,e2)

Simulation of VAR

Consider \[Y_{1, t}=0.4 Y_{1,t-1}+ 0.3 Y_{2, t-1}+ e_{1,t} \\ Y_{2, t}=0.2 Y_{1,t-1}+ 0.1 Y_{2, t-1}+ e_{2,t}.\]

y1 = y2 = rep(0, 1000); y1[1] = e1[1]; y2[1] = e2[1] 
for (t in 2:1000){
  y1[t] = 0.4*y1[t-1] + 0.3*y2[t-1] + e1[t]
  y2[t] = 0.2*y1[t-1] + 0.1*y2[t-1] + e2[t]
}

VAR

VAR(cbind(y1,y2), p=1, include.mean = FALSE)

## AR coefficient matrix 
## AR( 1 )-matrix 
##       [,1]  [,2]
## [1,] 0.382 0.332
## [2,] 0.205 0.129
## standard error 
##        [,1]   [,2]
## [1,] 0.0319 0.0372
## [2,] 0.0322 0.0375
##   
## Residuals cov-mtx: 
##           [,1]      [,2]
## [1,] 1.0445671 0.5692354
## [2,] 0.5692354 1.0654107
##   
## det(SSE) =  0.788864 
## AIC =  -0.2291614 
## BIC =  -0.2095304 
## HQ  =  -0.2217002

Summary

Cointegration of two series.
An empirical example.
An economic system with many interacting series.
A Matrix representation of these series.