2018/11/19

Outline

Last lecture:

  • Test and removal of non-stationarities.

  • Identify the transfer function for two time series

This lecture:

  • Cointegration

  • Vector AR

Cointegration

  • We have seen: two independent processes with a significant correlation: \((1- \mathbb{L}) Y_t = \varepsilon_t\) and \((1- \mathbb{L}) X_t = e_t\) where \(e_t\) and \(\varepsilon_t\) are independent.

  • However, there is also the case that two series containing unit roots are related. Such series are said to be cointegrated.

  • Two non-stationary time series \(X_t\) and \(Y_t\) are cointegrated if some linear combination \(a X_t + b Y_t\), with constant \(a\) and \(b\), is a stationary series.

  • Cointegrated system is a stable but non-stationary system: \[Y_t = -\frac{a}{b}X_t + \mbox{Stationary components}.\]

Cointegration

  • Phillips-Ouliaris test is a test of the null hypothesis that the two series are not cointegrated. (\(H_0\): Two series are not cointegrated)
library(tseries); x = rep(0,100); y = x; set.seed(2018)
for(i in 2:100) {x[i] = x[i-1] + rnorm(1); y[i] = y[i-1] + rnorm(1)}
cor(x,y); po.test(cbind(x,-y))
## [1] 0.3856345
## 
##  Phillips-Ouliaris Cointegration Test
## 
## data:  cbind(x, -y)
## Phillips-Ouliaris demeaned = -5.8011, Truncation lag parameter =
## 0, p-value = 0.15

Cointegration

  • Warning: PO test is not always reliable. For example, see the obvious problematic report given below:
x = rep(0,100); y = x; set.seed(2019)
for(i in 2:100) {x[i] = x[i-1] + rnorm(1); y[i] = y[i-1] + rnorm(1)}
cor(x,y); po.test(cbind(x,-y))
## [1] 0.8121847
## 
##  Phillips-Ouliaris Cointegration Test
## 
## data:  cbind(x, -y)
## Phillips-Ouliaris demeaned = -22.704, Truncation lag parameter =
## 0, p-value = 0.03333

Cointegration

  • \(H_0\): Two series are not cointegrated
X = arima.sim(list(order = c(1,1,0), ar = 0.5), n = 250)
Y = 2 + 5 *X + 4*rnorm(250)
po.test(cbind(5*X,Y))
## 
##  Phillips-Ouliaris Cointegration Test
## 
## data:  cbind(5 * X, Y)
## Phillips-Ouliaris demeaned = -263.66, Truncation lag parameter =
## 2, p-value = 0.01

Cointegration

  • \(H_0\): Two series are not cointegrated
X = Y = m = rep(0,250)
for (i in 2:250){ m[i] = m[i-1] + rnorm(1) }
X = m + rnorm(250); Y = m + rnorm(250)
po.test(cbind(X,Y))
## 
##  Phillips-Ouliaris Cointegration Test
## 
## data:  cbind(X, Y)
## Phillips-Ouliaris demeaned = -234.89, Truncation lag parameter =
## 2, p-value = 0.01

Commodities

  • Gold and Oil are deemed to have a complementary correlation. (When crude oil prices increase, inflation also rises. Gold is known to be a good hedge against inflation.)

  • Is this co-movement caused by a co-integrated system?

library(Quandl); library(xts); library(lubridate); library(forecast); 
library(lmtest); library(MTS)
gold =  Quandl("NASDAQOMX/NQCIGCER", type = "xts", 
     collapse = "daily", start_date="2016-01-01", end_date="2018-11-18")
oil =  Quandl("NASDAQOMX/NQCICLER", type = "xts", 
     collapse = "daily", start_date="2016-01-01", end_date="2018-11-18")
gold = xts(gold$"Index Value"); oil = xts(oil$"Index Value")

Commodities

plot(cbind(oil,gold-500), col = c("blue","black") )
legend(x = "topleft", legend = c("Oil", "Gold (minus 500USD)"), 
       lty = 1,col = c("blue","black"))

Commodities

acf(gold)

Commodities

diff.gold = diff(gold); diff.gold=na.omit(diff.gold); acf(diff.gold)

Commodities

acf(oil)

Commodities

diff.oil = diff(oil); diff.oil=na.omit(diff.oil); acf(diff.oil)

Commodities

adf.test(diff.gold)
## 
##  Augmented Dickey-Fuller Test
## 
## data:  diff.gold
## Dickey-Fuller = -9.2131, Lag order = 8, p-value = 0.01
## alternative hypothesis: stationary
adf.test(diff.oil)
## 
##  Augmented Dickey-Fuller Test
## 
## data:  diff.oil
## Dickey-Fuller = -8.7708, Lag order = 8, p-value = 0.01
## alternative hypothesis: stationary

Commodities

cor(gold,oil)
##             Index Value
## Index Value  -0.1945119
po.test(cbind(oil,-2*gold))
## 
##  Phillips-Ouliaris Cointegration Test
## 
## data:  cbind(oil, -2 * gold)
## Phillips-Ouliaris demeaned = -7.3387, Truncation lag parameter =
## 7, p-value = 0.15

Commodities

coeftest(lm(oil~1+ gold))
## 
## t test of coefficients:
## 
##               Estimate Std. Error t value  Pr(>|t|)    
## (Intercept) 508.734331  36.959304 13.7647 < 2.2e-16 ***
## gold         -0.261057   0.048927 -5.3357 1.274e-07 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Commodities

coeftest(lm(diff.oil~1+ diff.gold))
## 
## t test of coefficients:
## 
##             Estimate Std. Error t value Pr(>|t|)
## (Intercept) 0.049795   0.227881  0.2185   0.8271
## diff.gold   0.011545   0.037593  0.3071   0.7588

Commodities

  • Get a closer look by using ARIMA models and transfer functions.
auto.arima(gold) # automatically select the suitable ARIMA model
## Series: gold 
## ARIMA(0,1,0)                    
## 
## sigma^2 estimated as 36.75:  log likelihood=-2335.19
## AIC=4672.39   AICc=4672.39   BIC=4676.97

Commodities

auto.arima(oil)
## Series: oil 
## ARIMA(2,1,1)                    
## 
## Coefficients:
##          ar1     ar2      ma1
##       0.6619  0.0527  -0.7024
## s.e.  0.3114  0.0374   0.3101
## 
## sigma^2 estimated as 37.6:  log likelihood=-2342
## AIC=4692   AICc=4692.06   BIC=4710.35

Commodities

  • Given the series of gold (\(G_t\)) and oil (\(O_t\)), ARIMA models are: \[(1-\mathbb{L})G_t = v_{t} \\ \phi(\mathbb{L})(1-\mathbb{L})O_t = \theta(\mathbb{L}) \varepsilon_{t}\] while \(v_{t}\) and \(\varepsilon_{t}\) may be correlated.

  • Replace \(v_{t}\) with \(\varepsilon_{t}= \theta(\mathbb{L})^{-1} \phi(\mathbb{L})(1-\mathbb{L})O_t\), then the transfer function representation is \[G_t = \theta(\mathbb{L})^{-1} \phi(\mathbb{L}) O_t + N_t\] where \(r\)-order polynomial \(\theta(z)\) is \(1\) and \(s\)-order polynomial \(\phi(z)\) is \(2\).

Commodities

tfm1(gold, oil, orderX= c(1,2,0), orderN=c(0,0,0))
## Delay:  0 
## Transfer function coefficients & s.e.: 
## in the order: constant, omega, and delta: 1 3 1 
##        [,1]  [,2]   [,3]   [,4]  [,5]
## v    803.17 0.204 0.0679 -0.397 0.193
## se.v   8.48 0.208 0.3194  0.220 0.164
  • \(G_t = 803 + \frac{(0.2 + 0.06\mathbb{L} - 0.4\mathbb{L}^2)}{(1 - 0.2\mathbb{L})} O_t + N_t\)

  • t-statistic for \(|\frac{-0.39 -0}{0.22}| \approx 1.78\) for a sample larger than \(600\).

Beyond Two Series

  • In a system of many series, a simultaneously movement of two related series may be driven by some factors from the other series.

  • It is important to describe the joint movement patterns for essential series in this system.

  • For example, J. Tinbergen (1936 Dutch economy, 1939 US economy): 24 - 48 equations

  • Tinbergen thinks about capturing the joint behavior quantitatively through the specification of a system of equations, followed by the estimation of their parameters.

Beyond Two Series

  • Macroeconomic system:describes the joint behavior of a number of aggregate variables. Moreover, these varaibles are correlated and may have simultaneous dynamics.

  • Tinbergen’s equations: groups (labor and others), commodities (consumer and investment), income (consumption and saving), time dummies (earning and receiving), stages (finished goods and raw materials), economies (NL/US and the others)

  • C. Sims chose 9 variables suggested by Litterman (A Nine Variable Probabilistic Macroeconomic Forecasting Model): M1, GDP Deflator, GDP, fixed investment, unemployment rate, dollar index, S&P 500 index, CPI, Treasury Bill.

Nine Variables in the U.S. Economy

  • 3-Month Treasury Bill: Secondary Market Rate (FRED/TB3MS)
  • M1: Currency Component of M1 (FRED/CURRNS)
  • GDP Implicit Price Deflator: inflation (FRED/GDPDEF)
  • GDP: economic output (FRED/GDP)
  • Real business fixed investment: Gross Private Domestic Investment (FRED/GPDI)
  • Unemployment: Civilian Unemployment Rate (FRED/UNRATE)
  • Dollar index: Trade weighted value of dollar (FRED/TWEXBMTH)
  • S&P 500 index: equity market valuation
  • Commodity price index: CPI Index All Urban Customers All Items (FRED/CPIAUCSL)

Data

library(Quandl); library(xts); library(lubridate); library(forecast)

econIndicators = c("Unemployment Rate" = "FRED/UNRATE", 
"Funds" = "FRED/FEDFUNDS", "Treasury Bill" = "FRED/TB3MS", 
"CPI" = "FRED/CPIAUCSL", "M1" = "FRED/CURRNS", "Deflator" = "FRED/GDPDEF", 
"GDP" = "FRED/GDP", "Investment" = "FRED/GPDI", 
"Dollar Index" = "FRED/TWEXBMTH")

US_ALL = Quandl(econIndicators, type="xts", collapse = "quarterly",
                start_date="1948-01-01", end_date="2018-09-01")
colnames(US_ALL)= names(econIndicators)

Data

A Simple Vector AR

  • AR(p) : \[Y_{t}=\phi_{1}Y_{t-1}+\cdots+\phi_{p}Y_{t-p}+\varepsilon_{t},\quad\varepsilon_{t}\sim WN(0,\sigma^{2})\]

  • Write AR(p) as 1st order VAR \[\left[\begin{array}{c} Y_{t}\\ Y_{t-1}\\ Y_{t-2}\\ \vdots\\ Y_{t-p+1} \end{array}\right]=\left[\begin{array}{ccccc} \phi_{1} & \phi_{2} & \cdots & \cdots & \phi_{p}\\ 1 & 0 & \cdots & \cdots & 0\\ 0 & 1 & \ddots & \cdots & 0\\ \vdots & \vdots & \ddots & \cdots & \vdots\\ 0 & 0 & \cdots & 1 & 0 \end{array}\right]\left[\begin{array}{c} Y_{t-1}\\ Y_{t-2}\\ Y_{t-3}\\ \vdots\\ Y_{t-p} \end{array}\right]+\left[\begin{array}{c} \varepsilon_{t}\\ 0\\ 0\\ \vdots\\ 0 \end{array}\right]\]

or \(\underset{(p\times1)}{\mathbf{Y}_{t}}=\underset{(p\times p)}{\mathbf{\Phi}}\underset{(p\times1)}{\mathbf{Y}_{t-1}}+\underset{(p\times1)}{\mathbf{e}_t}\)

ARMA(2,1) as VAR(1)

  • Consider : \[Y_{t}=\phi_{1}Y_{t-1}+ \phi_{2}Y_{t-2}+ \varepsilon_{t} + \theta_1 \varepsilon_{t-1}.\]

  • The VAR representation is \[\left[\begin{array}{c} Y_{t}\\ Y_{t-1}\\ \varepsilon_{t}\\ \end{array}\right]=\left[\begin{array}{ccc} \phi_{1} & \phi_{2} & \theta_{1}\\ 1 & 0 & 0\\ 0 & 0 & 0\\ \end{array}\right]\left[\begin{array}{c} Y_{t-1}\\ Y_{t-2}\\ \varepsilon_{t-1}\\ \end{array}\right]+\left[\begin{array}{c} 1\\ 0\\ 1 \end{array} \right] \varepsilon_{t}.\]

Sometimes it more convenient to use the following form \[\underset{(3 \times1)}{\mathbf{Y}_{t}}=\underset{(3\times 3)}{\mathbf{\Phi}}\underset{(3\times1)}{\mathbf{Y}_{t-1}}+\underset{(3\times 3)}{\mathbf{C}} \underset{(3\times1)}{\mathbf{w}_t}\] where \(\mathbf{C} = \mbox{diag}[\sigma, 0, \sigma],\, \mathbb{E}[\mathbf{w}_t \mathbf{w}_t^{T}]= \mathbf{I}\) instead of using \(\mathbf{e}_t\).

Simulation of VAR

  • Multivariate white noise: The pair \(\{e_{1,t}, e_{2,t} \}\) are multivariate white noise if each individual series is white noise and the CCF for all \(k\neq 0\) \[\gamma_{e_1, e_2}(k)=\mbox{Cov}(e_{1,t}, e_{2,t+k}) = 0.\]

  • This argument can be extended to \(N\)-number of series \(\{e_{1,t}, \dots, e_{N,t}\}\).

  • It means the covariance of any two series in \(\{e_{1,t}, \dots, e_{N,t}\}\) at the same time \(t\) can be non-zero \(\mbox{Cov}(e_{i,t}, e_{j,t})\neq 0\).

  • An \(N\times N\) covariance matrix for the vector \(\mathbf{e}_t = [e_{1,t}, \dots, e_{N,t}]^T\).

Simulation of VAR

library(mvtnorm)
covmat = matrix(c(1,0.5,0.5,1) , nr=2)
e = rmvnorm(1000, sigma = covmat)
cov(e)
##           [,1]      [,2]
## [1,] 1.0451912 0.5690965
## [2,] 0.5690965 1.0639452

Simulation of VAR

e1 = e[,1]; e2 = e[,2]; ccf(e1,e2)

Simulation of VAR

  • Consider \[Y_{1, t}=0.4 Y_{1,t-1}+ 0.3 Y_{2, t-1}+ e_{1,t} \\ Y_{2, t}=0.2 Y_{1,t-1}+ 0.1 Y_{2, t-1}+ e_{2,t}.\]
y1 = y2 = rep(0, 1000); y1[1] = e1[1]; y2[1] = e2[1] 
for (t in 2:1000){
  y1[t] = 0.4*y1[t-1] + 0.3*y2[t-1] + e1[t]
  y2[t] = 0.2*y1[t-1] + 0.1*y2[t-1] + e2[t]
}

VAR

VAR(cbind(y1,y2), p=1, include.mean = FALSE)
## AR coefficient matrix 
## AR( 1 )-matrix 
##       [,1]  [,2]
## [1,] 0.382 0.332
## [2,] 0.205 0.129
## standard error 
##        [,1]   [,2]
## [1,] 0.0319 0.0372
## [2,] 0.0322 0.0375
##   
## Residuals cov-mtx: 
##           [,1]      [,2]
## [1,] 1.0445671 0.5692354
## [2,] 0.5692354 1.0654107
##   
## det(SSE) =  0.788864 
## AIC =  -0.2291614 
## BIC =  -0.2095304 
## HQ  =  -0.2217002

Summary

  • Cointegration of two series.

  • An empirical example.

  • An economic system with many interacting series.

  • A Matrix representation of these series.