Vector Autoregressions (VAR)

What is VAR?

The vector autoregression (VAR) model extends the univariate autoregressive model to a multivariate setting. It is one of the most widely used models for describing the dynamic interactions between the variables in a system.

VAR Representations

Suppose you want to study the interactions between GDP (\( y_1 \)) and the money supply (\( y_2 \)), and you believe that they have a contemporaneous effect on each other and that their relationship can be modelled as:

Structural Form:

\[ y_{1,t} = a_1 - b_{12}y_{2,t} + c_{11}y_{1,t-1} + c_{12}y_{2,t-1} + \epsilon_{1,t} \] \[ y_{2,t} = a_2 - b_{21}y_{1,t} + c_{21}y_{1,t-1} + c_{22}y_{2,t-1} + \epsilon_{2,t} \]

The above 2 equations are in structural form, as each equation contains the contemporaneous value of the other variable (through the coefficients \( b_{12} \) and \( b_{21} \)) as a regressor. The error terms of the two equations are uncorrelated (\( cov(\epsilon_{1,t}, \epsilon_{2,t}) = 0 \)): the source of a shock to GDP should be unrelated to the source of a shock to the money supply.

The above 2 equations can be written in matrix form:

\[ \left[ \begin{array}{cc}1 & b_{12} \\ b_{21} & 1 \end{array} \right] \left[ \begin{array}{c}y_{1,t} \\ y_{2,t} \end{array} \right] = \left[ \begin{array}{c}a_1 \\ a_2 \end{array} \right] + \left[ \begin{array}{cc}c_{11} & c_{12} \\ c_{21} & c_{22} \end{array} \right] \left[ \begin{array}{c}y_{1,t-1} \\ y_{2,t-1} \end{array} \right] + \left[ \begin{array}{c}\epsilon_{1,t} \\ \epsilon_{2,t} \end{array} \right] \]

\[ By_t = a + C y_{t-1} + \epsilon_t \]

What do the coefficients mean?

The coefficients \( b_{12} \) and \( b_{21} \) capture the contemporaneous interactions between \( y_1 \) and \( y_2 \): if \( b_{12} \neq 0 \), then \( y_2 \) has a contemporaneous effect on \( y_1 \). The same applies for \( b_{21} \).

Now, a shock to GDP (\( \epsilon_{1,t} \)) affects \( y_{1,t} \) and, through it, \( y_{2,t} \) within the same period, so \( \epsilon_{1,t} \) is correlated with the regressor \( y_{2,t} \). Because of this, the two equations above can NOT be estimated by OLS, as this violates one of the Gauss-Markov assumptions - that the error term and the regressors are independent. (Please refer to my previous post on OLS.)
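
Before moving on, here is a minimal simulation sketching this endogeneity problem. It uses hypothetical parameter values (bb12, bb21) and drops the constants and lags for simplicity; the variable names are my own and not part of the model above.

# tiny simulation (hypothetical values, lags and constants dropped):
# a shock eps1 moves x1, which moves x2 within the same period,
# so the regressor x2 is correlated with eps1
set.seed(1)
N <- 5000
bb12 <- 0.4; bb21 <- 0.5
eps1 <- rnorm(N); eps2 <- rnorm(N)
delta <- 1 - bb12 * bb21
x1 <- (eps1 - bb12 * eps2) / delta   # solves x1 = -bb12 * x2 + eps1
x2 <- (eps2 - bb21 * eps1) / delta   # solves x2 = -bb21 * x1 + eps2
cor(eps1, x2)   # clearly non-zero, so OLS on the x1 equation would be biased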

Rewrite the equations in Reduced Form

The problem with the structural form VAR is that it cannot be estimated by OLS, so you may be tempted to rewrite it as a reduced form VAR that can be estimated by OLS.

Assuming the matrix \( B = \left[ \begin{array}{cc}1 & b_{12} \\ b_{21} & 1 \end{array} \right] \) can be inverted: \( B^{-1} = \frac{1}{\Delta}\left[\begin{array}{cc}1 & -b_{12} \\ -b_{21} & 1 \end{array}\right] \), where \( \Delta = 1-b_{12}b_{21} \):

\[ By_t = a + C y_{t-1} + \epsilon_t \] \[ y_t = B^{-1}a + B^{-1} C y_{t-1} + B^{-1}\epsilon_t \]

\[ \left[ \begin{array}{c}y_{1,t} \\ y_{2,t} \end{array} \right] = \left[ \begin{array}{cc}1 & b_{12} \\ b_{21} & 1 \end{array} \right]^{-1} \left[ \begin{array}{c}a_1 \\ a_2 \end{array} \right] + \left[ \begin{array}{cc}1 & b_{12} \\ b_{21} & 1 \end{array} \right]^{-1} \left[ \begin{array}{cc}c_{11} & c_{12} \\ c_{21} & c_{22} \end{array} \right] \left[ \begin{array}{c}y_{1,t-1} \\ y_{2,t-1} \end{array} \right] + \left[ \begin{array}{cc}1 & b_{12} \\ b_{21} & 1 \end{array} \right]^{-1} \left[ \begin{array}{c}\epsilon_{1,t} \\ \epsilon_{2,t} \end{array} \right] \]

Letting \( B^{-1}a = \gamma \), \( B^{-1}C = \Gamma \) and \( B^{-1}\epsilon_t = e_t \), the equation becomes:

\[ y_t = B^{-1}a + B^{-1} C y_{t-1} + B^{-1}\epsilon_t \] \[ y_t = \gamma+ \Gamma y_{t-1} + e_t \]

\[ \left[ \begin{array}{c}y_{1,t} \\ y_{2,t} \end{array} \right] = \left[ \begin{array}{c}\gamma_1 \\ \gamma_2 \end{array} \right] + \left[ \begin{array}{cc}\gamma_{11} & \gamma_{12} \\ \gamma_{21} & \gamma_{22} \end{array} \right] \left[ \begin{array}{c}y_{1,t-1} \\ y_{2,t-1} \end{array} \right] + \left[ \begin{array}{c}e_{1,t} \\ e_{2,t} \end{array} \right] \]

The equation is now in reduced form, as each variable is expressed only in terms of lagged values; no contemporaneous terms appear on the right-hand side.

The residual terms \( e_t \) are correlated with each other. Recall that the covariance matrix of the structural error terms (\( \epsilon_t \)) is: \[ Var(\epsilon_t) = E(\epsilon_t \epsilon_t') = \left[\begin{array}{cc}\sigma_{\epsilon_1}^2 & 0 \\ 0 & \sigma_{\epsilon_2}^2 \end{array} \right] = D \]

Since \( B^{-1}\epsilon_t = e_t \), the covariances of error terms (\( e_t \)) in reduced form are: \[ Var(e_t) = E(e_t e_t') = E(B^{-1}\epsilon_t (B^{-1}\epsilon_t)') = E(B^{-1} \epsilon_t \epsilon_t' B^{-1'} ) = B^{-1} D B^{-1'} = \Sigma \]

The elements of \( \Sigma \) are: \( \left[ \begin{array}{cc}\sigma_{e_{11}}^2 & \sigma_{e_{12}}^2 \\ \sigma_{e_{21}}^2 & \sigma_{e_{22}}^2 \end{array}\right] \) = \( \frac{1}{\Delta^2} \left[\begin{array}{cc}\sigma_{\epsilon_1}^2+b_{12}^2\sigma_{\epsilon_2}^2 & -(b_{21}\sigma_{\epsilon_1}^2 + b_{12}\sigma_{\epsilon_2}^2) \\ -(b_{21}\sigma_{\epsilon_1}^2 + b_{12}\sigma_{\epsilon_2}^2) & \sigma_{\epsilon_2}^2+b_{21}^2\sigma_{\epsilon_1}^2 \end{array}\right] \)

The covariance between \( e_1 \) and \( e_2 \) will be zero when both \( b_{12} \) and \( b_{21} \) are zero, in which case \( y_1 \) and \( y_2 \) have no contemporaneous effect on each other.
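
A quick numeric sanity check of the formula above, with hypothetical values for \( b_{12} \), \( b_{21} \) and the structural variances (the .chk names below are illustrative only):

# the matrix product B^{-1} D B^{-1}' should match the element-wise formula above
b12.chk <- 0.3; b21.chk <- 0.6
s1.chk <- 1.5; s2.chk <- 0.8
delta.chk <- 1 - b12.chk * b21.chk
Binv.chk <- matrix(c(1, -b12.chk, -b21.chk, 1), 2, 2, byrow = TRUE) / delta.chk
D.chk <- diag(c(s1.chk^2, s2.chk^2))
Binv.chk %*% D.chk %*% t(Binv.chk)
# e.g. the [1, 1] element equals (s1.chk^2 + b12.chk^2 * s2.chk^2) / delta.chk^2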

Stationarity of VAR

In lag operator notation, the reduced form VAR model can be written as: \[ y_t = \gamma+ \Gamma y_{t-1} + e_t \] \[ y_t - \Gamma y_{t-1} = \gamma + e_t \] \[ (I-\Gamma L) y_t = \gamma + e_t \] \[ \Gamma(L)y_t = \gamma + e_t \]

For stationarity, the model should have a moving average (MA) or Wold representation, which means the lag polynomial \( \Gamma(L) \) must be invertible. \( \Gamma(L) \) is invertible if the characteristic roots of \( det(I - \Gamma z) = 0 \) lie outside the unit circle, or equivalently, if the eigenvalues of the companion matrix \( F \) have modulus less than one:

\[ F = \left[\begin{array}{cccc} \Gamma_1 & \Gamma_2 & \cdots & \Gamma_p \\ I_n & 0 & \cdots & 0 \\ \vdots & \ddots & \ddots & \vdots \\ 0 & \cdots & I_n & 0 \end{array} \right] \]

In our case, with only one lag (a VAR(1)), the companion matrix \( F \) is simply:

\[ F = \Gamma \]
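
For a higher-order VAR the same check goes through the companion matrix. Below is a small sketch for a bivariate VAR(2) with hypothetical coefficient matrices (Gamma1 and Gamma2 are made-up values, not estimates from this post):

# companion matrix of a bivariate VAR(2): F = [Gamma1 Gamma2; I 0]
Gamma1 <- matrix(c(0.5, 0.1, 0.2, 0.3), 2, 2, byrow = TRUE)
Gamma2 <- matrix(c(0.1, 0.0, 0.0, 0.1), 2, 2, byrow = TRUE)
F.comp <- rbind(cbind(Gamma1, Gamma2),
                cbind(diag(2), matrix(0, 2, 2)))
Mod(eigen(F.comp, only.values = TRUE)$values)   # all moduli < 1 here, so this VAR(2) is stationary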

Estimation of VAR

So now the problem is, how can we estimate the structural VAR model? The steps are:

  1. Estimate the reduced form VAR model first.
  2. Find the Covariance matrix of the residuals of the estimated equations in 1.
  3. Apply Choleski decomposition for the covariance matrix to obtain matrix \( B^{-1} \).
  4. Pre-multiply the reduced form VAR model by the matrix \( B \) to recover the structural form.

The True model is:

\[ y_t = \gamma+ \Gamma y_{t-1} + e_t \] \[ y_t = B^{-1}a + B^{-1} C y_{t-1} + B^{-1}\epsilon_t \]

If we can estimate the matrix \( B \), then we can pre-multiply the above equation by it to retrieve the TRUE model. But in practice, there is no way to estimate the matrix \( B \) without imposing any restriction. Why? Let's consider the following approach:

\[ \left[ \begin{array}{c}y_{1,t} \\ y_{2,t} \end{array} \right] = \left[ \begin{array}{c}\gamma_1 \\ \gamma_2 \end{array} \right] + \left[ \begin{array}{cc}\gamma_{11} & \gamma_{12} \\ \gamma_{21} & \gamma_{22} \end{array} \right] \left[ \begin{array}{c}y_{1,t-1} \\ y_{2,t-1} \end{array} \right] + \left[ \begin{array}{c}e_{1,t} \\ e_{2,t} \end{array} \right] \]

We can estimate the equations by OLS to obtain:

\[ y_t = \hat{\gamma}+ \hat{\Gamma} y_{t-1} + \hat{e_t} \]

The covariance matrix of the residuals is: \[ E(\hat{e_t} \hat{e_t}') = \hat{\Sigma} \]

Next, we perform Choleski decomposition for the matrix \( \hat{\Sigma} \) \[ \hat{\Sigma} = L \Lambda L' \]

As mentioned before, the True covariance matrix is \( \Sigma = B^{-1}DB^{-1'} \), and the elements of \( \Sigma \) are:

\[ \left[ \begin{array}{cc}\sigma_{e_{11}}^2 & \sigma_{e_{12}}^2 \\ \sigma_{e_{21}}^2 & \sigma_{e_{22}}^2 \end{array}\right] = \frac{1}{\Delta^2} \left[\begin{array}{cc}\sigma_{\epsilon_1}^2+b_{12}^2\sigma_{\epsilon_2}^2 & -(b_{21}\sigma_{\epsilon_1}^2 + b_{12}\sigma_{\epsilon_2}^2) \\ -(b_{21}\sigma_{\epsilon_1}^2 + b_{12}\sigma_{\epsilon_2}^2) & \sigma_{\epsilon_2}^2+b_{21}^2\sigma_{\epsilon_1}^2 \end{array}\right] \]

The problem we encounter now is that, even if we had the true covariance matrix \( \Sigma \), we could not estimate the 4 parameters (\( b_{12}, b_{21}, \sigma_{\epsilon_1}, \sigma_{\epsilon_2} \)) from only 3 observed values (\( \sigma_{e11} \), \( \sigma_{e22} \), \( \sigma_{e12}=\sigma_{e21} \)). This implies that some restriction must be imposed in order to estimate the model. The most commonly used restriction is to set \( b_{12} = 0 \), which assumes that \( y_2 \) has no contemporaneous effect on \( y_1 \) while \( y_1 \) does have a contemporaneous effect on \( y_2 \). When making this assumption, you should focus on the economic meaning rather than the statistical meaning of the equation.
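
To see why the \( b_{12} = 0 \) restriction identifies the system, note that \( B^{-1} \) is then unit lower triangular, so \( \Sigma = B^{-1}DB^{-1'} \) is exactly an LDL' (Choleski-type) factorisation, which is unique. A small sketch with hypothetical values (the .ex names are illustrative only):

# with b12 = 0 the Choleski factors of Sigma recover b21 and the shock variances
b21.ex <- 0.5; s1.ex <- 1.2; s2.ex <- 0.7
Binv.ex <- matrix(c(1, 0, -b21.ex, 1), 2, 2, byrow = TRUE)
Sigma.ex <- Binv.ex %*% diag(c(s1.ex^2, s2.ex^2)) %*% t(Binv.ex)
ch.ex <- chol(Sigma.ex)
dd.ex <- diag(ch.ex)
t(ch.ex / dd.ex)   # recovers B^{-1}, i.e. -b21.ex in position [2, 1]
diag(dd.ex^2)      # recovers the structural variances s1.ex^2 and s2.ex^2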

Try it out! Generating the data

To see how the estimation works, let's get our hands dirty now. We generate the data \( y_1 \) and \( y_2 \) from the following bivariate SVAR(1) model:

# parameters of the y1 equation
a1 <- 1.6
b12 <- 0
c11 <- 0.5
c12 <- 0.8

# parameters of the y2 equation
a2 <- -0.8
b21 <- 0.5
c21 <- 0.8
c22 <- 0.3

\[ y_{1,t} = 1.6 - 0y_{2,t} + 0.5y_{1,t-1} + 0.8y_{2,t-1} + \epsilon_{1,t} \] \[ y_{2,t} = -0.8 - 0.5y_{1,t} + 0.8y_{1,t-1} + 0.3y_{2,t-1} + \epsilon_{2,t} \]

\[ \left[ \begin{array}{cc}1 & 0 \\ 0.5 & 1 \end{array} \right] \left[ \begin{array}{c}y_{1,t} \\ y_{2,t} \end{array} \right] = \left[ \begin{array}{c}1.6 \\ -0.8 \end{array} \right] + \left[ \begin{array}{cc}0.5 & 0.8 \\ 0.8 & 0.3 \end{array} \right] \left[ \begin{array}{c}y_{1,t-1} \\ y_{2,t-1} \end{array} \right] + \left[ \begin{array}{c}\epsilon_{1,t} \\ \epsilon_{2,t} \end{array} \right] \]

*As we will impose the restriction of \( b_{12} =0 \) in the estimation process, the parameter \( b_{12} \) is set to zero in the data generating process.

To generate \( y_1 \) and \( y_2 \), we first rewrite the structural form equations through substitution.

\[ y_{1,t} = a_1 - b_{12}y_{2,t} + c_{11}y_{1,t-1} + c_{12}y_{2,t-1} + \epsilon_{1,t} \] \[ y_{2,t} = a_2 - b_{21}y_{1,t} + c_{21}y_{1,t-1} + c_{22}y_{2,t-1} + \epsilon_{2,t} \]

Let's substitute \( y_{2,t} \) into the equation for \( y_{1,t} \):

\[ y_{1,t} = a_1 - a_2 b_{12} + b_{12}b_{21}y_{1,t} + (c_{11}-b_{12}c_{21})y_{1,t-1}+(c_{12}-b_{12}c_{22})y_{2,t-1} - b_{12}\epsilon_{2,t}+\epsilon_{1,t} \]

\[ y_{1,t} = \frac{1}{1-b_{12}b_{21}} (a_1 - a_2 b_{12} + (c_{11}-b_{12}c_{21})y_{1,t-1}+(c_{12}-b_{12}c_{22})y_{2,t-1} -b_{12}\epsilon_{2,t}+\epsilon_{1,t}) \]

Once we have \( y_{1,t} \), we can substitute it back to get \( y_{2,t} \).

Let's generate the data in R.


# generate 1,000 independent surprise terms (structural innovations) for y1 and y2
set.seed(13)
y1.innov <- rnorm(1000, 0, 1)
y2.innov <- rnorm(1000, 0, 1)

y1.data <- rep(0, 1000)
y2.data <- rep(0, 1000)
y1.data[1] <- 0
y2.data[1] <- 0

for (i in 2:1000) {
    y1.data[i] <- (1/(1 - b12 * b21)) * (a1 - a2 * b12 + (c11 - b12 * c21) * 
        y1.data[i - 1] + (c12 - b12 * c22) * y2.data[i - 1] - b12 * y2.innov[i] + 
        y1.innov[i])

    y2.data[i] <- a2 - b21 * y1.data[i] + c21 * y1.data[i - 1] + c22 * y2.data[i - 
        1] + y2.innov[i]
}
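
As a quick check that the simulated system is stationary, we can look at the true reduced form coefficient matrix \( \Gamma = B^{-1}C \) implied by the parameters above; its eigenvalues should lie inside the unit circle:

B.true <- matrix(c(1, b12, b21, 1), 2, 2, byrow = TRUE)
C.true <- matrix(c(c11, c12, c21, c22), 2, 2, byrow = TRUE)
Mod(eigen(solve(B.true) %*% C.true, only.values = TRUE)$values)
# roughly 0.93 and 0.53, both inside the unit circle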

Let's plot the last 100 observations of the 2 series:

library(ggplot2)
library(reshape2)

x <- 901:1000
df <- data.frame(x, y1 = y1.data[901:1000], y2 = y2.data[901:1000])

df.melted <- melt(df, id.vars = "x")

g <- ggplot(data = df.melted, aes(x = x, y = value, color = variable)) + geom_line()

g

(Plot: the last 100 observations of y1 and y2.)

OK, now we have the data \( y_1 \) and \( y_2 \), and they seem correlated with each other. Let's try out the estimation procedure and see whether we can recover the structural form.

Estimation of VAR by Brute Force

As mentioned above, the estimation procedure for the VAR is:

  1. Estimate the reduced form VAR model first.
  2. Find the covariance matrix of the residuals of the estimated equations in 1.
  3. Apply the Choleski decomposition to the covariance matrix to obtain the matrix \( L \), which is the estimator of \( B^{-1} \).
  4. Pre-multiply the estimated reduced form VAR model by \( L^{-1} \) to get the structural form back.

Step 1 - Estimate the reduced form VAR model

So, let's do step 1:

n <- 1000

t <- 2:n

df <- data.frame(y1 = y1.data, y2 = y2.data)
# reduced form
y1.r.fit <- lm(y1[t] ~ y1[t - 1] + y2[t - 1], df)
y2.r.fit <- lm(y2[t] ~ y1[t - 1] + y2[t - 1], df)

Let's check the regression results:

y1.r.fit
## 
## Call:
## lm(formula = y1[t] ~ y1[t - 1] + y2[t - 1], data = df)
## 
## Coefficients:
## (Intercept)    y1[t - 1]    y2[t - 1]  
##       1.592        0.505        0.777
y2.r.fit
## 
## Call:
## lm(formula = y2[t] ~ y1[t - 1] + y2[t - 1], data = df)
## 
## Coefficients:
## (Intercept)    y1[t - 1]    y2[t - 1]  
##      -1.575        0.546       -0.110

And the matrices in reduced form are:

gamma1 <- y1.r.fit$coef[1]
gamma2 <- y2.r.fit$coef[1]

gamma.hat <- matrix(c(gamma1, gamma2), ncol = 1, nrow = 2, byrow = F)

gamma11 <- y1.r.fit$coef[2]
gamma12 <- y1.r.fit$coef[3]
gamma21 <- y2.r.fit$coef[2]
gamma22 <- y2.r.fit$coef[3]

Gamma.hat <- matrix(c(gamma11, gamma12, gamma21, gamma22), ncol = 2, byrow = T)

Now, let's check the stationarity of the series. For a VAR(1) to be stationary, the eigenvalues of the matrix \( \Gamma \) must lie inside the unit circle:

eigen(Gamma.hat, only.values = T)$values
## [1]  0.9179 -0.5228

Step 2 - get the covariance matrix \( \hat{\Sigma} \) of the estimated residuals:

res.mat <- cbind(y1.resid = y1.r.fit$residuals, y2.resid = y2.r.fit$residuals)
Sigma.hat <- var(res.mat)
colnames(Sigma.hat) <- NULL
rownames(Sigma.hat) <- NULL
Sigma.hat
##         [,1]    [,2]
## [1,]  1.0003 -0.5039
## [2,] -0.5039  1.2332
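
As a sanity check, this should be close to the theoretical \( \Sigma = B^{-1}DB^{-1'} \) implied by the true parameters (with unit-variance structural shocks, it works out to \( \left[ \begin{array}{cc}1 & -0.5 \\ -0.5 & 1.25 \end{array} \right] \)):

B.true <- matrix(c(1, b12, b21, 1), 2, 2, byrow = TRUE)
solve(B.true) %*% diag(2) %*% t(solve(B.true))   # theoretical Sigma, close to Sigma.hat above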

Step 3 - Do the Choleski decomposition to obtain the matrix \( L \), which is the estimator of \( B^{-1} \).

ch <- chol(Sigma.hat)   # upper triangular R with R'R = Sigma.hat
dd <- diag(ch)
L <- t(ch/dd)           # unit lower triangular factor, estimates B^{-1}
D <- diag(dd^2)         # diagonal matrix of structural shock variances

Linv <- solve(L)        # estimates B
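
Two quick checks on this factorisation: \( LDL' \) should reproduce \( \hat{\Sigma} \), and \( L^{-1} \) (the estimate of \( B \)) should have roughly 0.5 in position [2, 1], matching the true \( b_{21} \):

L %*% D %*% t(L)   # reproduces Sigma.hat
round(Linv, 4)     # estimate of B; the [2, 1] element should be close to b21 = 0.5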

Step 4 - Multiply \( L^{-1} \) to get back the structural form

a.hat = Linv %*% gamma.hat
C.hat = Linv %*% Gamma.hat

Now, let's compare the true values with the estimated values:

\( B=\left[ \begin{array}{cc}1 & 0 \\ 0.5 & 1 \end{array} \right] \), \( \hat{B} = \left[ \begin{array}{cc}1 & 0 \\ 0.5037 & 1 \end{array} \right] \)

\( a=\left[ \begin{array}{c}1.6 \\ -0.8 \end{array} \right] \), \( \hat{a}=\left[ \begin{array}{c}1.5921 \\ -0.7733 \end{array} \right] \),

\( C=\left[ \begin{array}{cc}0.5 & 0.8 \\ 0.8 & 0.3 \end{array} \right] \), \( \hat{C}=\left[ \begin{array}{cc}0.5047 & 0.7768 \\ 0.8007 & 0.2817 \end{array} \right] \)

Check the covariance matrix of \( \hat{\epsilon} \) in the structural form; there should not be any covariance between the 2 shocks.

# only 999 residuals are available, since the regressions start at t = 2
epsilon.hat <- list()
for (i in 1:999) {
    epsilon.hat[[i]] <- Linv %*% as.matrix(c(y1.r.fit$residuals[i], y2.r.fit$residuals[i]))
}

e.mat <- matrix(0, nrow = 999, ncol = 2)
for (i in 1:999) {
    e.mat[i, 1] <- epsilon.hat[[i]][1]
    e.mat[i, 2] <- epsilon.hat[[i]][2]
}
cov(e.mat)
##           [,1]       [,2]
## [1,]  1.00e+00 -5.540e-17
## [2,] -5.54e-17  9.794e-01

Everything seems pretty good. Now, let's talk about how to make analysis by using the VAR model.

Moving Average (MA) or Wold Representation

When discussing stationarity, I mentioned that the reduced form VAR model can be rewritten in its Wold representation. Now consider a VAR(1) model:

\[ y_t = \gamma+ \Gamma y_{t-1} + e_t \] \[ y_t - \Gamma y_{t-1} = \gamma + e_t \] \[ (I-\Gamma L) y_t = \gamma + e_t \] \[ y_t = (I-\Gamma L)^{-1} \gamma + (I-\Gamma L)^{-1} e_t \] \[ y_t = \phi + \Phi(L)e_t \]

where: \[ \Phi(L) = (I-\Gamma L)^{-1} = \sum_{i=0}^{\infty} \Gamma^i L^i \]

So, \[ y_t = \phi + e_t + \Gamma^1 e_{t-1} + \Gamma^2 e_{t-2} + \cdots \]

Since the elements of the vector \( e_{t} \) are correlated with each other, the above equation does not give much insight. But recall that \( B^{-1}\epsilon_t = e_t \), so it can be further written as:

\[ y_t = \phi + B^{-1}\epsilon_t + \Gamma^1 B^{-1}\epsilon_{t-1} + \Gamma^2 B^{-1}\epsilon_{t-2} + \cdots \] \[ y_t = \phi + \Upsilon(L)\epsilon_t \]

where

\[ \Upsilon(L) = \sum_{k=0}^{\infty} \Upsilon_k L^k \] \[ \Upsilon(L) = B^{-1} + \Gamma^1 B^{-1} L + \Gamma^2 B^{-1} L^2 + \cdots \]

Since the \( \epsilon_{t-i} \) are the structural innovations, the above equation is called the structural moving average (SMA) form.

Let's have a look at the SMA representation for the bivariate system:

\[ \left[\begin{array}{c} y_{1,t} \\ y_{2,t} \end{array}\right] = \left[\begin{array}{c} \phi_1 \\ \phi_2 \end{array}\right] + \left[\begin{array}{cc} \upsilon_{11}^{(0)} & \upsilon_{12}^{(0)}\\ \upsilon_{21}^{(0)} & \upsilon_{22}^{(0)} \end{array}\right] \left[\begin{array}{c} \epsilon_{1,t} \\ \epsilon_{2,t} \end{array}\right] + \left[\begin{array}{cc} \upsilon_{11}^{(1)} & \upsilon_{12}^{(1)}\\ \upsilon_{21}^{(1)} & \upsilon_{22}^{(1)} \end{array}\right] \left[\begin{array}{c} \epsilon_{1,t-1} \\ \epsilon_{2,t-1} \end{array}\right] + \cdots \]

The coefficient \( \upsilon_{ij}^{(k)} \) represents the dynamic multiplier or impulse response of \( y_{i,t+k} \) to a change in \( \epsilon_{j,t} \).

The Impulse Response Function (IRF)

Consider the SMA representation of the equation at time \( t+s \):

\[ \left[\begin{array}{c} y_{1,t+s} \\ y_{2,t+s} \end{array}\right] = \left[\begin{array}{c} \phi_1 \\ \phi_2 \end{array}\right] + \left[\begin{array}{cc} \upsilon_{11}^{(0)} & \upsilon_{12}^{(0)}\\ \upsilon_{21}^{(0)} & \upsilon_{22}^{(0)} \end{array}\right] \left[\begin{array}{c} \epsilon_{1,t+s} \\ \epsilon_{2,t+s} \end{array}\right] + \cdots + \left[\begin{array}{cc} \upsilon_{11}^{(s)} & \upsilon_{12}^{(s)}\\ \upsilon_{21}^{(s)} & \upsilon_{22}^{(s)} \end{array}\right] \left[\begin{array}{c} \epsilon_{1,t} \\ \epsilon_{2,t} \end{array}\right] \]

If we differentiate the above equation with respect to the individual structural innovations, we obtain the structural dynamic multipliers:

\[ \begin{array}{cc} \frac{\partial y_{1,t+s}}{\partial \epsilon_{1,t}} = \upsilon_{11}^{(s)} & \frac{\partial y_{1,t+s}}{\partial \epsilon_{2,t}} = \upsilon_{12}^{(s)} \\ \frac{\partial y_{2,t+s}}{\partial \epsilon_{1,t}} = \upsilon_{21}^{(s)} & \frac{\partial y_{2,t+s}}{\partial \epsilon_{2,t}} = \upsilon_{22}^{(s)} \end{array} \]

And if the 2 series \( y_t \) are covariance stationary:

\[ \lim_{s\rightarrow \infty} \upsilon_{ij}^{(s)} = 0, \quad \text{for } i, j \in \{1,2\} \]

Thus, the long-run cumulative impact of the structural innovations is captured by the long-run impact matrix, obtained by evaluating the lag polynomial at \( L = 1 \):

\[ \Upsilon(1) = \left[\begin{array}{cc} \sum_{s=0}^{\infty}\upsilon_{11}^{(s)} & \sum_{s=0}^{\infty}\upsilon_{12}^{(s)}\\ \sum_{s=0}^{\infty}\upsilon_{21}^{(s)} & \sum_{s=0}^{\infty}\upsilon_{22}^{(s)} \end{array}\right] \]
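
Under stationarity the powers of \( \Gamma \) form a convergent geometric series, so the long-run impact matrix also has a simple closed form (a short derivation added here for reference):

\[ \Upsilon(1) = \sum_{s=0}^{\infty} \Gamma^s B^{-1} = (I - \Gamma)^{-1} B^{-1} \]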

Now, let's calculate and plot the 4 cumulative IRFs in R up to 100 lags.

nLags <- 100
Upsilon.list <- list()

Gamma.power <- diag(2)  # Gamma.hat^0
for (i in 1:nLags) {
    # Upsilon_{i-1} = Gamma^{i-1} %*% B^{-1}, with L as the estimate of B^{-1}
    Upsilon.list[[i]] <- Gamma.power %*% L
    Gamma.power <- Gamma.power %*% Gamma.hat
}

# running (cumulative) sums of the impulse response matrices
Cumsum.Upsilon.list <- Upsilon.list
for (i in 2:nLags) {
    Cumsum.Upsilon.list[[i]] <- Cumsum.Upsilon.list[[i - 1]] + Upsilon.list[[i]]
}

upsilon11 <- array()
upsilon12 <- array()
upsilon21 <- array()
upsilon22 <- array()

for (i in 1:nLags) {
    upsilon11[i] <- Cumsum.Upsilon.list[[i]][1, 1]   # cumulative response of y1 to eps1
    upsilon12[i] <- Cumsum.Upsilon.list[[i]][1, 2]   # cumulative response of y1 to eps2
    upsilon21[i] <- Cumsum.Upsilon.list[[i]][2, 1]   # cumulative response of y2 to eps1
    upsilon22[i] <- Cumsum.Upsilon.list[[i]][2, 2]   # cumulative response of y2 to eps2
}

par(mfrow = c(2, 2))
plot(upsilon11, type = "l", xlab = "Time", ylab = expression(paste("Shock to ", 
    epsilon[1])), main = "Response of Y1")
plot(upsilon12, type = "l", xlab = "Time", ylab = expression(paste("Shock to ", 
    epsilon[2])), main = "Response of Y1")
plot(upsilon21, type = "l", xlab = "Time", ylab = expression(paste("Shock to ", 
    epsilon[1])), main = "Response of Y2")
plot(upsilon22, type = "l", xlab = "Time", ylab = expression(paste("Shock to ", 
    epsilon[2])), main = "Response of Y2")

(Plot: 2x2 panel of cumulative impulse responses of y1 and y2 to the structural shocks.)

Impulse responses tell you the response of the current and future values of the variables to a unit increase in the current value of the VAR structural innovations.
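
Since the estimated system is stationary, the cumulative responses should level off at the long-run impact matrix \( (I - \hat{\Gamma})^{-1}\hat{B}^{-1} \) derived earlier; a quick check of this, using the objects computed above:

solve(diag(2) - Gamma.hat) %*% L   # closed-form long-run impact matrix
Cumsum.Upsilon.list[[nLags]]       # cumulative IRFs at lag 100, should be close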

Forecast Error Variance Decompositions

Forecast error variance decompositions help us determine the proportion of the forecast error variance of \( y_1 \) and \( y_2 \) at time \( t+s \) that is due to each of the structural shocks \( \epsilon_1 \) and \( \epsilon_2 \), based on the information available at time \( t \).

Again, the Wold representation of \( y_{t+s} \) is:

\[ y_t = \phi + B^{-1}\epsilon_t + \Gamma^1 B^{-1}\epsilon_{t-1} + \Gamma^2 B^{-1}\epsilon_{t-2} + \cdots \] \[ y_{t+s} = \phi + \Upsilon_0\epsilon_{t+s} + \Upsilon_1\epsilon_{t+s-1} + \Upsilon_2\epsilon_{t+s-2} + \cdots +\Upsilon_s\epsilon_{t} + \Upsilon_{s+1}\epsilon_{t-1} + \cdots \]

Based on the information available at time \( t \), the best linear forecast of \( y_{t+s} \) is:

\[ \hat{y_{t+s}} = \phi + \Upsilon_s\epsilon_{t} + \Upsilon_{s+1}\epsilon_{t-1} + \Upsilon_{s+2}\epsilon_{t-2} + \cdots \]

The forecast error is:

\[ y_{t+s} -\hat{y_{t+s}} = \Upsilon_0\epsilon_{t+s} + \Upsilon_1\epsilon_{t+s-1} + \Upsilon_2\epsilon_{t+s-2} + \cdots +\Upsilon_{s-1}\epsilon_{t+1} \]

Thus, the forecast error, written equation by equation, is:

\[ \left[\begin{array}{c} y_{1,t+s} - \hat{y_{1,t+s}} \\ y_{2,t+s} - \hat{y_{2,t+s}} \end{array}\right] = \left[\begin{array}{cc} \upsilon_{11}^{(0)} & \upsilon_{12}^{(0)}\\ \upsilon_{21}^{(0)} & \upsilon_{22}^{(0)} \end{array}\right] \left[\begin{array}{c} \epsilon_{1,t+s} \\ \epsilon_{2,t+s} \end{array}\right] + \cdots + \left[\begin{array}{cc} \upsilon_{11}^{(s-1)} & \upsilon_{12}^{(s-1)}\\ \upsilon_{21}^{(s-1)} & \upsilon_{22}^{(s-1)} \end{array}\right] \left[\begin{array}{c} \epsilon_{1,t+1} \\ \epsilon_{2,t+1} \end{array}\right] \]

For the first equation: \[ y_{1,t+s} - \hat{y_{1,t+s}} = \begin{array}{l} \upsilon_{11}^{(0)}\epsilon_{1,t+s} + \cdots + \upsilon_{11}^{(s-1)}\epsilon_{1,t+1} \\ + \upsilon_{12}^{(0)}\epsilon_{2,t+s} + \cdots + \upsilon_{12}^{(s-1)}\epsilon_{2,t+1} \end{array} \]

\[ \sigma(s)_{1}^{2} = Var(y_{1,t+s} - \hat{y_{1,t+s}}) = \sigma_{\epsilon_1}^2( (\upsilon_{11}^{(0)})^2 + (\upsilon_{11}^{(1)})^2 + \dots + (\upsilon_{11}^{(s-1)})^2) + \sigma_{\epsilon_2}^2( (\upsilon_{12}^{(0)})^2 + (\upsilon_{12}^{(1)})^2 + \dots + (\upsilon_{12}^{(s-1)})^2) \]

Thus, the proportion of the total forecast error variance of \( y_1 \), \( s \) periods ahead, that is due to the shock \( \epsilon_1 \) is:

\[ \frac{\sigma_{\epsilon_1}^2( (\upsilon_{11}^{(0)})^2 + (\upsilon_{11}^{(1)})^2 + \dots + (\upsilon_{11}^{(s-1)})^2)}{\sigma(s)_{1}^2} \]

The same idea applies to the series \( y_2 \) and to the shock \( \epsilon_2 \) as well.
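
To make this concrete, here is a rough sketch of the decomposition for \( y_1 \), reusing the Upsilon.list and D computed above (fevd.y1 is just an illustrative helper name, not from any package):

# share of the s-step-ahead forecast error variance of y1 due to each shock;
# Upsilon.list[[k]] holds the impulse response matrix at lag k - 1 and
# D holds the estimated structural shock variances
fevd.y1 <- function(s) {
    v1 <- D[1, 1] * sum(sapply(1:s, function(k) Upsilon.list[[k]][1, 1]^2))
    v2 <- D[2, 2] * sum(sapply(1:s, function(k) Upsilon.list[[k]][1, 2]^2))
    c(eps1.share = v1, eps2.share = v2) / (v1 + v2)
}
fevd.y1(1)    # at horizon 1, all of y1's forecast error variance comes from eps1
fevd.y1(10)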

My next post will talk about the VAR package available in R, stay tuned.