Vector Autoregression (VAR) Model

Econ 5316 time Series Econometrics(Spring 2017) / Homework6 Problem1 Keunyoung (Kay) Kim

1. Introduction

This Vector Autoregression (VAR) model is used to forecast GDP growth rate. Using data from 1961 Q1 to 2016 Q4, we create a forecast for the period 2017 Q1 to 2017 Q4. For this, we estimate a bivariate reduced form VAR and run the Granger causality tests for both variables.And then we estimate a restricted VAR model in which we remove lags based on Granger causality test results.

2.Data

The quarterly time series for U.S. real GDP(FRED/GDPC96), GDP deflator(FRED/GDPDEF) and quarterly closing value of S&P 500 Index( YAHOO/INDEX_GSPC/CLOSE) are imported from Quandl website. Each time series shows trend. Therefore we need to take log differences.

Using three log difference time series, we construct the following two time series. \(y_{1,t} = 400 \Delta logGDP_t\), which approximates the annualized growth rate of the U.S. real GDP \(y_{2,t} = 400( \Delta log SP500_t - \Delta log p_t^{GDP} )\), which approximates the inflation adjusted annual return of S&P 500.

Methodology

We conduct the following analysis to create a forecast using VAR model.

(a)

we estimate a bivariate reduced form VAR for \(y_t = (y_{1,t}, y_{2,t})'\) for the period 1961Q1-2016Q4, and use information criteria to select number of lags.

## $selection
## AIC(n)  HQ(n)  SC(n) FPE(n) 
##      2      2      2      2 
## 
## $criteria
##                  1           2           3           4           5
## AIC(n)    9.175989    9.091016    9.113119    9.131058    9.150883
## HQ(n)     9.213867    9.154146    9.201502    9.244693    9.289771
## SC(n)     9.269747    9.247279    9.331887    9.412331    9.494662
## FPE(n) 9662.353374 8875.343350 9073.962580 9238.681869 9424.421932
##                  6           7           8
## AIC(n)    9.173800    9.182950    9.207869
## HQ(n)     9.337940    9.372341    9.422513
## SC(n)     9.580084    9.651738    9.739163
## FPE(n) 9644.003179 9734.167942 9981.829987

From the results above, we choose 2 as a lag, p to minimize, AIC, HQ, SC, FPE. We construct multivariate order 2 VAR model, VAR(2).

The result below shows the estimated VAR (2). The correlation matrix of residuals in to bottom presents 0.1644 as the correlation of residuals. It is relatively low and implies \(y_{1}\) and \(y_{2}\) do not have much impact on each other.

For example, in decomposition for \(y_1\), fluctuations are likely to be entirely due to \(\varepsilon_{y1}\). In decomposition for \(y_2\), fluctuations are likely to be entirely due to \(\varepsilon_{y2}\).

## 
## VAR Estimation Results:
## ======================= 
## 
## Estimated coefficients for equation y1: 
## ======================================= 
## Call:
## y1 = y1.l1 + y2.l1 + y1.l2 + y2.l2 + const 
## 
##      y1.l1      y2.l1      y1.l2      y2.l2      const 
## 0.21647746 0.01613929 0.16893703 0.02402029 1.71887534 
## 
## 
## Estimated coefficients for equation y2: 
## ======================================= 
## Call:
## y2 = y1.l1 + y2.l1 + y1.l2 + y2.l2 + const 
## 
##       y1.l1       y2.l1       y1.l2       y2.l2       const 
##  0.60370307  0.10199918 -0.88416451 -0.07822811  3.84221501

## 
## VAR Estimation Results:
## ========================= 
## Endogenous variables: y1, y2 
## Deterministic variables: const 
## Sample size: 222 
## Log Likelihood: -1629.844 
## Roots of the characteristic polynomial:
## 0.3746 0.3243 0.257 0.257
## Call:
## VAR(y = y.Q, type = "const", lag.max = 9, ic = "AIC")
## 
## 
## Estimation results for equation y1: 
## =================================== 
## y1 = y1.l1 + y2.l1 + y1.l2 + y2.l2 + const 
## 
##       Estimate Std. Error t value Pr(>|t|)    
## y1.l1 0.216477   0.064678   3.347 0.000963 ***
## y2.l1 0.016139   0.006032   2.675 0.008032 ** 
## y1.l2 0.168937   0.063182   2.674 0.008070 ** 
## y2.l2 0.024020   0.006122   3.924 0.000117 ***
## const 1.718875   0.293197   5.863 1.68e-08 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## 
## Residual standard error: 2.886 on 217 degrees of freedom
## Multiple R-Squared: 0.2373,  Adjusted R-squared: 0.2232 
## F-statistic: 16.88 on 4 and 217 DF,  p-value: 4.602e-12 
## 
## 
## Estimation results for equation y2: 
## =================================== 
## y2 = y1.l1 + y2.l1 + y1.l2 + y2.l2 + const 
## 
##       Estimate Std. Error t value Pr(>|t|)
## y1.l1  0.60370    0.72774   0.830    0.408
## y2.l1  0.10200    0.06787   1.503    0.134
## y1.l2 -0.88416    0.71091  -1.244    0.215
## y2.l2 -0.07823    0.06888  -1.136    0.257
## const  3.84222    3.29900   1.165    0.245
## 
## 
## Residual standard error: 32.47 on 217 degrees of freedom
## Multiple R-Squared: 0.02376, Adjusted R-squared: 0.005766 
## F-statistic:  1.32 on 4 and 217 DF,  p-value: 0.2633 
## 
## 
## 
## Covariance matrix of residuals:
##        y1     y2
## y1  8.329   15.4
## y2 15.404 1054.4
## 
## Correlation matrix of residuals:
##        y1     y2
## y1 1.0000 0.1644
## y2 0.1644 1.0000

The following table shows the results of VAR(2) estimation

## 
## ===========================================================
##                                    Dependent variable:     
##                                ----------------------------
##                                      (1)           (2)     
## -----------------------------------------------------------
## y1.l1                             0.216***        0.604    
##                                    (0.065)       (0.728)   
##                                                            
## y2.l1                             0.016***        0.102    
##                                    (0.006)       (0.068)   
##                                                            
## y1.l2                             0.169***        -0.884   
##                                    (0.063)       (0.711)   
##                                                            
## y2.l2                             0.024***        -0.078   
##                                    (0.006)       (0.069)   
##                                                            
## const                             1.719***        3.842    
##                                    (0.293)       (3.299)   
##                                                            
## -----------------------------------------------------------
## Observations                         222           222     
## R2                                  0.237         0.024    
## Adjusted R2                         0.223         0.006    
## Residual Std. Error (df = 217)      2.886         32.472   
## F Statistic (df = 4; 217)         16.878***       1.320    
## ===========================================================
## Note:                           *p<0.1; **p<0.05; ***p<0.01

(b)

Now we run the Granger causality tests for both variables.

## $Granger
## 
##  Granger causality H0: y1 do not Granger-cause y2
## 
## data:  VAR object varp
## F-Test = 0.87077, df1 = 2, df2 = 434, p-value = 0.4194
## 
## 
## $Instant
## 
##  H0: No instantaneous causality between: y1 and y2
## 
## data:  VAR object varp
## Chi-squared = 5.8402, df = 1, p-value = 0.01566

## $Granger
## 
##  Granger causality H0: y2 do not Granger-cause y1
## 
## data:  VAR object varp
## F-Test = 12.1, df1 = 2, df2 = 434, p-value = 7.696e-06
## 
## 
## $Instant
## 
##  H0: No instantaneous causality between: y2 and y1
## 
## data:  VAR object varp
## Chi-squared = 5.8402, df = 1, p-value = 0.01566

The first result has large p-value and it indicates that we cannot reject null hypothesis: y1 do not Granger-cause y2. In other words, GDP growth rate does not Granger cause adjusted annual return. It implies past GDP growth does not information to predict today’s annual return.

The second result has sufficiently small p-value and it indicates that we can reject null hypothesis: y2 do not Granger-cause y1. In other words, adjusted annual return does not Granger cause GDP growth. It implies past annual return has information to predict today’s GDP growth.

(c)

In this analysis, we estimate a restricted VAR model in which we remove lags based on Granger causality test from (b). we eliminate lags of y1 from the equation for y2. The results below show a restricted VAR(2) estimation and a forecast based on this model

##      [,1] [,2] [,3] [,4] [,5]
## [1,]    1    1    1    1    1
## [2,]    0    1    0    1    1

## 
## VAR Estimation Results:
## ======================= 
## 
## Estimated coefficients for equation y1: 
## ======================================= 
## Call:
## y1 = y1.l1 + y2.l1 + y1.l2 + y2.l2 + const 
## 
##      y1.l1      y2.l1      y1.l2      y2.l2      const 
## 0.21647746 0.01613929 0.16893703 0.02402029 1.71887534 
## 
## 
## Estimated coefficients for equation y2: 
## ======================================= 
## Call:
## y2 = y2.l1 + y2.l2 + const 
## 
##       y2.l1       y2.l2       const 
##  0.10770248 -0.07695147  2.97268872

## 
## VAR Estimation Results:
## ========================= 
## Endogenous variables: y1, y2 
## Deterministic variables: const 
## Sample size: 222 
## Log Likelihood: -1630.756 
## Roots of the characteristic polynomial:
## 0.5333 0.3168 0.2774 0.2774
## Call:
## VAR(y = y.Q, type = "const", lag.max = 9, ic = "AIC")
## 
## 
## Estimation results for equation y1: 
## =================================== 
## y1 = y1.l1 + y2.l1 + y1.l2 + y2.l2 + const 
## 
##       Estimate Std. Error t value Pr(>|t|)    
## y1.l1 0.216477   0.064678   3.347 0.000963 ***
## y2.l1 0.016139   0.006032   2.675 0.008032 ** 
## y1.l2 0.168937   0.063182   2.674 0.008070 ** 
## y2.l2 0.024020   0.006122   3.924 0.000117 ***
## const 1.718875   0.293197   5.863 1.68e-08 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## 
## Residual standard error: 2.886 on 217 degrees of freedom
## Multiple R-Squared: 0.5877,  Adjusted R-squared: 0.5782 
## F-statistic: 61.86 on 5 and 217 DF,  p-value: < 2.2e-16 
## 
## 
## Estimation results for equation y2: 
## =================================== 
## y2 = y2.l1 + y2.l2 + const 
## 
##       Estimate Std. Error t value Pr(>|t|)
## y2.l1  0.10770    0.06736   1.599    0.111
## y2.l2 -0.07695    0.06713  -1.146    0.253
## const  2.97269    2.19560   1.354    0.177
## 
## 
## Residual standard error: 32.45 on 219 degrees of freedom
## Multiple R-Squared: 0.02454, Adjusted R-squared: 0.01118 
## F-statistic: 1.836 on 3 and 219 DF,  p-value: 0.1414 
## 
## 
## 
## Covariance matrix of residuals:
##        y1     y2
## y1  8.329   15.4
## y2 15.404 1062.9
## 
## Correlation matrix of residuals:
##        y1     y2
## y1 1.0000 0.1637
## y2 0.1637 1.0000

##    y1.l1 y2.l1 y1.l2 y2.l2 const
## y1     1     1     1     1     1
## y2     0     1     0     1     1

## [[1]]
##        y1.l1      y2.l1
## y1 0.2164775 0.01613929
## y2 0.0000000 0.10770248
## 
## [[2]]
##       y1.l2       y2.l2
## y1 0.168937  0.02402029
## y2 0.000000 -0.07695147

## $y1
##           fcst     lower    upper       CI
##  [1,] 3.200518 -2.455837 8.856874 5.656356
##  [2,] 3.053308 -2.859581 8.966197 5.912889
##  [3,] 3.027710 -3.360753 9.416173 6.388463
##  [4,] 3.019132 -3.432701 9.470964 6.451833
##  [5,] 3.005252 -3.462087 9.472592 6.467340
##  [6,] 3.001342 -3.469075 9.471760 6.470417
##  [7,] 2.999215 -3.471687 9.470117 6.470902
##  [8,] 2.998301 -3.472689 9.469290 6.470990
##  [9,] 2.997955 -3.473049 9.468958 6.471003
## [10,] 2.997804 -3.473201 9.468810 6.471005
## [11,] 2.997746 -3.473259 9.468752 6.471006
## [12,] 2.997723 -3.473282 9.468729 6.471006
## 
## $y2
##           fcst     lower    upper       CI
##  [1,] 2.217931 -61.42638 65.86224 63.64431
##  [2,] 3.339744 -60.78270 67.46219 64.12244
##  [3,] 3.022868 -61.36210 67.40784 64.38497
##  [4,] 3.017495 -61.38011 67.41510 64.39760
##  [5,] 3.059189 -61.35099 67.46936 64.41017
##  [6,] 3.063068 -61.34956 67.47569 64.41263
##  [7,] 3.070113 -61.34293 67.48316 64.41305
##  [8,] 3.072701 -61.34044 67.48584 64.41314
##  [9,] 3.073743 -61.33941 67.48690 64.41315
## [10,] 3.074246 -61.33891 67.48740 64.41316
## [11,] 3.074431 -61.33872 67.48759 64.41316
## [12,] 3.074509 -61.33865 67.48766 64.41316

(d)

Now, We plot IRFs and FEVD for the VAR model based on Choleski decomposition.

IRFs for VAR(2) model of GDP growth rate and Adjusted annual return show that \(y_1\) increases only very little in response to \(\varepsilon_{y2}\). \(y_2\) increase more in response to \(\varepsilon_{y1}\) but this increase is short lived and converges back within about 3 quarters.

##             y1        y2
## [1,] 1.0000000 0.0000000
## [2,] 0.8842881 0.1157119
## [3,] 0.8812763 0.1187237
## [4,] 0.8812737 0.1187263

##              y1        y2
## [1,] 0.02701807 0.9729819
## [2,] 0.03644858 0.9635514
## [3,] 0.03661892 0.9633811
## [4,] 0.03661909 0.9633809

In decomposition for \(y_1\), at 1 quarter horizon fluctuations are entirely due to \(\varepsilon_{y1}\) and at 40 quarters horizon more than 88% are due to \(\varepsilon_{y1}\) and less than 1.2% are due to \(\varepsilon_{y2}\). In decomposition for \(y_2\), at 1 quarter horizon fluctuations are entirely due to \(\varepsilon_{y2}\) and at 40 quarters horizon more than 96% are due to \(\varepsilon_{y2}\) and less than 3.7% are due to \(\varepsilon_{y1}\).

After we reverse the order of the variables in the VAR and plot IRFs and FEVD based on Choleski decomposition for the alternative order, relative ratio of fluctuation in FEVD look similar. However, IRFs shows different pattern.

(e)

Now, we re-estimate out VAR, with a third variable, Leading Index for the United States, FRED/USSLIND. For this, we construct additional R-code and estimate VAR model. Please refer to eco5316sp2017_HW6P1_e.Rmd for R code and eco5316sp2017_HW6P1_e.pdf for the result.

(f)

From the result of “eco5316sp2017_HW6P1_e.R” file, our forecast for real GDP growth rate in 2017 Q1 is 2.99%. This is higher than the Federal Bank of New York, 2.6 % and the Federal Bank of Atlanta, 0.5%. When it is compared to the forecasts in WSJ survey, it is higher than average and lower than maximum.

The Federal Bank of New York Nowcast : 2.6%
The GDPNow Federal Bank of Atlanta forecast : 0.5%
The forecasts in the Wall Street Journal Economic Forecasting Survey the minimum: 0.6%, the average: 1.4% the maximum: 3.4%

Before we add the third variable, our forecast for real GDP growth rate in 2017 Q1 is 3.2% and is higher than the forecast with a third variable.

4.Conclusions

VAR(2) model of GDP growth rate and inflation adjusted annual return show that past annual return has information to predict today’s GDP growth and predict the GDP growth rate in 2017 1Q as 2.99% that is relatively higher than the forecast of Federal Banks of NY and Atlanta.

* R code for analysis

# obtain data on GDP, Deflator and S&P 500 index
library(Quandl)
Quandl.api_key("XzdSwkDsE98Mxj3ixQzG")

rgdp.q <- Quandl("FRED/GDPC96", type="zoo")
def.q <- Quandl("FRED/GDPDEF", type="zoo")
sp.q <- Quandl("YAHOO/INDEX_GSPC/CLOSE", collapse="quarterly",type="zoo")

par(mfrow=c(3,3))
plot(rgdp.q, xlab="", ylab="", main="Real GDP", col="blue")
plot(def.q, xlab="", ylab="", main="GDP Deflator", col="red")
plot(sp.q, xlab="", ylab="", main="S&P 500 index", col="red")

plot(log(rgdp.q), xlab="", ylab="", main="Real GDP (log)", col="blue")
plot(log(def.q), xlab="", ylab="", main="GDP Deflator", col="red")
plot(log(sp.q), xlab="", ylab="", main="S&P 500 index", col="red")

# log change in house price index
dl.rgdp <- diff(log(rgdp.q))
dl.def <- diff(log(def.q))
dl.sp <- diff(log(sp.q))

plot(dl.rgdp, xlab="", ylab="", main="Real GDP (Diff log)", col="blue")
plot(dl.def, xlab="", ylab="", main="GDP Deflator", col="red")
plot(dl.sp, xlab="", ylab="", main="S&P 500 index", col="red")


y1 <- 400*dl.rgdp
y2 <- 400*(dl.sp - dl.def)

par(mfrow=c(1,1))
plot(y1, xlab="", ylab="", main="Growth Rate of Real GDP vs. S&P 500 return", col="blue", ylim=c(-140, 80))
legend("topright",c("GDP Growth Rate", "Adjusted Return of S&P 500"), bty = "n", col = c(4,2), lty = c(1,2))
lines(y2, col="red", lty="dashed")


# form dataset for VAR model
y.Q <- cbind(y1, y2)
y.Q <- window(y.Q, start="1961 Q1", end="2016 Q4")


# load package that allows to estimate and analyze VAR models
# install.packages("vars")
library(vars)


#Q.a
# selection criteria summary
VARselect(y.Q, lag.max=8, type="const")

# estimate a reduced form VAR(2) : Matrix A <- based on SC criteria
# var1$varresult$y1$coefficients[1]
varp <- VAR(y.Q, ic="AIC", lag.max=9, type="const")
varp
summary(varp)


# using stargazer package to report results of VAR estimation
lmp <- varp$varresult


library(stargazer)
stargazer(lmp$y1, lmp$y2, type="text", dep.var.labels.include=FALSE)


# plot residuals and their ACF and PACF
#plot(varp)
#plot(varp, names="y1")

# QQ plot for residuals
#str(varp)
#e.y1 <- varp$varresult$y1$residuals
#qqnorm(e.y1)
#qqline(e.y1)

# multivariate Jarque-Bera test
#normality.test(varp)


# Q.b
# Granger causality
causality(varp, cause="y1")
causality(varp, cause="y2")



# Q.c
# estimate restricted VAR - based on Granger causality test eliminate lags of y2 from the equation for  y1
# define a  matrix with restictions
mat.r <- matrix(1, nrow=2, ncol=5)
mat.r[2, c(1,3)] <- 0
mat.r
varp.r <- restrict(varp, method="manual", resmat=mat.r)
varp.r
summary(varp.r)
varp.r$restrictions
Acoef(varp.r)

# estimate restricted VAR - keep only variables with t-value larger than 2.0
#varp.r.ser <- restrict(varp, method="ser", thresh=2.0)
#varp.r.ser
#summary(varp.r.ser)
#varp.r.ser$restrictions
#Acoef(varp.r.ser)


# forecasting
varp.f <- predict(varp, n.ahead=12)
plot(varp.f)
fanchart(varp.f)

# Q.d

# IRFs - based on Choleski decomposition of var(e)
varp.irfs <- irf(varp, n.ahead=20)
par(mfcol=c(2,2), cex=0.6)
plot(varp.irfs, plot.type="single")

# FEVD - based on Choleski decomposition of var(e)
varp.fevd <- fevd(varp, n.ahead=40)
varp.fevd[[1]][c(1,4,8,40),]
varp.fevd[[2]][c(1,4,8,40),]
plot(varp.fevd)
plot(varp.fevd, addbars=8)


# note: ordering of variables matters for IRF and FEVD

# ordering 1: y1 before y2
y.Q.ord1 <- cbind(y1, y2)
y.Q.ord1 <- window(y.Q.ord1, start="1961 Q1", end="2016 Q4")

# ordering 2: y2 before y1
y.Q.ord2 <- cbind(y2, y1)
y.Q.ord2 <- window(y.Q.ord2, start="1961 Q1", end="2016 Q4")

# reduced form VAR(1)
var1.ord1 <- VAR(y.Q.ord1, p=2, type="const")
var1.ord2 <- VAR(y.Q.ord2, p=2, type="const")

# IRF based on Choleski decomposition of var(e)
var1.irfs.ord1 <- irf(var1.ord1, n.ahead=20)
var1.irfs.ord2 <- irf(var1.ord2, n.ahead=20)
par(mfcol=c(2,2), cex=0.6)
plot(var1.irfs.ord1, plot.type="single")
plot(var1.irfs.ord2, plot.type="single")

# FEVD based on Choleski decomposition of var(e)
var1.fevd.ord1 <- fevd(var1.ord1, n.ahead=20)
var1.fevd.ord2 <- fevd(var1.ord2, n.ahead=20)
plot(var1.fevd.ord1)
plot(var1.fevd.ord2)