Vector Error Correction Model (VEC) Model

Econ 5316 time Series Econometrics(Spring 2017) / Homework7 Problem1
Keunyoung (Kay) Kim

1. Introduction

This assignment is to estimate the VEC model, to forecast the price of the gas using VEC model, and compare the predictability of VEC model with those of forecasts using AR/MA/ARMA model.

2.Data

We construct the VEC model by using two monthly time series.

Crude Oil Prices: West Texas Intermediate (WTI), FRED/MCOILWTICO
US Regular Conventional Gas Price, FRED/GASREGCOVM.

3.Methodology

We assume that crude oil prices and gas prices are related to each other through a long-run equilibrium. Based on this assumption, we conduct cointegration test and estimate the VEC model. Finally, we forecast future gas price using one-month ahead rolling scheme. \[ \]

(a) Create a single time series plot with two log prices \(log p_{t}^{GAS}\) and \(log p_{t}^{OIL}\) for the sample 1995M1-2017M4.

The red line is the gas price and the blue line is the oil (WTI) price. The oil price is higher than the gas prices, but the movement of them is very similar.

(b) Perform unit root tests to determine whether \(log p_{t}^{GAS}\) and \(log p_{t}^{OIL}\) are I(0) or I(1).

The results below show that we can reject the null of no unit root. There we need to test for first difference data. \[ \]

## 
## ############################################### 
## # Elliot, Rothenberg and Stock Unit Root Test # 
## ############################################### 
## 
## Test of type P-test 
## detrending of series with intercept and trend 
## 
## Value of test-statistic is: 9.6438 
## 
## Critical values of P-test are:
##                 1pct 5pct 10pct
## critical values 3.96 5.62  6.89

## 
## ############################################### 
## # Elliot, Rothenberg and Stock Unit Root Test # 
## ############################################### 
## 
## Test of type P-test 
## detrending of series with intercept and trend 
## 
## Value of test-statistic is: 8.0956 
## 
## Critical values of P-test are:
##                 1pct 5pct 10pct
## critical values 3.96 5.62  6.89

\[ \]

The results for the first difference time series show that we cannot reject null of no unit root. Based on this result, we can say that both time series are I(1).

## 
## ############################################### 
## # Elliot, Rothenberg and Stock Unit Root Test # 
## ############################################### 
## 
## Test of type P-test 
## detrending of series with intercept and trend 
## 
## Value of test-statistic is: 0.8329 
## 
## Critical values of P-test are:
##                 1pct 5pct 10pct
## critical values 3.96 5.62  6.89

## 
## ############################################### 
## # Elliot, Rothenberg and Stock Unit Root Test # 
## ############################################### 
## 
## Test of type P-test 
## detrending of series with intercept and trend 
## 
## Value of test-statistic is: 0.4831 
## 
## Critical values of P-test are:
##                 1pct 5pct 10pct
## critical values 3.96 5.62  6.89

(c) Determine the number of lags to include in cointegration analysis using Schwarz information criterion. Run the Johansen’s trace and maximum eigenvalue cointegration tests for ( \(log p_{t}^{GAS}\) and \(log p_{t}^{OIL}\)) using the sample 1995M1-2010M12. Use time series plots from (a) as a guide to determine the specification of the deterministic components in the cointegration test (i.e. whether to use Case 2, Case 3, of Case 4 cointegration test).

Based on the result of VARselect, we can determine the number of lags to be included in cointegration test and in VEC model as three. The results of Johansen’s Trace test show that we can reject the null hypothesis. Therefore we can conclude \(y_t\) (oil price and gas price) are cointegrated.

The results of maximum eigenvalue test also show that we can reject the null hypothesis. We can get same conclusion with above.

When we see the time series plots from (a), it seems to be Case 4(restricted trend), because it shows the similar trend and the sizable gap between each time series. But, the trend is not pronounced.

## SC(n) 
##     3

## 
## ###################### 
## # Johansen-Procedure # 
## ###################### 
## 
## Test type: trace statistic , without linear trend and constant in cointegration 
## 
## Eigenvalues (lambda):
## [1] 1.624149e-01 1.223479e-02 5.284233e-18
## 
## Values of teststatistic and critical values of test:
## 
##           test 10pct  5pct  1pct
## r <= 1 |  2.33  7.52  9.24 12.97
## r = 0  | 35.82 17.85 19.96 24.60
## 
## Eigenvectors, normalised to first column:
## (These are the cointegration relations)
## 
##            log.wti.l1  log.gas.l1   constant
## log.wti.l1   1.000000  1.00000000  1.0000000
## log.gas.l1  -1.584166 -0.06721038  0.5915855
## constant    -2.754478 -3.99184158 -2.8496192
## 
## Weights W:
## (This is the loading matrix)
## 
##           log.wti.l1   log.gas.l1      constant
## log.wti.d  0.0183103 -0.012570961 -1.291780e-17
## log.gas.d  0.2109392 -0.005473235 -3.689353e-16

## 
## ###################### 
## # Johansen-Procedure # 
## ###################### 
## 
## Test type: maximal eigenvalue statistic (lambda max) , without linear trend and constant in cointegration 
## 
## Eigenvalues (lambda):
## [1] 1.624149e-01 1.223479e-02 5.284233e-18
## 
## Values of teststatistic and critical values of test:
## 
##           test 10pct  5pct  1pct
## r <= 1 |  2.33  7.52  9.24 12.97
## r = 0  | 33.50 13.75 15.67 20.20
## 
## Eigenvectors, normalised to first column:
## (These are the cointegration relations)
## 
##            log.wti.l1  log.gas.l1   constant
## log.wti.l1   1.000000  1.00000000  1.0000000
## log.gas.l1  -1.584166 -0.06721038  0.5915855
## constant    -2.754478 -3.99184158 -2.8496192
## 
## Weights W:
## (This is the loading matrix)
## 
##           log.wti.l1   log.gas.l1      constant
## log.wti.d  0.0183103 -0.012570961 -1.291780e-17
## log.gas.d  0.2109392 -0.005473235 -3.689353e-16

\[ \]

On the other hand, the result of lttest which tests for the presence of the constant term in the cointegration relationship suggests case 2(restricted constant)

## LR-test for no linear trend
## 
## H0: H*2(r<=1)
## H1: H2(r<=1)
## 
## Test statistic is distributed as chi-square
## with 1 degress of freedom
##         test statistic p-value
## LR test           0.91    0.34

\[ \]

(d) Based on your results in (c), use the 1995M1-2010M12 sample to estimate a bivariate VEC model for \(log p_{t}^{GAS}\) and \(log p_{t}^{OIL}\).

## $rlm
## 
## Call:
## lm(formula = substitute(form1), data = data.mat)
## 
## Coefficients:
##              log.wti.d  log.gas.d
## ect1          0.01831    0.21094 
## log.wti.dl1   0.27532    0.13558 
## log.gas.dl1  -0.13892    0.35368 
## log.wti.dl2   0.17025    0.01702 
## log.gas.dl2  -0.05737   -0.14318 
## 
## 
## $beta
##                 ect1
## log.wti.l1  1.000000
## log.gas.l1 -1.584166
## constant   -2.754478

The cointegrating vector is estimated \(\beta\) = (1.000000,-1.584166, -2.754478 ). We have included a constant in the cointegrating relationship.

Adjustment parameters are estimated as \(\alpha\) = (0.01831, 0.21094). \[ \]

(e) Are the adjustment parameters \(\alpha_1\) and \(\alpha_2\) in the estimated VEC model statistically significant? Are their signs consistent with error correction mechanism that moves the system back to the long run equilibrium, whenever there is a disruption and \(z_{t-1}\) is not zero?

\(\alpha_1\) is not significant but \(\alpha_2\) is statistically significant at 0.001 level. Adjustment parameters \(\alpha_1\) and \(\alpha_2\) determine the speed of return to long run equilibrium. For long run relationship to be stable, we need \(\alpha_1\) < 0 or =0, \(\alpha_2\) > 0 or =0 and at least one of them can not be equal 0. In this result, \(\alpha_1\) >0 and \(\alpha_2\) >0, so it does not satisfy the condition for long run stable relationship.

## Response log.wti.d :
## 
## Call:
## lm(formula = log.wti.d ~ ect1 + log.wti.dl1 + log.gas.dl1 + log.wti.dl2 + 
##     log.gas.dl2 - 1, data = data.mat)
## 
## Residuals:
##      Min       1Q   Median       3Q      Max 
## -0.26410 -0.04813  0.01579  0.06504  0.23386 
## 
## Coefficients:
##             Estimate Std. Error t value Pr(>|t|)  
## ect1         0.01831    0.07813   0.234   0.8150  
## log.wti.dl1  0.27532    0.10870   2.533   0.0121 *
## log.gas.dl1 -0.13892    0.15197  -0.914   0.3619  
## log.wti.dl2  0.17025    0.10848   1.569   0.1183  
## log.gas.dl2 -0.05737    0.14514  -0.395   0.6931  
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 0.08399 on 184 degrees of freedom
## Multiple R-squared:  0.07845,    Adjusted R-squared:  0.0534 
## F-statistic: 3.133 on 5 and 184 DF,  p-value: 0.009728
## 
## 
## Response log.gas.d :
## 
## Call:
## lm(formula = log.gas.d ~ ect1 + log.wti.dl1 + log.gas.dl1 + log.wti.dl2 + 
##     log.gas.dl2 - 1, data = data.mat)
## 
## Residuals:
##       Min        1Q    Median        3Q       Max 
## -0.210243 -0.024651  0.003275  0.032126  0.129029 
## 
## Coefficients:
##             Estimate Std. Error t value Pr(>|t|)    
## ect1         0.21094    0.04902   4.304 2.73e-05 ***
## log.wti.dl1  0.13558    0.06819   1.988 0.048261 *  
## log.gas.dl1  0.35368    0.09534   3.710 0.000275 ***
## log.wti.dl2  0.01702    0.06806   0.250 0.802782    
## log.gas.dl2 -0.14318    0.09105  -1.572 0.117572    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 0.05269 on 184 degrees of freedom
## Multiple R-squared:  0.3709, Adjusted R-squared:  0.3538 
## F-statistic: 21.69 on 5 and 184 DF,  p-value: < 2.2e-16

\[ \]

(f) Reestimate the VEC model with a restriction \(\alpha_2 = 0\).

## 
## ###################### 
## # Johansen-Procedure # 
## ###################### 
## 
## Estimation and testing under linear restrictions on beta 
## 
## The VECM has been estimated subject to: 
## beta=H*phi and/or alpha=A*psi
## 
##      [,1]
## [1,]    1
## [2,]    0
## 
## Eigenvalues of restricted VAR (lambda):
## [1] 0.084 0.000 0.000
## 
## The value of the likelihood ratio test statistic:
## 16.91 distributed as chi square with 1 df.
## The p-value of the test statistic is: 0 
## 
## Eigenvectors, normalised to first column
## of the restricted VAR:
## 
##                  [,1]
## RK.log.wti.l1  1.0000
## RK.log.gas.l1 -1.5423
## RK.constant   -2.7886
## 
## Weights W of the restricted VAR:
## 
##         [,1]
## [1,] -0.2291
## [2,]  0.0000

\[ \]

(g) What is the intuition for imposing the restriction in (f)? Hint: Think about how under this restriction \(log p_{t}^{GAS}\) and \(log p_{t}^{OIL}\) will change over time in response to a disruption such that \(z_{t-1} > 0\)

If \(\alpha_1\) >0 and \(\alpha_2\) <0 then \(y_1\) would be growing and \(y_2\) would be declining, taking the system even further away from the long run equilibrium. So we need to restrict one of them to be zero. if \(\alpha_2\) =0 and \(\alpha_1\) is not zero, then \(y_{2,t}\), gas price is a pure random walk and all the adjustment occurs in \(y_{1,t}\), oil price.
\[ \]

(h) Create and plot sequence of one month ahead forecasts for the period 2011M1-2017M4. Obtain the RMSE for the forecast of the gas price, \(p_{t}^{GAS}\)

The plot below shows the forecast using multistep scheme.

The plot below shows the forecast using one-month ahead rolling scheme. RMSE is 0.04930924.

(i) Create a correlogram for \(\Delta log p_{t}^{GAS}\). Use it to identify a suitable AR or MA or ARMA model. Estimate

Based on the ACF and PACF, we determine a ARMA(2,6) as our model for forecasting.

(j) Create a sequence of one month ahead forecasts for \(p_{t}^{GAS}\) for the period 2011M1:2017M4 using the model from (i). Obtain the RMSE for the forecast of the gas price, \(p_{t}^{GAS}\)

The plot below shows the forecast of \(\Delta logp_{t}^{GAS}\). RMSE is 0.05357051.

(k) Compare the RMSE of the forecast for \(p_{t}^{GAS}\) from (h) based on the VEC model, and the forecast for \(p_{t}^{GAS}\) from (j) based on the AR/MA/ARMA model

The RMSE based on the VEC model is 0.04930924 and it is smaller than those based on ARMA(2,6), 0.05357051.

##                  ME       RMSE        MAE       MPE     MAPE       ACF1
## Test set -0.0141565 0.04930924 0.03547289 -1.667412 3.922441 0.05211672
##          Theil's U
## Test set 0.5858428

##                    ME       RMSE        MAE MPE MAPE
## Test set -0.009307094 0.05357051 0.04075292 Inf  Inf

\[ \]

(l) Based on your results in (k), is there any advantage to using the VEC model to forecast the price of the gas, instead of using a simple univariate AR/MA/ARMA model?

When the economic variables are related to each other through a long-run equiulibriumm, this pushes the variables toward this equilibrieum if there is a temporary deviation from it. VEC model includes the error correction term and reduce the deviation from the equilibrium. When variables are cointegrated, we can get a better result with smaller RMSE using a VEC model.

4.Conclusions

The forecast using the VEC model has lower RMSE, compared to RMSE using ARMA(2,6). The VEC model is used to estimate the economic variables related to each other through a long-run equiulibriumm. In this assignment, oil price and gas price show that they are related to each other, that is, cointegrated. Therefore, when we use the VEC model, the predictability of forecasting model is better than those of ARMA(2,6).

* R code for analysis

# remove everything from the environment
rm(list=ls())

library(Quandl)
Quandl.api_key("XzdSwkDsE98Mxj3ixQzG")

oil <- Quandl("FRED/MCOILWTICO", type="zoo")
gas <- Quandl("FRED/GASREGCOVM", type="zoo")

loil <- log(oil)
lgas <- log(gas)

# Q.a
loil <- window(loil, start=1995+0, end=2017+3/12)
lgas <- window(lgas, start=1995+0, end=2017+3/12)

par(mfrow=c(1,1))
plot(loil, type='l', main="WTI vs. Gas", col="blue", ylim=c(-1, 5))
lines(lgas, col="red")


#Q.b : ERS unit root test
library(urca)

loil.urers1 <- ur.ers(loil, type="P-test", model="trend")
summary(loil.urers1)
lgas.urers2 <- ur.ers(lgas, type="P-test", model="trend")
summary(lgas.urers2)

# reject Null of no unit root -> exist unit root -> difference
doil <- diff(loil)
dgas <- diff(lgas)

doil.urers1 <- ur.ers(doil, type="P-test", model="trend")
summary(doil.urers1)
dgas.urers2 <- ur.ers(dgas, type="P-test", model="trend")
summary(dgas.urers2)

par(mfrow=c(1,1))
plot(doil, type='l', main="(log difference)WTI vs. (log difference)Gas", col="blue", ylim=c(-0.4, 0.4))
lines(dgas, col="red")

#Q.c
library(vars)
y <- cbind(loil, lgas)
colnames(y) <- c("log.wti","log.gas")
y <- na.trim(y)

# determine number of lags to be included in cointegration test and in VEC model
y.VAR.IC <- VARselect(y, type="const")
nlags <- y.VAR.IC$selection["SC(n)"]
nlags

y <- window(y, start=1995+0, end=2010+11/12)

# perform cointegration test
library(urca)
y.CA <- ca.jo(y, ecdet="const", type="trace", K=nlags, spec="transitory")
summary(y.CA)
y.CA <- ca.jo(y, ecdet="const", type="eigen", K=nlags, spec="transitory")
summary(y.CA)

lttest(y.CA, r=1)

# Q.d
# estimate unrestriced VEC model
y.VEC <- cajorls(y.CA, r=1)
y.VEC

#Q.E
# to see t-statistics and p-values
summary(y.VEC$rlm)


#Q.F

# test for restricted adjustment parameters alpha
rest.alpha <- matrix(c(1,0), c(2,1))
y.CA.ralpha <- alrtest(y.CA, A=rest.alpha, r=1)
summary(y.CA.ralpha)

# Q.G

# vec2varX supports restricted VEC models
source("vec2varX.r")
y.VAR <- vec2varX(y.CA, r=1)
y.VAR.rest <- vec2varX(y.CA, A=rest.alpha, r=1)

# forecast using VAR in levels
y.VAR.fcst <- predict(y.VAR.rest, n.ahead=76, ci=0.95)
par( mar=c(4,4,2,1), cex=0.75)
plot(y.VAR.fcst)


#Q.I
first.m <- 1995+0
last.m <-2010+11/12
dgas.p1 <- window(dgas, end=last.m)
dgas.p2 <- window(dgas, start=last.m+1/12)

library(forecast)
par(mfrow=c(1,2))
Acf(dgas.p1, type="correlation", lag=48, main="ACF for Gas")
Acf(dgas.p1, type="partial", lag=48, main="PACF for Gas")

arma26 <- Arima(dgas.p1, order=c(2, 0, 6))
tsdiag(arma26, gof.lag=36)

#Q.J
rol.f <-zoo()
for(i in 1: length(dgas.p2))
{
  y <- window(dgas, end = last.m+(i-1)/12)
  rol.updt <- arima(y, order=c(2,0,6))
  rol.f <- c(rol.f, forecast(rol.updt, 1)$mean)
}  
rol.f <-as.ts(rol.f)
par(mfrow=c(1,1))
plot(rol.f, type="o", pch=16, xlim=c(1995, 2017), ylim=c(-0.4, 0.2), main="One month ahead Forecasts for log-diff(Gas)")
lines(rol.f, type="p", pch=16, lty="solid", col="red")
lines(dgas, type="l", pch=16, lty="dashed")
lines(dgas.p1, type="o", pch=16, lty="solid")


#Q.H 
# one month ahead forecast

yall <- cbind(loil, lgas)
colnames(yall) <- c("log.wti","log.gas")
yall <- na.trim(yall)

yall <- window(yall, start=1995+0, end=2017+3/12)
first.m <- 1995+0
last.m <-2010+11/12

y.p1 <- window(yall, end=last.m)
y.p2 <- window(yall, start=last.m+ 1/12)


y.VAR.f1 <-data.frame()
y.VAR.f2 <-data.frame()

for(i in 1:length(y.p2[,2]))
{
  y <- window(yall, end = last.m + (i-1)/12)
  y.CA <- ca.jo(y, ecdet="const", type="eigen", K=3, spec="transitory")
  y.VAR <- vec2var(y.CA, r=1)
  y.VAR.updt <- predict(y.VAR, n.ahead=1)
  y.VAR.f1 <-rbind(y.VAR.f1, as.ts(y.VAR.updt$fcst$log.wti))
  y.VAR.f2 <-rbind(y.VAR.f2, as.ts(y.VAR.updt$fcst$log.gas))
  
} 

#y.VAR.f2 <- as.ts(y.VAR.f2) 
#fix that by specifying the time index for the forecast
#change y.VAR.f2 into monthly series that starts in 2011m1

y.VAR.f1 <- ts(y.VAR.f1, start=2011, frequency=12)
y.VAR.f2 <- ts(y.VAR.f2, start=2011, frequency=12)


y.gas <-as.ts(yall[,2])
y.gas.p1 <-as.ts(y.p1[,2])
y.gas.p2 <-as.ts(y.p2[,2])

par(mfrow=c(1,1))
plot(y.gas, type="l", pch=16, lty="dashed", main="One month ahead Forecasts for Gas")
lines(y.gas.p1, type="o", pch=16, lty="solid")
lines(y.VAR.f2[,1], type="o", pch=16, lty="solid", col="red")


#Q.k
accuracy(y.gas.p2, y.VAR.f2[,1])
accuracy(rol.f, dgas.p2)