Introduction

A major issue in time series analysis is non-stationary data, or data that has a unit root. Stationary means that the moments (mean, standard deviation, skew, kurtosis and serial correlation) are stable across different samples; for non-stationary data they are not. A series has a unit root when, in the equation:

\[y_t = \rho y_{t-1} + \varepsilon_t\]

\(\rho\) is equal to unity (one). Any shock (from \(\varepsilon_t\)) will be permanent and y will wander in an unconstrained manner. There is no reason why the mean (or any other descriptive statistic that we calculate for y) will be stable over different samples. For example, the following simulated random walk with drift has very different means in its first and second halves:

set.seed(124)
ry <- rnorm(100)
y <- cumsum(1 + ry)
plot(y, type = 'l', main = "Random walk")
abline(v = 50, lty = 2)
amean <- round(mean(y[1:50]),0)
bmean <- round(mean(y[51:100]),0)
text(20, 90, labels = "A", cex = 1.2)
text(80, 90, labels = "B", cex = 1.2)
text(25, 40, labels = paste("Mean part A equals ", amean))
text(75, 40, labels = paste("Mean part B equals ", bmean))

Non-stationarity of this kind makes it likely that you will run into what is called spurious regression.

Spurious regression

The major problem with non-stationary data is that it can cause spurious regression. Recall that the equation for correlation is:

\[corr_{y,x} = \frac{\sum_{t = 1}^{100} (y_t - \bar{y})(x_t - \bar{x})}{\sqrt{\sum_{t = 1}^{100} (y_t - \bar{y})^2 \sum_{t = 1}^{100} (x_t - \bar{x})^2}}\]

Imagine two series with a unit root and upward drift like the one in the diagram above: they are sure to have a positive correlation (no matter what they represent). Make sure that you are clear why this is the case.
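
One way to see the problem is a small simulation. The sketch below is not part of the original analysis: it draws pairs of independent random walks without drift (the seed, the sample size of 100 and the number of replications are arbitrary choices), regresses one on the other, and records how often the slope looks significant at the 5 percent level. Since the two series are unrelated by construction, a well-behaved test would reject about 5 percent of the time; with unit-root data the rejection rate is typically far higher.

```r
# Spurious regression: independent random walks often look "related"
set.seed(42)          # arbitrary seed, for reproducibility only
n_sims <- 200         # number of simulated pairs (assumption)
n_obs  <- 100         # length of each series (assumption)

reject <- logical(n_sims)
for (s in seq_len(n_sims)) {
  x <- cumsum(rnorm(n_obs))   # random walk 1
  y <- cumsum(rnorm(n_obs))   # random walk 2, independent of x
  slope_p <- summary(lm(y ~ x))$coefficients["x", "Pr(>|t|)"]
  reject[s] <- slope_p < 0.05
}

# Share of regressions with a "significant" slope
rejection_rate <- mean(reject)
print(rejection_rate)
```

Running the same loop on the differenced series (replace `cumsum(rnorm(n_obs))` with `rnorm(n_obs)`) should bring the rejection rate back towards 5 percent.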

Returns

We usually deal with this by calculating returns (or sometimes taking the first difference) so that we have stationary data. We can see this from a plot of ry, which (apart from the constant drift of 1) is the first difference of y. The ry series is just random numbers with a mean of zero and a standard deviation of 1.

plot(ry, type = 'l', main = "Plot of differenced unit-root series")
abline(h = 0, col = 'red', lty = 2)
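
As a quick numerical check, the sketch below rebuilds the simulated series from the first example so that it can be run on its own (the names `dy`, `level_means` and `diff_means` are illustrative). It confirms that the first difference of y recovers 1 + ry, and compares the half-sample means of the level with those of the differenced series.

```r
# Rebuild the simulated series from the introduction
set.seed(124)
ry <- rnorm(100)
y  <- cumsum(1 + ry)

# First difference: y[t] - y[t-1] = 1 + ry[t] for t = 2, ..., 100
dy <- diff(y)
print(all.equal(dy, 1 + ry[-1]))

# Half-sample means of the level (unstable) and of the difference (stable)
level_means <- c(A = mean(y[1:50]),  B = mean(y[51:100]))
diff_means  <- c(A = mean(dy[1:49]), B = mean(dy[50:99]))
print(level_means)
print(diff_means)
```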

Cointegration

Dealing only with returns, first differences or stationary data is usually fine, but there are times when there may be interesting information in the levels of the series. Engle and Granger (1987) analysed US income and consumption data. It is clear from the figure below that neither income nor consumption is stationary. However, they must be related to each other as the level of consumption depends on the level of income. If we just difference the data, we throw this information away. As the level of consumption must be related to the level of income we say that they are cointegrated. Cointegration means that two or more non-stationary series can be combined (in a linear combination) to give a series that is stationary.

da <- read.csv('../../Data/PICon.csv')
da$DATE <- as.Date(da$DATE, format = "%Y-%m-%d")
das <- subset(da, da$DATE >= as.Date("2000-01-01"))
plot(das$DATE, das$PCE, type = 'l', main = "US personal income and consumption",
     xlab = 'Date', ylab = 'USD', lty = 2, ylim = c(min(das$PCE, das$PI),
                                                    max(das$PCE,das$PI))) 
lines(das$DATE, das$PI, type = 'l', col = 'red')
legend('topleft', inset = 0.01, legend = c('Consumption', 'Income'),
       col = c('black', 'red'), lty = c(2,1))

In most cases where we look at cointegrated series we will assess the long-run relationship and the deviation from that relationship. For example, if spending is consistently above its long-term relationship to income, there is a risk that future spending will be lower in order to return to the long-run equilibrium. The deviation from the long-run or equilibrium relationship and its subsequent return to equilibrium is called the error-correction. There are two standard ways to look at cointegration and error-correction problems: the Engle-Granger method and the Johansen test.

The Engle-Granger method

If there is cointegration between two series, the residuals from the regression of the series should be stationary. Applying the Engle-Granger method involves three steps:

  1. Run a regression of the two variables

  2. Test whether the residuals from the regression have a unit root

  3. If there is no unit root in the residuals, the regression provides the long-run relationship and the residuals are the deviation from that relationship. Therefore, we can estimate the speed at which the error or deviation from equilibrium is corrected

The first regression gives the long-run equilibrium

\[y_t = \delta_0 + \delta_1 x_t + u_t\]

so that

\[\hat{u}_t = y_t - \hat \delta_0 -\hat \delta_1 x_t\]

is the error-correction term.

This will give the following error correction model:

\[\Delta y_t = \phi_0 + \sum_{j = 1}\phi_j \Delta y_{t - j} + \sum_{h = 0} \theta_h \Delta x_{t-h} + \alpha \hat{u}_{t-1} + \varepsilon_t\]

where \(\Delta y_t\) is the change in the dependent variable (personal spending in this case), \(\phi_0\) is the intercept or constant, while the other \(\phi\)s are coefficients on lagged changes in the dependent variable, the \(\theta\)s are coefficients on the current and lagged changes in the independent variable (personal income here), \(\hat{u}_{t-1}\) is the lagged deviation from the long-term equilibrium, and \(\alpha\) is the coefficient on that deviation (which must be negative if deviations are to be corrected).
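
The three steps and the error correction model can be sketched on simulated data. This is a minimal illustration using base R only, not the analysis from these notes: the series, the coefficients 2 and 0.5, the seed and all variable names are assumptions, and the Dickey-Fuller regression is run by hand without extra lags. The resulting statistic should be judged against Engle-Granger critical values rather than a standard t table.

```r
# Engle-Granger in three steps on simulated data (base R only)
set.seed(1)                       # arbitrary seed
n <- 500
x <- cumsum(rnorm(n))             # I(1) regressor: a random walk
y <- 2 + 0.5 * x + rnorm(n)       # cointegrated with x by construction

# Step 1: long-run regression; residuals are the deviations from equilibrium
long_run <- lm(y ~ x)
u_hat <- unname(residuals(long_run))

# Step 2: Dickey-Fuller regression on the residuals (no constant, no extra lags)
du    <- diff(u_hat)              # change in the residual
u_lag <- u_hat[-n]                # lagged residual
df_stat <- summary(lm(du ~ 0 + u_lag))$coefficients["u_lag", "t value"]

# Step 3: error correction model with the lagged residual as the correction term
dy <- diff(y)
dx <- diff(x)
ecm <- lm(dy ~ dx + u_lag)
alpha_hat <- unname(coef(ecm)["u_lag"])

print(c(df_stat = df_stat, alpha_hat = alpha_hat))
```

Because the simulated deviation is white noise, the adjustment coefficient should come out close to -1 (full correction within one period); real data will usually show much slower adjustment.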

Engle-Granger is a rather crude and simplistic method with some limitations. The main limitations are that it is not clear which variable should be the dependent variable, particularly if you have more than two series, and that running two estimations increases the risk of modelling noise.

URCA and VARS

There are two R packages that can be used for testing unit roots and cointegration.

  • urca has a number of functions for testing for unit roots. These include the Augmented Dickey-Fuller test. The urca package is on CRAN

  • vars has functions for cointegration and other time series techniques. The vars package is on CRAN

You can find out more about using these packages on the Bernhard Pfaff site. There is also a vignette that walks through the vars package.

summary(eq1 <- lm(log(da$PCE) ~ log(da$PI)))
## 
## Call:
## lm(formula = log(da$PCE) ~ log(da$PI))
## 
## Residuals:
##      Min       1Q   Median       3Q      Max 
## -0.30186 -0.01387  0.00054  0.01679  0.05979 
## 
## Coefficients:
##               Estimate Std. Error t value Pr(>|t|)    
## (Intercept) -0.3576418  0.0069563  -51.41   <2e-16 ***
## log(da$PI)   1.0131451  0.0008306 1219.80   <2e-16 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 0.02866 on 780 degrees of freedom
## Multiple R-squared:  0.9995, Adjusted R-squared:  0.9995 
## F-statistic: 1.488e+06 on 1 and 780 DF,  p-value: < 2.2e-16
plot(da$DATE[da$DATE <= as.Date('2020-01-01')], eq1$residuals[da$DATE <= as.Date('2020-01-01')], type = 'l', xlab = 'Date', ylab = 'Residuals')

Cointegration

We can use the urca package to run unit root tests. The documentation discusses a number of different options. The Augmented Dickey-Fuller (ADF) test is the most popular. This is based on testing the null that \(\rho\) is equal to one in the following equation:

\[y_t = \rho y_{t-1} + \varepsilon_t\]

The usual test is then

\[\Delta y_t = (\rho - 1)y_{t-1} + \varepsilon_t\]

or

\[\Delta y_t = \delta y_{t-1} + \varepsilon_t\]

where \(\delta\) is equal to \(\rho - 1\) so that if \(\delta = 0, \rho = 1\).

In practice, it is usual to add lags of the dependent variable \((\Delta y)\) to remove serial correlation and to test three forms:

  1. Test for a unit root (as above)

\[\Delta y_t = \delta y_{t-1} + \varepsilon_t\]

  2. Test for a unit root with constant (mean does not equal zero)

\[\Delta y_t = \alpha_0 + \delta y_{t-1} + \varepsilon_t\]

  3. Test for a unit root with constant and time trend

\[\Delta y_t = \alpha_0 + \alpha_1 t + \delta y_{t-1} + \varepsilon_t\]

The critical values are not standard and change for each version of the test. From looking at the residuals, it appears that we are dealing with no drift and no time trend, so the first form is used.
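
To make the three forms concrete, the sketch below estimates each test regression by hand with `lm` on a simulated random walk, adding one lagged difference to each. The series, the seed and the variable names are assumptions chosen for illustration; the names `z_diff`, `z_lag_1` and `z_diff_lag` simply mirror the labels that appear in the ur.df output in these notes. Each t-value on `z_lag_1` is the test statistic and has to be compared with the critical values for its own form, not with a standard t table.

```r
# The three ADF test regressions by hand (one lagged difference each)
set.seed(7)                        # arbitrary seed
z  <- cumsum(rnorm(200))           # simulated random walk (assumption)
dz <- diff(z)
n  <- length(dz)

z_diff     <- dz[-1]               # change in z at time t
z_lag_1    <- z[2:n]               # level of z at time t - 1
z_diff_lag <- dz[-n]               # change in z at time t - 1
trend      <- seq_along(z_diff)    # linear time trend

fits <- list(
  none  = lm(z_diff ~ z_lag_1 - 1 + z_diff_lag),      # form 1: no constant
  drift = lm(z_diff ~ z_lag_1 + z_diff_lag),          # form 2: constant
  trend = lm(z_diff ~ z_lag_1 + trend + z_diff_lag)   # form 3: constant and trend
)

# The t-value on the lagged level is the ADF statistic for each form
tau <- sapply(fits, function(m) summary(m)$coefficients["z_lag_1", "t value"])
print(round(tau, 2))
```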

Test the random walk

library(urca)
summary(ur.df(eq1$residuals))
## 
## ############################################### 
## # Augmented Dickey-Fuller Test Unit Root Test # 
## ############################################### 
## 
## Test regression none 
## 
## 
## Call:
## lm(formula = z.diff ~ z.lag.1 - 1 + z.diff.lag)
## 
## Residuals:
##       Min        1Q    Median        3Q       Max 
## -0.252646 -0.003360  0.000370  0.003931  0.094149 
## 
## Coefficients:
##            Estimate Std. Error t value Pr(>|t|)    
## z.lag.1    -0.09859    0.01770  -5.570 3.51e-08 ***
## z.diff.lag -0.25396    0.03464  -7.332 5.71e-13 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 0.01369 on 778 degrees of freedom
## Multiple R-squared:  0.1266, Adjusted R-squared:  0.1244 
## F-statistic:  56.4 on 2 and 778 DF,  p-value: < 2.2e-16
## 
## 
## Value of test-statistic is: -5.57 
## 
## Critical values for test statistics: 
##       1pct  5pct 10pct
## tau1 -2.58 -1.95 -1.62

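We can reject the null of a unit root in the residuals: the test statistic of -5.57 is well below the 1 percent critical value of -2.58. Strictly, because the residuals are estimated rather than observed, the Engle-Granger critical values are more negative than the tau1 values printed here (roughly -3.3 at the 5 percent level for two variables), but the statistic is beyond those as well. This suggests that there is cointegration between US Personal Income and Consumption.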

Cointegration

The error correction model can now be specified. The lagged residual from the long-run regression (LLRE below) is the error-correction term.

# Create the percentage change in income and consumption
n <- nrow(da)
da$DPI <- c(NA, da$PI[-1] / da$PI[-n] - 1)
da$DPCE <- c(NA, da$PCE[-1] / da$PCE[-n] - 1)
# Create the lagged residual from the long run equation (the error-correction term)
da$LLRE <- c(NA, eq1$residuals[-n])
# Create the lagged changes in income and consumption
da$LDPI <- c(NA, da$DPI[-n])
da$LDPCE <- c(NA, da$DPCE[-n])
# Check the constructed variables, then run the regression
head(da)
##         DATE    PI   PCE         DPI         DPCE       LLRE        LDPI
## 1 1959-01-01 391.8 306.1          NA           NA         NA          NA
## 2 1959-02-01 393.7 309.6 0.004849413  0.011434172 0.03231622          NA
## 3 1959-03-01 396.5 312.7 0.007112014  0.010012920 0.03878423 0.004849413
## 4 1959-04-01 399.9 312.2 0.008575032 -0.001598977 0.04156735 0.007112014
## 5 1959-05-01 402.4 316.1 0.006251563  0.012491992 0.03131638 0.008575032
## 6 1959-06-01 404.8 318.2 0.005964215  0.006643467 0.03741697 0.006251563
##          LDPCE
## 1           NA
## 2           NA
## 3  0.011434172
## 4  0.010012920
## 5 -0.001598977
## 6  0.012491992
head(eq1$residuals)
##          1          2          3          4          5          6 
## 0.03231622 0.03878423 0.04156735 0.03131638 0.03741697 0.03801380
summary(eq2 <- lm(DPCE ~ DPI + LDPCE + LDPI + LLRE, data = da))
## 
## Call:
## lm(formula = DPCE ~ DPI + LDPCE + LDPI + LLRE, data = da)
## 
## Residuals:
##       Min        1Q    Median        3Q       Max 
## -0.123371 -0.002630  0.000008  0.003118  0.058894 
## 
## Coefficients:
##               Estimate Std. Error t value Pr(>|t|)    
## (Intercept)  0.0046064  0.0004308  10.692  < 2e-16 ***
## DPI          0.0389133  0.0267569   1.454   0.1463    
## LDPCE        0.0648587  0.0343478   1.888   0.0594 .  
## LDPI         0.0325459  0.0280296   1.161   0.2459    
## LLRE        -0.0839947  0.0104646  -8.027 3.71e-15 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 0.007892 on 775 degrees of freedom
##   (2 observations deleted due to missingness)
## Multiple R-squared:  0.097,  Adjusted R-squared:  0.09234 
## F-statistic: 20.81 on 4 and 775 DF,  p-value: 2.601e-16
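
To get a feel for the size of the LLRE coefficient, the sketch below converts it into a half-life. This is a back-of-the-envelope calculation rather than part of the estimation: it ignores the short-run terms and assumes that a deviation decays geometrically at rate (1 + α) per month (the data are monthly). The variable names are illustrative.

```r
# Half-life of a deviation from the long-run relationship
alpha      <- -0.0839947                   # LLRE coefficient from the eq2 output
share_left <- 1 + alpha                    # share of a deviation left after one month
half_life  <- log(0.5) / log(share_left)   # months until half the deviation is gone
print(round(half_life, 1))
```

On these assumptions about 8 percent of any gap between consumption and its long-run level is closed each month, so half of it is gone after roughly eight months.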

Cointegration and pairs

The error correction model that was pioneered by Engle and Granger can be used to identify pairs of securities that move together.

da1 <- read.csv('../../EC387/Pairs/Data/Yield.csv', skip = 11, stringsAsFactors = FALSE) 
da1[,1] <- as.Date(da1[, 1], format = "%Y-%m-%d")
da1[, 3] <- as.numeric(da1[, 3])
da1[, 2] <- as.numeric(da1[, 2])
colnames(da1) <- c('Date', 'BY10', 'BY2')
plot(da1$Date, da1$BY10, type = 'l',  main = "US bond yields", 
     xlab = "Date", ylab = 'Yield', ylim = c(0.0, 4.0))
lines(da1$Date, da1$BY2, col = 'red', lty = 2)
legend('topright', inset = 0.01, legend = c("10-year", "2-year"), 
       col = c('black', 'red'), lty = c(1,2))

The difference between the two can be seen here:

plot(da1$Date, da1$BY10 - da1$BY2, type = 'l', main = 'Spread between 10 and 2 year yield', xlab = 'Date', ylab = 'Yield')

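The first step is the long-run regression of the 10-year yield on the 2-year yield and a time trend:

summary(eq3 <- lm(da1$BY10 ~ da1$BY2 + seq(1:length(da1$BY10))))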
## 
## Call:
## lm(formula = da1$BY10 ~ da1$BY2 + seq(1:length(da1$BY10)))
## 
## Residuals:
##      Min       1Q   Median       3Q      Max 
## -0.81274 -0.22249 -0.05199  0.30128  0.79111 
## 
## Coefficients:
##                           Estimate Std. Error t value Pr(>|t|)    
## (Intercept)              1.7319740  0.0349414   49.57   <2e-16 ***
## da1$BY2                  2.1947874  0.0544909   40.28   <2e-16 ***
## seq(1:length(da1$BY10)) -0.0003093  0.0000279  -11.09   <2e-16 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 0.356 on 1248 degrees of freedom
##   (53 observations deleted due to missingness)
## Multiple R-squared:  0.6363, Adjusted R-squared:  0.6357 
## F-statistic:  1092 on 2 and 1248 DF,  p-value: < 2.2e-16

The next step is to test whether these residuals are stationary.

# Reuse the residuals of the long-run regression eq3 estimated above
summary(ur.df(eq3$residuals, type = 'none'))
## 
## ############################################### 
## # Augmented Dickey-Fuller Test Unit Root Test # 
## ############################################### 
## 
## Test regression none 
## 
## 
## Call:
## lm(formula = z.diff ~ z.lag.1 - 1 + z.diff.lag)
## 
## Residuals:
##       Min        1Q    Median        3Q       Max 
## -0.167615 -0.028723 -0.001475  0.027048  0.211414 
## 
## Coefficients:
##             Estimate Std. Error t value Pr(>|t|)  
## z.lag.1    -0.006232   0.003681  -1.693   0.0907 .
## z.diff.lag -0.070625   0.028310  -2.495   0.0127 *
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 0.04608 on 1247 degrees of freedom
## Multiple R-squared:  0.007808,   Adjusted R-squared:  0.006217 
## F-statistic: 4.907 on 2 and 1247 DF,  p-value: 0.00754
## 
## 
## Value of test-statistic is: -1.6929 
## 
## Critical values for test statistics: 
##       1pct  5pct 10pct
## tau1 -2.58 -1.95 -1.62

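At the 5 percent level it is not possible to reject the null of a unit root: the test statistic of -1.69 is above the critical value of -1.95. This undermines the belief that there is a long-run relationship between 10 and 2 year yields, at least in this sample.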

Johansen test

The weaknesses of the Engle-Granger method are that it has no systematic way of selecting the dependent variable, that it relies on two-step estimation (which means that any error in the first step will carry over to the second), and that it becomes increasingly difficult in cases where there are more than two variables (which is fine for pairs trading but less useful for many economic theories).

The Johansen technique is a more general process that can accommodate a wider variety of models (Johansen and Juselius 1990). It is based on the estimation of a Vector Autoregression (VAR) type model.

\[\boldsymbol{y_t} = \boldsymbol{A} \boldsymbol{y_{t-1}} + \varepsilon_t\]

where \(\boldsymbol{y_t}\) is an (n x 1) vector of variables and \(\boldsymbol{A}\) is an (n x n) matrix of parameters to be estimated. Just like the Dickey-Fuller case, we can re-arrange by taking the vector \(\boldsymbol{y_{t-1}}\) from each side to give:

\[\Delta \boldsymbol{y_t} = (\boldsymbol{A} - \boldsymbol{I}) \boldsymbol{y_{t-1}} + \varepsilon_t\]

where \(\boldsymbol{I}\) is the identity matrix. Now we test the rank of the matrix \(\boldsymbol{A} - \boldsymbol{I}\).
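
The rank condition can be written out as follows. This is a standard summary of the Johansen framework rather than anything specific to these notes, and \(\boldsymbol{\Pi}\), \(\boldsymbol{\alpha}\), \(\boldsymbol{\beta}\) and \(r\) are notation introduced here:

\[\boldsymbol{\Pi} = \boldsymbol{A} - \boldsymbol{I}, \qquad r = \operatorname{rank}(\boldsymbol{\Pi})\]

\[\begin{cases}
r = 0 & \text{no cointegration: the model is a VAR in first differences} \\
0 < r < n & r \text{ cointegrating vectors, with } \boldsymbol{\Pi} = \boldsymbol{\alpha}\boldsymbol{\beta}' \\
r = n & \text{all variables in } \boldsymbol{y_t} \text{ are stationary}
\end{cases}\]

In the middle case the columns of \(\boldsymbol{\beta}\) are the cointegrating vectors and \(\boldsymbol{\alpha}\) holds the loadings (the speeds of adjustment). These correspond to the eigenvector and weight matrices that ca.jo reports.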

Exchange rate example

Here is a constructed exchange rate example.

First we create two random walks for home prices and overseas prices and make the exchange rate a function of the difference between the two, as if relative PPP holds and these are all logs.

# Two independent random walks for (log) home and overseas prices
p <- rep(0, 1000)
for(i in 2:length(p)) p[i] <- p[i-1] + rnorm(1)
ps <- rep(0, 1000)
for(i in 2:length(ps)) ps[i] <- ps[i-1] + rnorm(1)
# Exchange rate: price differential plus stationary noise, one draw per
# observation (rnorm(100) would silently recycle the same 100 shocks)
e <- p - ps + rnorm(length(p))
plot(p, type = 'l', main = 'Plot of p, ps and e', 
     ylim = c(min(p, ps, e), max(p, ps, e)))
lines(ps, lty = 2, col = 'red')
lines(e, lty = 3, col = 'blue')
legend('topleft', inset = 0.02, legend = 
         c('p (home price)', 'ps (overseas price)', 'e (nominal exchange rate)'), 
       lty = c(1, 2, 3), col = c('black', 'red', 'blue'), cex = 0.7)

Now check that they are unit root variables (this does not really need to be done as the Johansen test will identify that).

summary(ur.df(p))
## 
## ############################################### 
## # Augmented Dickey-Fuller Test Unit Root Test # 
## ############################################### 
## 
## Test regression none 
## 
## 
## Call:
## lm(formula = z.diff ~ z.lag.1 - 1 + z.diff.lag)
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -3.1724 -0.6797 -0.0476  0.6287  2.8711 
## 
## Coefficients:
##             Estimate Std. Error t value Pr(>|t|)
## z.lag.1    0.0010067  0.0008476   1.188    0.235
## z.diff.lag 0.0121596  0.0317391   0.383    0.702
## 
## Residual standard error: 1.01 on 996 degrees of freedom
## Multiple R-squared:  0.001625,   Adjusted R-squared:  -0.0003793 
## F-statistic: 0.8108 on 2 and 996 DF,  p-value: 0.4448
## 
## 
## Value of test-statistic is: 1.1877 
## 
## Critical values for test statistics: 
##       1pct  5pct 10pct
## tau1 -2.58 -1.95 -1.62
summary(ur.df(ps))
## 
## ############################################### 
## # Augmented Dickey-Fuller Test Unit Root Test # 
## ############################################### 
## 
## Test regression none 
## 
## 
## Call:
## lm(formula = z.diff ~ z.lag.1 - 1 + z.diff.lag)
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -3.5777 -0.7020 -0.0238  0.6288  3.3813 
## 
## Coefficients:
##             Estimate Std. Error t value Pr(>|t|)
## z.lag.1    -0.005158   0.003350  -1.540    0.124
## z.diff.lag  0.025720   0.031677   0.812    0.417
## 
## Residual standard error: 1.018 on 996 degrees of freedom
## Multiple R-squared:  0.002898,   Adjusted R-squared:  0.0008961 
## F-statistic: 1.448 on 2 and 996 DF,  p-value: 0.2356
## 
## 
## Value of test-statistic is: -1.5398 
## 
## Critical values for test statistics: 
##       1pct  5pct 10pct
## tau1 -2.58 -1.95 -1.62
summary(ur.df(e))
## 
## ############################################### 
## # Augmented Dickey-Fuller Test Unit Root Test # 
## ############################################### 
## 
## Test regression none 
## 
## 
## Call:
## lm(formula = z.diff ~ z.lag.1 - 1 + z.diff.lag)
## 
## Residuals:
##    Min     1Q Median     3Q    Max 
## -7.541 -1.296 -0.025  1.222  6.261 
## 
## Coefficients:
##              Estimate Std. Error t value Pr(>|t|)    
## z.lag.1     0.0001685  0.0018205   0.093    0.926    
## z.diff.lag -0.2123733  0.0309972  -6.851 1.28e-11 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 1.915 on 996 degrees of freedom
## Multiple R-squared:  0.04507,    Adjusted R-squared:  0.04315 
## F-statistic:  23.5 on 2 and 996 DF,  p-value: 1.062e-10
## 
## 
## Value of test-statistic is: 0.0926 
## 
## Critical values for test statistics: 
##       1pct  5pct 10pct
## tau1 -2.58 -1.95 -1.62

Now carry out the Johansen test.

summary(ca.jo(data.frame(p, ps, e), type = 'trace', ecdet = 'none'))
## 
## ###################### 
## # Johansen-Procedure # 
## ###################### 
## 
## Test type: trace statistic , with linear trend 
## 
## Eigenvalues (lambda):
## [1] 0.3535125558 0.0060840087 0.0008885665
## 
## Values of teststatistic and critical values of test:
## 
##            test 10pct  5pct  1pct
## r <= 2 |   0.89  6.50  8.18 11.65
## r <= 1 |   6.98 15.66 17.95 23.52
## r = 0  | 442.31 28.71 31.52 37.22
## 
## Eigenvectors, normalised to first column:
## (These are the cointegration relations)
## 
##             p.l2      ps.l2        e.l2
## p.l2   1.0000000  1.0000000  1.00000000
## ps.l2 -0.9995665 14.6196691 -0.02450146
## e.l2  -1.0004348 -0.0175256  0.20423740
## 
## Weights W:
## (This is the loading matrix)
## 
##             p.l2         ps.l2          e.l2
## p.d   0.01104453 -6.258732e-05 -0.0012902109
## ps.d -0.02303730 -7.857385e-04  0.0001526993
## e.d   1.05676347  7.058472e-04 -0.0013932689

These results indicate that there is one cointegrating vector (which is the case by design). The test statistic rejects the null hypothesis that the rank of the matrix is zero (442.31 is far above the 1 percent critical value of 37.22). However, we cannot reject the null that \(r \le 1\) (6.98 is below the 10 percent critical value of 15.66).
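
The trace statistics in the table can be reproduced from the reported eigenvalues. The sketch below assumes the usual trace formula, \(-T \sum_{i > r} \log(1 - \lambda_i)\), and an effective sample of T = 998: the 1000 observations less the two lags that ca.jo uses by default, which matches the 998 observations implied by the degrees of freedom in the regressions further down. The variable names are illustrative.

```r
# Rebuild the Johansen trace statistics from the reported eigenvalues
lambda <- c(0.3535125558, 0.0060840087, 0.0008885665)  # from the ca.jo output
n_eff  <- 1000 - 2                                     # effective sample (assumption)

# Contribution of each eigenvalue, then sum from the smallest upwards
terms      <- -n_eff * log(1 - lambda)
trace_stat <- rev(cumsum(rev(terms)))
names(trace_stat) <- c("r = 0", "r <= 1", "r <= 2")
print(round(trace_stat, 2))
```

The large first eigenvalue accounts for almost all of the statistic for r = 0, which is why one cointegrating vector is found and no more.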

The cointegrating vector is \(p - ps - e = 0\) or \(e = p - ps\). We can plot that relationship:

plot(p - ps - e, type = 'l', main = 'Cointegrating relationship')

This is a stationary series

summary(ur.df(p - ps- e, type = 'none'))
## 
## ############################################### 
## # Augmented Dickey-Fuller Test Unit Root Test # 
## ############################################### 
## 
## Test regression none 
## 
## 
## Call:
## lm(formula = z.diff ~ z.lag.1 - 1 + z.diff.lag)
## 
## Residuals:
##      Min       1Q   Median       3Q      Max 
## -2.91962 -0.69567 -0.01905  0.68925  2.69085 
## 
## Coefficients:
##            Estimate Std. Error t value Pr(>|t|)    
## z.lag.1    -1.01908    0.04394 -23.191   <2e-16 ***
## z.diff.lag  0.05589    0.03163   1.767   0.0775 .  
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 0.9615 on 996 degrees of freedom
## Multiple R-squared:  0.4842, Adjusted R-squared:  0.4832 
## F-statistic: 467.5 on 2 and 996 DF,  p-value: < 2.2e-16
## 
## 
## Value of test-statistic is: -23.1911 
## 
## Critical values for test statistics: 
##       1pct  5pct 10pct
## tau1 -2.58 -1.95 -1.62

This rejects the null of a unit root.

The cajorls function from the urca package will return the OLS regression of a restricted VECM if the ca.jo object and rank are provided.

nomexecc <- ca.jo(data.frame(e, p, ps), type = 'trace', spec = 'longrun')
summary(nomexecc)
## 
## ###################### 
## # Johansen-Procedure # 
## ###################### 
## 
## Test type: trace statistic , with linear trend 
## 
## Eigenvalues (lambda):
## [1] 0.3535125558 0.0060840087 0.0008885665
## 
## Values of teststatistic and critical values of test:
## 
##            test 10pct  5pct  1pct
## r <= 2 |   0.89  6.50  8.18 11.65
## r <= 1 |   6.98 15.66 17.95 23.52
## r = 0  | 442.31 28.71 31.52 37.22
## 
## Eigenvectors, normalised to first column:
## (These are the cointegration relations)
## 
##             e.l2       p.l2      ps.l2
## e.l2   1.0000000    1.00000  1.0000000
## p.l2  -0.9995654  -57.05938  4.8962629
## ps.l2  0.9991320 -834.18921 -0.1199656
## 
## Weights W:
## (This is the loading matrix)
## 
##             e.l2          p.l2         ps.l2
## e.d  -1.05722299 -1.237040e-05 -2.845576e-04
## p.d  -0.01104934  1.096881e-06 -2.635093e-04
## ps.d  0.02304732  1.377054e-05  3.118692e-05
summary(cajools(nomexecc))
## Response e.d :
## 
## Call:
## lm(formula = e.d ~ constant + e.dl1 + p.dl1 + ps.dl1 + e.l2 + 
##     p.l2 + ps.l2 - 1, data = data.mat)
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -6.2317 -1.1573  0.0713  1.1591  5.6134 
## 
## Coefficients:
##          Estimate Std. Error t value Pr(>|t|)    
## constant  0.01470    0.12653   0.116    0.908    
## e.dl1    -0.95153    0.05695 -16.707   <2e-16 ***
## p.dl1     0.90853    0.07778  11.681   <2e-16 ***
## ps.dl1   -0.90325    0.07819 -11.552   <2e-16 ***
## e.l2     -1.05752    0.07916 -13.359   <2e-16 ***
## p.l2      1.05608    0.07916  13.341   <2e-16 ***
## ps.l2    -1.04595    0.07949 -13.158   <2e-16 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 1.729 on 991 degrees of freedom
## Multiple R-squared:  0.2255, Adjusted R-squared:   0.22 
## F-statistic: 41.22 on 7 and 991 DF,  p-value: < 2.2e-16
## 
## 
## Response p.d :
## 
## Call:
## lm(formula = p.d ~ constant + e.dl1 + p.dl1 + ps.dl1 + e.l2 + 
##     p.l2 + ps.l2 - 1, data = data.mat)
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -3.1633 -0.6403 -0.0223  0.6484  2.9387 
## 
## Coefficients:
##           Estimate Std. Error t value Pr(>|t|)
## constant -0.117920   0.073857  -1.597    0.111
## e.dl1     0.029377   0.033246   0.884    0.377
## p.dl1    -0.016954   0.045401  -0.373    0.709
## ps.dl1    0.053616   0.045641   1.175    0.240
## e.l2     -0.011312   0.046207  -0.245    0.807
## p.l2      0.009692   0.046208   0.210    0.834
## ps.l2    -0.011923   0.046401  -0.257    0.797
## 
## Residual standard error: 1.009 on 991 degrees of freedom
## Multiple R-squared:  0.007663,   Adjusted R-squared:  0.0006539 
## F-statistic: 1.093 on 7 and 991 DF,  p-value: 0.3651
## 
## 
## Response ps.d :
## 
## Call:
## lm(formula = ps.d ~ constant + e.dl1 + p.dl1 + ps.dl1 + e.l2 + 
##     p.l2 + ps.l2 - 1, data = data.mat)
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -3.5415 -0.6588  0.0058  0.6695  3.4958 
## 
## Coefficients:
##           Estimate Std. Error t value Pr(>|t|)
## constant -0.105798   0.074534  -1.419    0.156
## e.dl1     0.014591   0.033550   0.435    0.664
## p.dl1    -0.005355   0.045817  -0.117    0.907
## ps.dl1    0.031497   0.046060   0.684    0.494
## e.l2      0.023092   0.046631   0.495    0.621
## p.l2     -0.023670   0.046632  -0.508    0.612
## ps.l2     0.011536   0.046826   0.246    0.805
## 
## Residual standard error: 1.019 on 991 degrees of freedom
## Multiple R-squared:  0.006961,   Adjusted R-squared:  -5.329e-05 
## F-statistic: 0.9924 on 7 and 991 DF,  p-value: 0.4352
cajorls(nomexecc, r = 1)$rlm
## 
## Call:
## lm(formula = substitute(form1), data = data.mat)
## 
## Coefficients:
##           e.d        p.d        ps.d     
## ect1      -1.057223  -0.011049   0.023047
## constant  -0.024751  -0.061085  -0.009132
## e.dl1     -0.951413   0.029457   0.014556
## p.dl1      0.908828  -0.016041  -0.004758
## ps.dl1    -0.908545   0.054191   0.037496
cajorls(nomexecc, r = 1)$beta
##             ect1
## e.l2   1.0000000
## p.l2  -0.9995654
## ps.l2  0.9991320

Exercise 2.0

  • Compare the change in the exchange rate with that forecast by the error correction model.

  • Use the PPP data (on My Studies) to assess Purchasing Power Parity for the Indian Rupee vs US Dollar Exchange rate.

Bibliography

Engle, Robert F, and Clive WJ Granger. 1987. “Co-Integration and Error Correction: Representation, Estimation, and Testing.” Econometrica: Journal of the Econometric Society, 251–76.
Johansen, Soren, and Katarina Juselius. 1990. “Maximum Likelihood Estimation and Inference on Cointegration - with Applications to the Demand for Money.” Oxford Bulletin of Economics and Statistics 52 (2): 169–210.