Causality is the agency of efficacy that connects one process (the cause) with another process or state (the effect), where the first is understood to be partly repsonsible for the second, and the second is dependent on the first. In gegeral, a process has many causes, which are said to be causal factors for it, and all lie in its past. An effect can in turn be a cause of many other effects.
The Granger causality is a statistical hypothesis test for determing whether one time series is useful in forecasting another. A time series X is said to Granger-cause Y if it can be shown that those X values provide statistically significant information about future values of Y through a series of t-tests and F-tests on lagged values of X. * * * ## Principle and theory
There are two underlying principles: 1. The cause happens prior to its effect. 2. THe cause has unique information about the future values of its effect.
\[P[Y(t+1) \in A | I(t)] \neq P[Y(t+1) \in I_{-X}(t)]\] where P referes to probability, A is an arbitrary non-empty set and I(t) and \(I_{-X}(t)\) respectively denote the information available as of time t in the entire universe, and that in the modified universe in which X is executed. If the above hypothesis is accepted, we say that X Granger-causes Y.
Let y and x be stationary time series. To test null hypothesis that x does not Granger-cause y, one first finds the proper lagged values of y to include in a univariate autoregression of y: \[y_t = a_0 + a_1y_{t-1} + a_2y_{t-2} + ... + a_my_{t-m} +error_t\] next, the autoregression is augmented by including lagged values of x: \[y_t = a_0 + a_1y_{t-1} + a_2y_{t-2} + ... + a_my_{t-m} +error_t + b_px_{t-p}+...+b_qx_{t-q}+error_t.\] The null hypothesis that x does not Granger-cause y is not rejected f and only if no lagged values of x are retained in the regression.
Multivariate Granger causality analysis is usually performed by fitting a vector autoregressive model (VAR) to the time series. \[X(t) = \sum_{\tau=1}^{L}{A_\tau X(t-\tau) + \epsilon(t)}\] where \(\epsilon(t)\) is a white Gaussian random vector and \(A_\tau\) is a mtrix for every \(\tau\). If at least one of \(A_\tau(j,i)\) for \(\tau = 1 ... L\) is significantly larger than zero (in absolute value), \(X_i\) is called Granger causes of another time series \(X_j\).
The test is usually done in Wald or F test.
\[F = \frac{\frac{RSS_1-RSS_2}{P_2-P_1}}{\frac{RSS_2}{n-p_2}}\] The null hypothesis (model 2 is not better than model 1) is rejected if the F calculated from the data is greater than the critical value of the F-distribution for some desired false-rejection probability (e.g. 0.05). The F-test is a Wald test.
In the following example, a granger-causality test using the Toda-Yamamoto method are implemented. ref: https://www.christophpfeiffer.org/2012/11/07/toda-yamamoto-implementation-in-r/
The recommended steps: * 1. find the integration order I(k) using ndiffs * 2. select optimal lag order p(1…m) using VARselect which may return more than one,max =4 values * 3. Build the VAR model, do serial.test on the VAR.p models and select the one with the most serial correlation for VAR models * 4. Build an augumented VAR (p+k) * 5. Run causality test from VARS model * 6. Save the two-way pairwise results to an dataframe
library(fUnitRoots)
## Loading required package: urca
## Loading required package: timeDate
## Loading required package: timeSeries
## Loading required package: fBasics
##
## Rmetrics Package fBasics
## Analysing Markets and calculating Basic Statistics
## Copyright (C) 2005-2014 Rmetrics Association Zurich
## Educational Software for Financial Engineering and Computational Science
## Rmetrics is free software and comes with ABSOLUTELY NO WARRANTY.
## https://www.rmetrics.org --- Mail to: info@rmetrics.org
##
## Attaching package: 'fUnitRoots'
## The following objects are masked from 'package:urca':
##
## punitroot, qunitroot, unitrootTable
library(urca)
library(vars)
## Loading required package: MASS
## Loading required package: strucchange
## Loading required package: zoo
##
## Attaching package: 'zoo'
## The following object is masked from 'package:timeSeries':
##
## time<-
## The following objects are masked from 'package:base':
##
## as.Date, as.Date.numeric
## Loading required package: sandwich
## Loading required package: lmtest
library(aod)
library(zoo)
library(tseries)
cof <- read.csv("coffee_data.csv", header=T,sep=";")
names(cof)
## [1] "Date" "Arabica" "Robusta"
#Adjust Date format
cof["Date"]<-paste(sub("M","-",cof$Date),"-01",sep="")
#Visualize
plot(as.Date(cof$Date),cof$Arabica,type="l",col="black",lwd=2)
lines(as.Date(cof$Date),cof$Robusta,col="blue",lty=2,lwd=1)
legend("topleft",c("Arabica","Robusta"),col=c("black","blue"),lty=c(1,2),lwd=c(2,1),bty="n")
cof1<-cof[193:615,]
#Visualize
plot(as.Date(cof1$Date),cof1$Arabica,type="l",col="black",lwd=2,ylim=range(cof1$Robusta))
lines(as.Date(cof1$Date),cof1$Robusta,col="blue",lty=2,lwd=1)
legend("topright",c("Arabica","Robusta"),col=c("black","blue"),lty=c(1,2),lwd=c(2,1),bty="n")
#Test for unit roots
summary(ur.df(cof$Arabica))
##
## ###############################################
## # Augmented Dickey-Fuller Test Unit Root Test #
## ###############################################
##
## Test regression none
##
##
## Call:
## lm(formula = z.diff ~ z.lag.1 - 1 + z.diff.lag)
##
## Residuals:
## Min 1Q Median 3Q Max
## -136.541 -5.926 0.128 7.416 153.326
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## z.lag.1 -0.001184 0.003383 -0.350 0.726
## z.diff.lag 0.282005 0.039025 7.226 1.49e-12 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 21.43 on 611 degrees of freedom
## Multiple R-squared: 0.07893, Adjusted R-squared: 0.07591
## F-statistic: 26.18 on 2 and 611 DF, p-value: 1.236e-11
##
##
## Value of test-statistic is: -0.35
##
## Critical values for test statistics:
## 1pct 5pct 10pct
## tau1 -2.58 -1.95 -1.62
#Test for unit roots
summary(ur.df(cof$Robusta))
##
## ###############################################
## # Augmented Dickey-Fuller Test Unit Root Test #
## ###############################################
##
## Test regression none
##
##
## Call:
## lm(formula = z.diff ~ z.lag.1 - 1 + z.diff.lag)
##
## Residuals:
## Min 1Q Median 3Q Max
## -96.278 -3.314 0.416 5.359 109.203
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## z.lag.1 -0.004149 0.003236 -1.282 0.2
## z.diff.lag 0.372268 0.037648 9.888 <2e-16 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 15.59 on 611 degrees of freedom
## Multiple R-squared: 0.1386, Adjusted R-squared: 0.1358
## F-statistic: 49.15 on 2 and 611 DF, p-value: < 2.2e-16
##
##
## Value of test-statistic is: -1.2821
##
## Critical values for test statistics:
## 1pct 5pct 10pct
## tau1 -2.58 -1.95 -1.62
#Test for unit root using kpss
summary(ur.kpss(cof$Arabica))
##
## #######################
## # KPSS Unit Root Test #
## #######################
##
## Test is of type: mu with 6 lags.
##
## Value of test-statistic is: 2.2933
##
## Critical value for a significance level of:
## 10pct 5pct 2.5pct 1pct
## critical values 0.347 0.463 0.574 0.739
summary(ur.kpss(cof$Arabica))
##
## #######################
## # KPSS Unit Root Test #
## #######################
##
## Test is of type: mu with 6 lags.
##
## Value of test-statistic is: 2.2933
##
## Critical value for a significance level of:
## 10pct 5pct 2.5pct 1pct
## critical values 0.347 0.463 0.574 0.739
ur.df(diff(cof$Arabica,1))
##
## ###############################################################
## # Augmented Dickey-Fuller Test Unit Root / Cointegration Test #
## ###############################################################
##
## The value of the test statistic is: -13.6431
ur.df(diff(cof$Robusta,1))
##
## ###############################################################
## # Augmented Dickey-Fuller Test Unit Root / Cointegration Test #
## ###############################################################
##
## The value of the test statistic is: -14.6622
ur.kpss(diff(cof$Arabica,1))
##
## #######################################
## # KPSS Unit Root / Cointegration Test #
## #######################################
##
## The value of the test statistic is: 0.0897
ur.kpss(diff(cof$Robusta,1))
##
## #######################################
## # KPSS Unit Root / Cointegration Test #
## #######################################
##
## The value of the test statistic is: 0.042
VARselect(cof1[,2:3],lag=20,type="both")
## $selection
## AIC(n) HQ(n) SC(n) FPE(n)
## 6 2 2 6
##
## $criteria
## 1 2 3 4 5
## AIC(n) 11.07263 10.94173 10.94208 10.95668 10.95945
## HQ(n) 11.10405 10.98887 11.00493 11.03525 11.05373
## SC(n) 11.15201 11.06080 11.10085 11.15513 11.19760
## FPE(n) 64384.40808 56485.15349 56505.29955 57336.74180 57496.76634
## 6 7 8 9 10
## AIC(n) 10.93581 10.95072 10.95527 10.95470 10.94995
## HQ(n) 11.04580 11.07643 11.09669 11.11184 11.12280
## SC(n) 11.21365 11.26825 11.31249 11.35162 11.38656
## FPE(n) 56154.60934 56999.54484 57261.61258 57231.72097 56963.70710
## 11 12 13 14 15
## AIC(n) 10.95840 10.97315 10.99035 11.00656 11.00542
## HQ(n) 11.14696 11.17743 11.21035 11.24227 11.25684
## SC(n) 11.43470 11.48914 11.54604 11.60194 11.64049
## FPE(n) 57450.43593 58308.89763 59325.87222 60301.75016 60240.00400
## 16 17 18 19 20
## AIC(n) 11.01678 11.02608 11.02947 11.04578 11.05028
## HQ(n) 11.28391 11.30893 11.32804 11.36006 11.38027
## SC(n) 11.69154 11.74053 11.78362 11.83961 11.88380
## FPE(n) 60936.28891 61514.98344 61734.49667 62761.22078 63057.02316
##VAR Model, lag=2
V.2<-VAR(cof1[,2:3],p=2,type="both")
st <- serial.test(V.2)
str(st)
## List of 2
## $ resid : num [1:421, 1:2] -6.63 47 9.04 20.53 -36.04 ...
## ..- attr(*, "dimnames")=List of 2
## .. ..$ : chr [1:421] "1" "2" "3" "4" ...
## .. ..$ : chr [1:2] "Arabica" "Robusta"
## $ serial:List of 5
## ..$ statistic: Named num 106
## .. ..- attr(*, "names")= chr "Chi-squared"
## ..$ parameter: Named num 56
## .. ..- attr(*, "names")= chr "df"
## ..$ p.value : Named num 5.64e-05
## .. ..- attr(*, "names")= chr "Chi-squared"
## ..$ method : chr "Portmanteau Test (asymptotic)"
## ..$ data.name: chr "Residuals of VAR object V.2"
## ..- attr(*, "class")= chr "htest"
## - attr(*, "class")= chr "varcheck"
##VAR-Model, lag=6
V.6<-VAR(cof1[,2:3],p=6,type="both")
serial.test(V.6)
##
## Portmanteau Test (asymptotic)
##
## data: Residuals of VAR object V.6
## Chi-squared = 37.242, df = 40, p-value = 0.5951
#Stability analysis
1/roots(V.6)[[1]] # ">1"
## [1] 1.033862
1/roots(V.6)[[2]] # ">1"
## [1] 1.15703
#Alternative stability analyis
plot(stability(V.6)) ## looks fine
V.7<-VAR(cof1[,2:3],p=7,type="both")
V.7$varresult
## $Arabica
##
## Call:
## lm(formula = y ~ -1 + ., data = datamat)
##
## Coefficients:
## Arabica.l1 Robusta.l1 Arabica.l2 Robusta.l2 Arabica.l3 Robusta.l3
## 1.150779 0.132064 -0.036325 -0.204472 -0.232307 0.273024
## Arabica.l4 Robusta.l4 Arabica.l5 Robusta.l5 Arabica.l6 Robusta.l6
## 0.090439 -0.279599 -0.052212 0.008847 0.185130 -0.119799
## Arabica.l7 Robusta.l7 const trend
## -0.139132 0.196647 8.097807 0.003104
##
##
## $Robusta
##
## Call:
## lm(formula = y ~ -1 + ., data = datamat)
##
## Coefficients:
## Arabica.l1 Robusta.l1 Arabica.l2 Robusta.l2 Arabica.l3 Robusta.l3
## -0.04887 1.42130 0.20041 -0.65652 -0.22869 0.29542
## Arabica.l4 Robusta.l4 Arabica.l5 Robusta.l5 Arabica.l6 Robusta.l6
## 0.07800 -0.12070 -0.09070 0.12401 0.20479 -0.24496
## Arabica.l7 Robusta.l7 const trend
## -0.09932 0.12843 11.11522 -0.02227
summary(V.7)
##
## VAR Estimation Results:
## =========================
## Endogenous variables: Arabica, Robusta
## Deterministic variables: both
## Sample size: 416
## Log Likelihood: -3496.75
## Roots of the characteristic polynomial:
## 0.9593 0.8435 0.8435 0.7787 0.7787 0.6966 0.6966 0.6274 0.6274 0.6102 0.6102 0.5987 0.5756 0.1638
## Call:
## VAR(y = cof1[, 2:3], p = 7, type = "both")
##
##
## Estimation results for equation Arabica:
## ========================================
## Arabica = Arabica.l1 + Robusta.l1 + Arabica.l2 + Robusta.l2 + Arabica.l3 + Robusta.l3 + Arabica.l4 + Robusta.l4 + Arabica.l5 + Robusta.l5 + Arabica.l6 + Robusta.l6 + Arabica.l7 + Robusta.l7 + const + trend
##
## Estimate Std. Error t value Pr(>|t|)
## Arabica.l1 1.150779 0.084136 13.678 <2e-16 ***
## Robusta.l1 0.132064 0.116569 1.133 0.2579
## Arabica.l2 -0.036325 0.131807 -0.276 0.7830
## Robusta.l2 -0.204472 0.194898 -1.049 0.2948
## Arabica.l3 -0.232307 0.134396 -1.729 0.0847 .
## Robusta.l3 0.273024 0.204930 1.332 0.1835
## Arabica.l4 0.090439 0.134762 0.671 0.5025
## Robusta.l4 -0.279599 0.205428 -1.361 0.1743
## Arabica.l5 -0.052212 0.134478 -0.388 0.6980
## Robusta.l5 0.008847 0.203831 0.043 0.9654
## Arabica.l6 0.185130 0.130472 1.419 0.1567
## Robusta.l6 -0.119799 0.193642 -0.619 0.5365
## Arabica.l7 -0.139132 0.084410 -1.648 0.1001
## Robusta.l7 0.196647 0.116618 1.686 0.0925 .
## const 8.097807 6.108936 1.326 0.1857
## trend 0.003104 0.016641 0.187 0.8521
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
##
## Residual standard error: 25.2 on 400 degrees of freedom
## Multiple R-Squared: 0.9461, Adjusted R-squared: 0.9441
## F-statistic: 468.1 on 15 and 400 DF, p-value: < 2.2e-16
##
##
## Estimation results for equation Robusta:
## ========================================
## Robusta = Arabica.l1 + Robusta.l1 + Arabica.l2 + Robusta.l2 + Arabica.l3 + Robusta.l3 + Arabica.l4 + Robusta.l4 + Arabica.l5 + Robusta.l5 + Arabica.l6 + Robusta.l6 + Arabica.l7 + Robusta.l7 + const + trend
##
## Estimate Std. Error t value Pr(>|t|)
## Arabica.l1 -0.04887 0.06093 -0.802 0.4230
## Robusta.l1 1.42130 0.08442 16.835 < 2e-16 ***
## Arabica.l2 0.20041 0.09546 2.099 0.0364 *
## Robusta.l2 -0.65652 0.14115 -4.651 4.49e-06 ***
## Arabica.l3 -0.22869 0.09733 -2.350 0.0193 *
## Robusta.l3 0.29542 0.14842 1.990 0.0472 *
## Arabica.l4 0.07800 0.09760 0.799 0.4247
## Robusta.l4 -0.12070 0.14878 -0.811 0.4177
## Arabica.l5 -0.09070 0.09739 -0.931 0.3523
## Robusta.l5 0.12401 0.14762 0.840 0.4014
## Arabica.l6 0.20479 0.09449 2.167 0.0308 *
## Robusta.l6 -0.24496 0.14024 -1.747 0.0815 .
## Arabica.l7 -0.09932 0.06113 -1.625 0.1050
## Robusta.l7 0.12843 0.08446 1.521 0.1291
## const 11.11522 4.42435 2.512 0.0124 *
## trend -0.02227 0.01205 -1.848 0.0654 .
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
##
## Residual standard error: 18.25 on 400 degrees of freedom
## Multiple R-Squared: 0.9702, Adjusted R-squared: 0.9691
## F-statistic: 867.7 on 15 and 400 DF, p-value: < 2.2e-16
##
##
##
## Covariance matrix of residuals:
## Arabica Robusta
## Arabica 634.9 370.5
## Robusta 370.5 333.0
##
## Correlation matrix of residuals:
## Arabica Robusta
## Arabica 1.0000 0.8057
## Robusta 0.8057 1.0000
wald.test(b=coef(V.7$varresult[[1]]), Sigma=vcov(V.7$varresult[[1]]), Terms=c(2,4,6,8,10,12))
## Wald test:
## ----------
##
## Chi-squared test:
## X2 = 8.6, df = 6, P(> X2) = 0.2
wald.test(b=coef(V.7$varresult[[2]]), Sigma=vcov(V.7$varresult[[2]]), Terms= c(1,3,5,7,9,11))
## Wald test:
## ----------
##
## Chi-squared test:
## X2 = 12.3, df = 6, P(> X2) = 0.056
# use causality function under vars
causality(V.7,'Robusta')
## $Granger
##
## Granger causality H0: Robusta do not Granger-cause Arabica
##
## data: VAR object V.7
## F-Test = 1.2416, df1 = 7, df2 = 800, p-value = 0.2771
##
##
## $Instant
##
## H0: No instantaneous causality between: Robusta and Arabica
##
## data: VAR object V.7
## Chi-squared = 163.76, df = 1, p-value < 2.2e-16
# use causality function under vars
causality(V.7,'Arabica')
## $Granger
##
## Granger causality H0: Arabica do not Granger-cause Robusta
##
## data: VAR object V.7
## F-Test = 1.8527, df1 = 7, df2 = 800, p-value = 0.07446
##
##
## $Instant
##
## H0: No instantaneous causality between: Arabica and Robusta
##
## data: VAR object V.7
## Chi-squared = 163.76, df = 1, p-value < 2.2e-16
# test for var(6) which does not count the cointegration
causality(V.6,'Robusta')
## $Granger
##
## Granger causality H0: Robusta do not Granger-cause Arabica
##
## data: VAR object V.6
## F-Test = 0.85024, df1 = 6, df2 = 806, p-value = 0.5313
##
##
## $Instant
##
## H0: No instantaneous causality between: Robusta and Arabica
##
## data: VAR object V.6
## Chi-squared = 164.49, df = 1, p-value < 2.2e-16
# test for var(6)
causality(V.6,'Arabica')
## $Granger
##
## Granger causality H0: Arabica do not Granger-cause Robusta
##
## data: VAR object V.6
## F-Test = 1.7365, df1 = 6, df2 = 806, p-value = 0.1096
##
##
## $Instant
##
## H0: No instantaneous causality between: Arabica and Robusta
##
## data: VAR object V.6
## Chi-squared = 164.49, df = 1, p-value < 2.2e-16