Four ways to perform a Granger Causality test with R

The Granger causal test is widely used in time series analysis, but the various ways in which it may be performed in R are poorly documented. I published before a function to perform the Toda-Yamamoto bivariate and multivariate versions of the Granger causality test (see at Toda-Yamamoto-Causality-Test). In the few lines that follow, I am going to focus on its standard version, suggesting four different approaches.

For starters, a definition. A time series X is said to Granger cause another time series Y if past values of X and Y predict Y significantly better than past values of Y only. (Granger, 1969, click here for other details). We can test that hypothesis using linear regression modeling and a Wald test for linear restrictions. The Wald test compares the performance of a restricted model for Y, which excludes X, to an unrestricted model for Y, which includes X.

Firstly, let’s create two simulated time series. It is important to note that the data are represented as time series using the base R ts function. Some functions described here do not work well if the data is not in this format. Also note that the Granger causality test has several assumptions. I’m not going to go into that. I just add that it’s based on linear modeling, so besides specific assumptions of the test, the usual linear regression assumptions apply to it.

set.seed(1234)

ts1 <- ts(rnorm(n=5000))
ts2 <- ts(rnorm(n=5000))

Both series are independently drawn from a standard normal distribution with no relationship to one another. We therefore have no reason to expect either series to Granger-cause the other — any significant test result would be a false positive.

?plot
Help on topic 'plot' was found in the following packages:

  Package               Library
  graphics              /Library/Frameworks/R.framework/Versions/4.4-arm64/Resources/library
  base                  /Library/Frameworks/R.framework/Resources/library


Using the first match ...
plot(ts1, main = "Simulated Series 1")

plot(ts2, main = "Simulated Series 2")

First approach: lmtest::grangertest

The first way to perform a Granger causality test in R is to use the grangertest function provided by the lmtest package. We can test if the series ts1 Granger-causes the series ts2 with a simple line of code.

lmtest::grangertest(ts1, ts2, order=3, test="F")
Granger causality test

Model 1: ts2 ~ Lags(ts2, 1:3) + Lags(ts1, 1:3)
Model 2: ts2 ~ Lags(ts2, 1:3)
  Res.Df Df      F Pr(>F)
1   4990                 
2   4993 -3 1.6216 0.1821

The null hypothesis is that ts1 does not Granger-cause ts2 — i.e., lagged values of ts1 provide no additional predictive power for ts2 beyond its own lags. As expected given the data-generating process, we fail to reject the null (p > 0.05).

While grangertest is a quick way to perform a Granger causal test, it is quite rough, as it does not rely on any modelling of the series. It is important to fit an appropriate linear model to the series, also just to check if the assumptions hold.

Second approach: vars::VAR + vars::causality

A commonly used approach is to fit a VAR model — a system of linear regression models in which each variable is regressed on its own lags and the lags of all other variables — using the vars package, and then test Granger causality using the package’s own causality function.

tsDat <- ts.union(ts1, ts2) 
tsVAR <- vars::VAR(tsDat, p = 3)
vars::causality(tsVAR, cause = "ts1")$Granger

    Granger causality H0: ts1 do not Granger-cause ts2

data:  VAR object tsVAR
F-Test = 1.6216, df1 = 3, df2 = 9980, p-value = 0.182

The null hypothesis is that lagged values of ts1 do not jointly improve the forecast of ts2 in the VAR system — equivalently, that all coefficients on lags of ts1 in the equation for ts2 are zero. We fail to reject the null, consistent with the simulated independence of the two series.

The causality function offers several interesting options. For example, it implements a bootstrapping procedure. It is also possible to specify the heteroscedasticity-consistent estimation of the covariance matrix of the coefficient estimates in the regression models (HC), but other types of estimation of the covariance matrix (e.g., HAC) are not supported.

Third approach: dynlm::dynlm + lmtest::waldtest

A more flexible approach makes the logic of the Granger test explicit by fitting the two competing models directly and comparing them with a Wald test.

The Wald test asks: does adding the lagged values of ts1 to the model for ts2 significantly improve fit? To answer this, we fit two models:

  • A restricted model, which predicts ts2 using only its own past lags. This embodies the null hypothesis that ts1 contributes nothing.
  • An unrestricted model, which additionally includes past lags of ts1. This is the alternative — that ts1 does carry predictive information.

The Wald F-test then evaluates whether the improvement in fit from adding those extra terms is larger than we would expect by chance. A significant result (small p-value) means we reject the null and conclude that ts1 Granger-causes ts2. Both models can be fitted with the dynlm package, which provides a convenient lag operator L() for time series regression.

library(dynlm)
unrestricted_ts2 <- dynlm(ts2 ~ L(ts2, 1:3) + L(ts1, 1:3))
restricted_ts2   <- dynlm(ts2 ~ L(ts2, 1:3))

lmtest::waldtest(unrestricted_ts2, restricted_ts2, test="F")                  
Wald test

Model 1: ts2 ~ L(ts2, 1:3) + L(ts1, 1:3)
Model 2: ts2 ~ L(ts2, 1:3)
  Res.Df Df      F Pr(>F)
1   4990                 
2   4993 -3 1.6216 0.1821

The null hypothesis is that the three coefficients on lagged ts1 in the unrestricted model are jointly zero. We fail to reject the null (p > 0.05), consistent with the independence of the two series.

This approach has the advantage that the series can be modeled in a flexible manner before testing Granger’s causality — for instance, by adding a seasonal term or other control variables when necessary.

Moreover, this approach permits the specification of heteroscedasticity-consistent (HC) or heteroscedasticity and autocorrelation consistent standard errors (HAC), also in the Newey-West form, using the functions provided by the sandwich package.

lmtest::waldtest(unrestricted_ts2, restricted_ts2, test="F", vcov = sandwich::vcovHC) 
Wald test

Model 1: ts2 ~ L(ts2, 1:3) + L(ts1, 1:3)
Model 2: ts2 ~ L(ts2, 1:3)
  Res.Df Df      F Pr(>F)
1   4990                 
2   4993 -3 1.6298 0.1802
lmtest::waldtest(unrestricted_ts2, restricted_ts2, test="F", vcov = sandwich::vcovHAC) 
Wald test

Model 1: ts2 ~ L(ts2, 1:3) + L(ts1, 1:3)
Model 2: ts2 ~ L(ts2, 1:3)
  Res.Df Df      F Pr(>F)
1   4990                 
2   4993 -3 1.6577 0.1739
lmtest::waldtest(unrestricted_ts2, restricted_ts2, test="F", vcov = sandwich::NeweyWest) 
Wald test

Model 1: ts2 ~ L(ts2, 1:3) + L(ts1, 1:3)
Model 2: ts2 ~ L(ts2, 1:3)
  Res.Df Df      F Pr(>F)
1   4990                 
2   4993 -3 1.6737 0.1704

Across all three covariance specifications, we fail to reject the null, reinforcing the conclusion that no Granger causal relationship exists between the two series.

Fourth approach: dynlm::dynlm + aod::wald.test

The fourth approach is a variant of the third. A Wald test can also be performed with the wald.test function in the aod package. Rather than comparing two fitted model objects, this function operates directly on the coefficient vector and covariance matrix of the unrestricted model, requiring you to manually specify which coefficients to test via the Terms argument.

aod::wald.test(b=coef(unrestricted_ts2), 
               Sigma=vcov(unrestricted_ts2),
               Terms = 5:7,
               verbose = T)
Wald test:
----------

Coefficients:
 (Intercept) L(ts2, 1:3)1 L(ts2, 1:3)2 L(ts2, 1:3)3 L(ts1, 1:3)1 L(ts1, 1:3)2 
      0.0191      -0.0037       0.0120      -0.0063       0.0085       0.0294 
L(ts1, 1:3)3 
      0.0043 

Var-cov matrix of the coefficients:
             (Intercept) L(ts2, 1:3)1 L(ts2, 1:3)2 L(ts2, 1:3)3 L(ts1, 1:3)1
(Intercept)   1.9e-04    -3.8e-06     -3.8e-06     -3.7e-06      1.3e-06    
L(ts2, 1:3)1 -3.8e-06     2.0e-04      6.4e-07     -2.4e-06     -6.8e-06    
L(ts2, 1:3)2 -3.8e-06     6.4e-07      2.0e-04      6.0e-07      2.5e-06    
L(ts2, 1:3)3 -3.7e-06    -2.4e-06      6.0e-07      2.0e-04     -1.5e-08    
L(ts1, 1:3)1  1.3e-06    -6.8e-06      2.5e-06     -1.5e-08      2.0e-04    
L(ts1, 1:3)2  1.3e-06    -1.7e-06     -6.8e-06      2.5e-06     -1.5e-06    
L(ts1, 1:3)3  1.5e-06    -5.9e-06     -1.7e-06     -6.8e-06     -1.4e-06    
             L(ts1, 1:3)2 L(ts1, 1:3)3
(Intercept)   1.3e-06      1.5e-06    
L(ts2, 1:3)1 -1.7e-06     -5.9e-06    
L(ts2, 1:3)2 -6.8e-06     -1.7e-06    
L(ts2, 1:3)3  2.5e-06     -6.8e-06    
L(ts1, 1:3)1 -1.5e-06     -1.4e-06    
L(ts1, 1:3)2  2.0e-04     -1.5e-06    
L(ts1, 1:3)3 -1.5e-06      2.0e-04    

Test-design matrix:
   (Intercept) L(ts2, 1:3)1 L(ts2, 1:3)2 L(ts2, 1:3)3 L(ts1, 1:3)1 L(ts1, 1:3)2
L1           0            0            0            0            1            0
L2           0            0            0            0            0            1
L3           0            0            0            0            0            0
   L(ts1, 1:3)3
L1            0
L2            0
L3            1

Positions of tested coefficients in the vector of coefficients: 5, 6, 7 

H0:  L(ts1, 1:3)1 = 0; L(ts1, 1:3)2 = 0; L(ts1, 1:3)3 = 0 

Chi-squared test:
X2 = 4.9, df = 3, P(> X2) = 0.18

Here, Terms = 5:7 selects the three coefficients on lagged ts1 in the unrestricted model. The null hypothesis is identical to the previous approach: these coefficients are jointly zero. By default, wald.test reports a Chi-square statistic rather than an F statistic (the Chi-squared version can also be obtained from waldtest in lmtest by setting test="Chisq", or via the linearHypothesis function in the car package). We again fail to reject the null.

This extra flexibility — working directly with coefficients rather than model objects — can be useful in cases where the model structure does not fit neatly into the two-model comparison framework of waldtest.

A critical caveat: stationarity and spurious Granger causality

The four approaches above all behaved as expected because the series were stationary white noise — they had no trend, no unit root, and no shared dynamics. In practice, many economic and social time series are random walks (integrated of order 1, or I(1)), and applying the standard Granger test to such series in levels is known to produce spurious results. This section illustrates exactly that failure mode, and shows how differencing resolves it.

Independent random walks → spurious Granger causality

We simulate two independent random walks: each is simply the cumulative sum of white noise innovations, with no connection to the other.

set.seed(1234)
rw1 <- ts(cumsum(rnorm(n = 5000)))
rw2 <- ts(cumsum(rnorm(n = 5000)))

Plotting the two series immediately reveals the problem: both wander over time, and by chance their long-run trajectories may move together or in opposite directions for extended periods, even though they share no causal relationship.

plot(rw1, main = "Random Walk 1 (levels)", ylab = "Value")

plot(rw2, main = "Random Walk 2 (levels)", ylab = "Value")

Despite being completely independent by construction, the Granger test on the levels of these series will frequently — and incorrectly — reject the null hypothesis. This is spurious Granger causality: the test picks up the shared non-stationarity of the series rather than any genuine predictive relationship.

lmtest::grangertest(rw1, rw2, order = 3, test = "F")
Granger causality test

Model 1: rw2 ~ Lags(rw2, 1:3) + Lags(rw1, 1:3)
Model 2: rw2 ~ Lags(rw2, 1:3)
  Res.Df Df      F   Pr(>F)   
1   4990                      
2   4993 -3 4.7315 0.002675 **
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

The null hypothesis is that rw1 does not Granger-cause rw2. Here we reject the null (p < 0.05) — but this is a false positive. The two series are independent by construction; the significant result is an artefact of applying a test that assumes stationarity to non-stationary data.

Differenced random walks → no Granger causality

The standard remedy for I(1) series is to first-difference them before testing. First-differencing a random walk recovers the underlying white noise innovations, which are stationary.

rw1_diff <- diff(rw1)
rw2_diff <- diff(rw2)

The differenced series now look as expected from white noise — flat, with no persistent drift or trend.

plot(rw1_diff, main = "Random Walk 1 (first-differenced)", ylab = "Value")

plot(rw2_diff, main = "Random Walk 2 (first-differenced)", ylab = "Value")

Running the Granger test on the differenced series, we now correctly fail to reject the null.

lmtest::grangertest(rw1_diff, rw2_diff, order = 3, test = "F")
Granger causality test

Model 1: rw2_diff ~ Lags(rw2_diff, 1:3) + Lags(rw1_diff, 1:3)
Model 2: rw2_diff ~ Lags(rw2_diff, 1:3)
  Res.Df Df      F Pr(>F)
1   4989                 
2   4992 -3 1.6242 0.1815

The null hypothesis is again that rw1_diff does not Granger-cause rw2_diff. We fail to reject the null (p > 0.05) — the correct result given that the two series are independent. Differencing has removed the shared non-stationarity that was generating the spurious finding.

Key takeaway: Before applying any Granger causality test, check whether your series are stationary (e.g., using an ADF or KPSS test). If they are I(1), difference them first. Note that if the series are cointegrated, differencing throws away important long-run information and a Vector Error Correction Model (VECM) framework is more appropriate — but that is beyond the scope of this document.

Appendix

https://github.com/nicolarighetti/Granger-causality-test-with-R/blob/main/Granger-test-with-R.md