We can generate a random walk in R with a single command (the length 2049 is chosen so that the differenced series has dyadic length 2^11 = 2048, which the wavelet test used later requires):
random_walk <- cumsum(rnorm(2049))
The following code defines a function that simulates TVAR (time-varying autoregressive) processes:
# generates a time-varying AR(1) process with AR coefficients ~ Uniform(1 - w/2, 1 + w/2)
tv_ar <- function(n, w = 0.2)
{
  # n = length of series, w = width of the coefficient window around the unit root
  # Gaussian innovations
  e <- rnorm(n, 0, 1)
  # time-varying AR coefficients, uniform in a window of width w centered at 1
  a <- (1 - w/2) + runif(n, 0, w)
  # x[1] = 0, since the vector is initialized with zeros
  x <- vector("double", length = n)
  # AR(1) recursion with time-varying coefficient
  for (t in 2:n)
  {
    x[t] <- a[t-1]*x[t-1] + e[t]
  }
  x
}
To simulate a TVAR process of length 2049 with the default window width w = 0.2 around the unit root, we simply execute:
TVAR <- tv_ar(2049)
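As a sanity check on the construction: with w = 0 every AR coefficient equals exactly 1, so the same recursion collapses to a plain random walk. A minimal sketch (rw_check is my own name):
# with w = 0, a[t] = 1 for all t, so this is a random walk started at 0
rw_check <- tv_ar(2049, w = 0)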
Below you can see a sample path of each process.
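The plots can be reproduced with something like the following sketch (the panel titles are my own):
# sample paths of the two simulated processes
par(mfrow = c(2, 1))
plot(random_walk, type = "l", main = "random_walk")
plot(TVAR, type = "l", main = "TVAR")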
These are the autocorrelation estimates for both series (top row) as well as for their first differences (bottom row):
par(mfrow=c(2,2))
acf(random_walk)
acf(TVAR)
acf(diff(random_walk))
acf(diff(TVAR))
Now let's run some regressions: we regress each series on its own values lagged by one period.
reg1 <-lm(random_walk[length(random_walk):2] ~ random_walk[(length(random_walk)-1):1])
reg2 <-lm(TVAR[length(TVAR):2] ~ TVAR[(length(TVAR)-1):1])
summary(reg1)
##
## Call:
## lm(formula = random_walk[length(random_walk):2] ~ random_walk[(length(random_walk) -
## 1):1])
##
## Residuals:
## Min 1Q Median 3Q Max
## -2.9763 -0.6745 -0.0201 0.6917 3.7468
##
## Coefficients:
## Estimate Std. Error t value
## (Intercept) 0.004640 0.022128 0.21
## random_walk[(length(random_walk) - 1):1] 0.998003 0.001652 603.98
## Pr(>|t|)
## (Intercept) 0.834
## random_walk[(length(random_walk) - 1):1] <2e-16 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 0.9836 on 2046 degrees of freedom
## Multiple R-squared: 0.9944, Adjusted R-squared: 0.9944
## F-statistic: 3.648e+05 on 1 and 2046 DF, p-value: < 2.2e-16
The slope is very close to one, consistent with the random walk hypothesis.
summary(reg2)
##
## Call:
## lm(formula = TVAR[length(TVAR):2] ~ TVAR[(length(TVAR) - 1):1])
##
## Residuals:
## Min 1Q Median 3Q Max
## -12.3086 -1.2392 -0.0149 1.2954 10.8956
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 0.098121 0.078777 1.246 0.213
## TVAR[(length(TVAR) - 1):1] 0.996405 0.001895 525.673 <2e-16 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 2.632 on 2046 degrees of freedom
## Multiple R-squared: 0.9927, Adjusted R-squared: 0.9926
## F-statistic: 2.763e+05 on 1 and 2046 DF, p-value: < 2.2e-16
The slope is again very close to one.
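One way to make "very close to one" concrete is to look at confidence intervals for the slopes; a sketch (the exact bounds depend on the simulated draws):
# 95% confidence intervals for intercept and slope; given the estimates
# and standard errors reported above, both slope intervals contain 1
confint(reg1)
confint(reg2)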
Let's run Augmented Dickey-Fuller tests:
library(tseries)
adf.test(random_walk,k=1)
##
## Augmented Dickey-Fuller Test
##
## data: random_walk
## Dickey-Fuller = -1.2995, Lag order = 1, p-value = 0.8749
## alternative hypothesis: stationary
and for the second series
adf.test(TVAR,k=1)
##
## Augmented Dickey-Fuller Test
##
## data: TVAR
## Dickey-Fuller = -1.6986, Lag order = 1, p-value = 0.7059
## alternative hypothesis: stationary
In both cases, the unit root hypothesis was NOT rejected.
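Fixing k = 1 is quite restrictive. As a robustness sketch, one could also let adf.test pick its default lag order (the conclusions for any particular draw may of course differ):
# default lag order is trunc((n - 1)^(1/3)), i.e. 12 for n = 2049
adf.test(random_walk)
adf.test(TVAR)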
Let's see what the periodograms look like.
par(mfrow=c(2,2))
spec.pgram(random_walk)
spec.pgram(TVAR)
spec.pgram(diff(random_walk))
spec.pgram(diff(TVAR))
The periodograms look very similar, and those of the first differences look very much like white noise: we don't see any dominant frequencies.
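Raw periodograms are noisy; smoothing makes the flatness of the differenced series easier to judge. A sketch (the span values are arbitrary choices):
# smoothed periodograms of the differences, modified Daniell kernel
par(mfrow = c(1, 2))
spec.pgram(diff(random_walk), spans = c(11, 11))
spec.pgram(diff(TVAR), spans = c(11, 11))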
But remember that ADF tests have low power. Let's try the KPSS test, whose null hypothesis is stationarity. Rejection of that null could then be viewed as convincing evidence in favor of a unit root (or could it?).
require(tseries)
kpss.test(random_walk)
## Warning in kpss.test(random_walk): p-value smaller than printed p-value
##
## KPSS Test for Level Stationarity
##
## data: random_walk
## KPSS Level = 3.357, Truncation lag parameter = 8, p-value = 0.01
kpss.test(TVAR)
## Warning in kpss.test(TVAR): p-value smaller than printed p-value
##
## KPSS Test for Level Stationarity
##
## data: TVAR
## KPSS Level = 5.6447, Truncation lag parameter = 8, p-value = 0.01
KPSS rejects the null of stationarity for both series…
Let's see whether a plot could help: a scatter plot of consecutive observations for each series, together with the corresponding regression lines.
plot(random_walk[length(random_walk):2] ~ random_walk[(length(random_walk)-1):1],col=3)
abline(reg1)
plot(TVAR[length(TVAR):2] ~ TVAR[(length(TVAR)-1):1],col=4)
abline(reg2)
So far, everything seems to suggest that both series are random walks.
To sum up:

- The ACFs show no autocorrelation in the differenced series
- The periodograms of the differenced series look (somewhat) like white noise
- The regressions have slopes very close to 1
- Both unit root tests are compatible with the random walk hypothesis
But I’ve got more tests for you.
Box.test(diff(random_walk), type = "Box-Pierce")
##
##  Box-Pierce test
##
## data:  diff(random_walk)
## X-squared = 0.65167, df = 1, p-value = 0.4195
Box.test(diff(TVAR), type = "Box-Pierce")
##
##  Box-Pierce test
##
## data:  diff(TVAR)
## X-squared = 0.10883, df = 1, p-value = 0.7415
Box.test(diff(random_walk), type = "Ljung-Box")
##
##  Box-Ljung test
##
## data:  diff(random_walk)
## X-squared = 0.65263, df = 1, p-value = 0.4192
Box.test(diff(TVAR), type = "Ljung-Box")
##
##  Box-Ljung test
##
## data:  diff(TVAR)
## X-squared = 0.10899, df = 1, p-value = 0.7413
The portmanteau tests (Box-Pierce, Ljung-Box) do not reject the null of no serial correlation in the differenced series (the alternative being serially correlated data).
Hence, everything is aligned with the preceding tests.
The differenced series of the second process has a noticeably higher variance.
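A quick numeric check of this claim; a sketch (the exact values depend on the random draw):
# sample variances of the two differenced series; from the residual
# standard errors reported above, the TVAR differences should be much larger
var(diff(random_walk))
var(diff(TVAR))
The density plots below, each with a normal overlay, make the same point visually: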
plot(density(diff(random_walk)))
density_series_1 <- density(diff(random_walk))
lines(density_series_1$x, dnorm(density_series_1$x),col=3)
plot(density(diff(TVAR)))
density_series_2 <- density(diff(TVAR))
lines(density_series_2$x, dnorm(density_series_2$x, mean(diff(TVAR)), sd(diff(TVAR))),col=4)
But now look at this:
par(mfrow=c(2,4))
plot(reg1)
plot(reg2)
Fat tails in the second series, scale-location and residuals-vs-leverage lines that are not flat, heteroscedastic residuals, and so on.
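The heteroscedasticity visible in these diagnostics can also be checked formally, for instance with a Breusch-Pagan test from the lmtest package; a sketch (lmtest is not used elsewhere in this post, and the outcome stated in the comment is an expectation, not a verified result):
library(lmtest)
# studentized Breusch-Pagan test for heteroscedastic residuals;
# one would expect reg2 to be flagged rather than reg1
bptest(reg1)
bptest(reg2)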
Note the structure in the regression residuals of the second series; those of the first look entirely like white noise.
par(mfrow=c(2,1))
plot(residuals(reg1),type="l")
plot(residuals(reg2),type="l")
Similarly, there is no structure in the differences of the first series, but the same cannot be said of the second.
par(mfrow=c(2,1))
plot(diff(random_walk),type="l")
plot(diff(TVAR),type="l")
And here is a plot of the squares of the differenced series: an indication of how much “power” each observation contributes to the variance.
“Power” is allocated uniformly over time in the first series, but not in the second.
par(mfrow=c(2,1))
plot(diff(random_walk)^2,type="l")
plot(diff(TVAR)^2,type="l")
And guess what? The squared differences are autocorrelated in the second series but not in the first.
par(mfrow=c(2,1))
acf(diff(random_walk)^2)
acf(diff(TVAR)^2)
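A formal counterpart to these ACF plots is a portmanteau test applied to the squared differences, in the spirit of the McLeod-Li test; a sketch (the lag of 10 is an arbitrary choice, and the p-values depend on the draw):
# Ljung-Box test on squared differences: a standard check for
# ARCH-type dependence that portmanteau tests on the levels miss
Box.test(diff(random_walk)^2, lag = 10, type = "Ljung-Box")
Box.test(diff(TVAR)^2, lag = 10, type = "Ljung-Box")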
Note that the Priestley-Subba Rao stationarity test (https://www.jstor.org/stable/2984336) can detect that the second series is not stationary. That is because, unlike the previous tests, it does NOT ignore the local structure of the series.
library("fractal")
## Loading required package: splus2R
## Loading required package: ifultools
##
## Attaching package: 'fractal'
## The following object is masked from 'package:tseries':
##
## surrogate
library("astsa")
stationarity(TVAR)
## N = 2049, nblock = 11, n_block_max = 187, dt = 1.0000
## Priestley-Subba Rao stationarity Test for TVAR
## ----------------------------------------------
## Samples used : 2049
## Samples available : 2046
## Sampling interval : 1
## SDF estimator : Multitaper
## Number of (sine) tapers : 5
## Centered : TRUE
## Recentered : FALSE
## Number of blocks : 11
## Block size : 186
## Number of blocks : 11
## p-value for T : 0
## p-value for I+R : 0.2372223
## p-value for T+I+R : 0
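One could also apply the same test to the differenced series. A hedged sketch (not run here): one would expect no rejection for the white-noise differences of the random walk, but a rejection for the TVAR differences, whose variance changes over time.
# PSR test on the differences (expected outcomes, not verified here)
stationarity(diff(random_walk))
stationarity(diff(TVAR))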
Stationarity is also rejected by the von Sachs and Neumann (2000) wavelet stationarity test, which likewise does NOT ignore local structure.
library("locits")
## Loading required package: wavethresh
## Loading required package: MASS
## WaveThresh: R wavelet software, release 4.6.8, installed
## Copyright Guy Nason and others 1993-2016
## Note: nlevels has been renamed to nlevelsWT
## Loading required package: igraph
##
## Attaching package: 'igraph'
## The following objects are masked from 'package:stats':
##
## decompose, spectrum
## The following object is masked from 'package:base':
##
## union
ans <- hwtos2(diff(TVAR))
## 10 9 8 7 6 5 4 3
plot(ans)
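For comparison, one could run the same wavelet test on the differenced random walk; since that series is white noise, one would expect (up to the size of the test) no rejections. A sketch (ans_rw is my own name, and the outcome is an expectation, not a verified output):
# same test on the differenced random walk; diff(random_walk) has
# length 2048 = 2^11, the dyadic sample size hwtos2 requires
ans_rw <- hwtos2(diff(random_walk))
plot(ans_rw)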
So the first series is indeed a Gaussian random walk. But what about the second? It is actually a time-varying autoregressive process of order 1, with AR coefficients drawn randomly from a uniform distribution around the unit root.
If you do the maths, you won't find the results surprising. But what is worth highlighting is that global tests (unit root tests, portmanteau tests, and so on) often perform terribly here: not only are they inappropriate for testing stationarity, they can also generate too many false positives.