Unit Roots and Long Memory

The packages to load into the session are:

library(mFilter)
library(tseries)
## Registered S3 method overwritten by 'quantmod':
##   method            from
##   as.zoo.data.frame zoo
library(fUnitRoots)
library(forecast)
library(LongMemoryTS)
library(arfima)
## Loading required package: ltsa
## Note that the arfima package has new defaults starting with
## 1.4-0: type arfimachanges() for a list, as well as some other notes.
## NOTE: some of these are quite important!
## 
## Attaching package: 'arfima'
## The following object is masked from 'package:forecast':
## 
##     arfima
## The following object is masked from 'package:stats':
## 
##     BIC
library(fracdiff)
library(UnitStat)
## Loading required package: lmtest
## Loading required package: zoo
## 
## Attaching package: 'zoo'
## The following objects are masked from 'package:base':
## 
##     as.Date, as.Date.numeric
library(urca)
## 
## Attaching package: 'urca'
## The following objects are masked from 'package:fUnitRoots':
## 
##     punitroot, qunitroot, unitrootTable

We will analyze Argentina's annual per-capita GDP series between 1935 and 2014.

pbi <- read.csv("C:/Users/jose.louis/OneDrive/Documents/1.-Josi/Econometria/Econometria II/Clases 4 y 5 - Filtros, Raíces unitarias, Memoria larga/pbipc_usd2010.csv")
pbi.ts <- ts(pbi[,2], start=c(1935),freq=1)

Plot:

plot(log(pbi.ts),xlab="",ylab="(logs)", cex.main=0.8,
     main="Argentina \n PBI por habitante \n Dolares constantes de 2010")

Part One: Unit Roots

Method based on the AIC. We must choose between a drift, a trend, both, or neither (none).

deriva <- ur.df(log(pbi.ts), type = "drift", selectlags = "AIC")
tendencia  <- ur.df(log(pbi.ts), type = "trend", selectlags = "AIC")
summary(deriva)
## 
## ############################################### 
## # Augmented Dickey-Fuller Test Unit Root Test # 
## ############################################### 
## 
## Test regression drift 
## 
## 
## Call:
## lm(formula = z.diff ~ z.lag.1 + 1 + z.diff.lag)
## 
## Residuals:
##      Min       1Q   Median       3Q      Max 
## -0.95415 -0.12753  0.01967  0.14102  0.73533 
## 
## Coefficients:
##             Estimate Std. Error t value Pr(>|t|)  
## (Intercept)  1.31211    0.51005   2.573   0.0121 *
## z.lag.1     -0.15201    0.05944  -2.558   0.0126 *
## z.diff.lag   0.11867    0.11473   1.034   0.3043  
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 0.2627 on 75 degrees of freedom
## Multiple R-squared:  0.08187,    Adjusted R-squared:  0.05739 
## F-statistic: 3.344 on 2 and 75 DF,  p-value: 0.04063
## 
## 
## Value of test-statistic is: -2.5576 3.3252 
## 
## Critical values for test statistics: 
##       1pct  5pct 10pct
## tau2 -3.51 -2.89 -2.58
## phi1  6.70  4.71  3.86

The critical values correspond to:

- tau2 = the null hypothesis that gamma = 0
- phi1 = the joint hypothesis that alpha (the intercept, or drift) and gamma are both equal to zero. If it is rejected, then either gamma or alpha is different from zero.

Since the tau2 statistic (-2.5576) is smaller in absolute value than the critical values, the null hypothesis is not rejected and we conclude that there is a unit root. The phi1 statistic (3.3252) also fails to reject its joint null, since it is below the critical value (4.71 at 5%). Failing to reject the joint hypothesis means that neither coefficient is significantly different from zero: gamma = 0 (there is a unit root) and alpha = 0 (there is no drift). In this case, therefore, we have a unit root but no drift.

The coefficient z.lag.1 corresponds to the lagged level (gamma in the slides). In this case, it is statistically significant. The coefficient z.diff.lag corresponds to the lagged difference (the beta_i, for i = 2 up to i = p, in the slides). In this case, it is not significant.
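
As a check, the drift regression that ur.df() fits can be replicated by hand with lm(). This is only a sketch (the objects y, N, dy and adf.drift are created here purely for illustration), assuming the single lagged difference selected by the AIC above:

y  <- log(pbi.ts)
N  <- length(y)
dy <- diff(y)
# regress the difference on the lagged level and one lagged difference
adf.drift <- lm(dy[2:(N-1)] ~ y[2:(N-1)] + dy[1:(N-2)])
summary(adf.drift)  # the t value on the lagged level reproduces the tau2 statistic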

summary(tendencia)
## 
## ############################################### 
## # Augmented Dickey-Fuller Test Unit Root Test # 
## ############################################### 
## 
## Test regression trend 
## 
## 
## Call:
## lm(formula = z.diff ~ z.lag.1 + 1 + tt + z.diff.lag)
## 
## Residuals:
##      Min       1Q   Median       3Q      Max 
## -0.99683 -0.10977  0.02778  0.11723  0.69596 
## 
## Coefficients:
##              Estimate Std. Error t value Pr(>|t|)    
## (Intercept)  2.931842   0.697891   4.201 7.33e-05 ***
## z.lag.1     -0.370829   0.088359  -4.197 7.44e-05 ***
## tt           0.006293   0.001964   3.204   0.0020 ** 
## z.diff.lag   0.227676   0.113454   2.007   0.0484 *  
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 0.2478 on 74 degrees of freedom
## Multiple R-squared:  0.1937, Adjusted R-squared:  0.1611 
## F-statistic: 5.927 on 3 and 74 DF,  p-value: 0.001112
## 
## 
## Value of test-statistic is: -4.1968 5.9133 8.8086 
## 
## Critical values for test statistics: 
##       1pct  5pct 10pct
## tau3 -4.04 -3.45 -3.15
## phi2  6.50  4.88  4.16
## phi3  8.73  6.49  5.47

Here we have tau3 and two phi statistics, phi2 and phi3:

- tau3 is, like tau2 before, the hypothesis that gamma = 0 (unit root)
- phi2 is the joint hypothesis that gamma = 0, the coefficient on the deterministic trend is zero, and the drift is zero
- phi3 is the joint hypothesis that gamma = 0 and the coefficient on the deterministic trend is zero

As before, rejecting phi2 or phi3 leads us to accept that at least one coefficient is different from zero. The coefficient tt corresponds to the deterministic trend (beta.t in the third equation of the slides).

In this case, there is NO unit root, since the tau3 statistic exceeds the 5% critical value in absolute value (|-4.1968| > |-3.45|).
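
Both the statistics and the critical values printed by summary() can be read directly from the slots of the ur.df object; the helper function used next relies on exactly these slots:

tendencia@teststat  # tau3, phi2 and phi3
tendencia@cval      # their 1%, 5% and 10% critical values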

BUT, if you run the code in "interpretacion_urdf_funcion.R", you will be able to use the function interp_urdf(), which is applied to an object returned by ur.df(). This function helps interpret the results. Original source: https://gist.github.com/hankroark/968fc28b767f1e43b5a33b151b771bf9

interp_urdf(deriva)
## ========================================================================
## At the 5pct level:
## The model is of type drift
## tau2: The first null hypothesis is not rejected, unit root is present
## phi1: The second null hypothesis is not rejected, unit root is present
##       and there is no drift.
## ========================================================================

As we had already seen, there is a unit root with no drift.

interp_urdf(tendencia)
## ========================================================================
## At the 5pct level:
## The model is of type trend
## tau3: The first null hypothesis is rejected, unit root is not present
## phi3: The second null hypothesis is rejected, unit root is not present
##       and there may or may not be trend
## Warning in interp_urdf(tendencia): Presence of trend is inconclusive.
## phi2: The third null hypothesis is rejected, unit root is not present
##       there may or may not be trend, and there may or may not be drift
## Warning in interp_urdf(tendencia): Presence of trend and drift is inconclusive.
## ========================================================================

As we can see, there is no unit root, but we cannot conclude whether or not the drift and the trend are equal to zero.

Let us run this function on the first differences of the series:

interp_urdf(ur.df(diff(log(pbi.ts)),type = 'drift', selectlags = 'AIC'))
## ========================================================================
## At the 5pct level:
## The model is of type drift
## tau2: The first null hypothesis is rejected, unit root is not present
## phi1: The second null hypothesis is rejected, unit root is not present
##       and there is drift.
## ========================================================================

In the model on the first differences, there is no unit root and there is a drift.

interp_urdf(ur.df(diff(log(pbi.ts)),type = 'trend', selectlags = 'AIC'))
## ========================================================================
## At the 5pct level:
## The model is of type trend
## tau3: The first null hypothesis is rejected, unit root is not present
## phi3: The second null hypothesis is rejected, unit root is not present
##       and there may or may not be trend
## Warning in interp_urdf(ur.df(diff(log(pbi.ts)), type = "trend", selectlags =
## "AIC")): Presence of trend is inconclusive.
## phi2: The third null hypothesis is rejected, unit root is not present
##       there may or may not be trend, and there may or may not be drift
## Warning in interp_urdf(ur.df(diff(log(pbi.ts)), type = "trend", selectlags =
## "AIC")): Presence of trend and drift is inconclusive.
## ========================================================================

As before, there is no unit root in the general model on the first differences, and the test is inconclusive about the presence of a drift and a trend.
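
As an extra cross-check (not part of the original script), forecast::ndiffs() applies a unit-root test recursively and suggests how many differences are needed to reach stationarity; given the drift-model results above, we would expect it to suggest one difference:

forecast::ndiffs(log(pbi.ts), test = "adf")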

According to the Dolado, Jenkinson and Sosvilla-Rivero strategy, one should start from the model with a trend (R includes the drift automatically) and use a generous number of lags; here, however, the function selects the number of lags automatically via the AIC. If the result does not reject the null hypothesis, the next step is to run the model with drift only.
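
A rough sketch of that sequence is given below (djsr_adf() is a hypothetical helper written here for illustration; it only automates the tau comparisons, while the full strategy also examines the phi statistics, as interp_urdf() does):

djsr_adf <- function(y, level = "5pct") {
  # go from the most general model (trend) to the most restricted (none),
  # stopping as soon as the unit-root null is rejected
  for (model_type in c("trend", "drift", "none")) {
    test <- ur.df(y, type = model_type, selectlags = "AIC")
    tau  <- test@teststat[1, 1]   # the tau statistic comes first in the slot
    cv   <- test@cval[1, level]   # its critical value at the chosen level
    if (tau < cv)                 # left-tailed test: reject if tau < critical value
      return(list(model = model_type, unit.root = FALSE, tau = tau, cval = cv))
  }
  list(model = "none", unit.root = TRUE, tau = tau, cval = cv)
}
djsr_adf(log(pbi.ts))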

The aTSA package presents the three models together:

aTSA::adf.test(log(pbi.ts), nlag = 2)
## Augmented Dickey-Fuller Test 
## alternative: stationary 
##  
## Type 1: no drift no trend 
##      lag   ADF p.value
## [1,]   0 0.216   0.703
## [2,]   1 0.174   0.691
## Type 2: with drift no trend 
##      lag   ADF p.value
## [1,]   0 -2.38   0.181
## [2,]   1 -2.56   0.114
## Type 3: with drift and trend 
##      lag   ADF p.value
## [1,]   0 -3.64  0.0352
## [2,]   1 -4.20  0.0100
## ---- 
## Note: in fact, p.value = 0.01 means p.value <= 0.01
summary(ur.pp(log(pbi.ts), model = "constant", lags = "short"))
## 
## ################################## 
## # Phillips-Perron Unit Root Test # 
## ################################## 
## 
## Test regression with intercept 
## 
## 
## Call:
## lm(formula = y ~ y.l1)
## 
## Residuals:
##      Min       1Q   Median       3Q      Max 
## -0.97668 -0.12982  0.02976  0.13801  0.67444 
## 
## Coefficients:
##             Estimate Std. Error t value Pr(>|t|)    
## (Intercept)  1.17011    0.48701   2.403   0.0187 *  
## y.l1         0.86468    0.05674  15.238   <2e-16 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 0.2611 on 77 degrees of freedom
## Multiple R-squared:  0.751,  Adjusted R-squared:  0.7477 
## F-statistic: 232.2 on 1 and 77 DF,  p-value: < 2.2e-16
## 
## 
## Value of test-statistic, type: Z-alpha  is: -11.0858 
## 
##          aux. Z statistics
## Z-tau-mu             2.443

Other well-known tests are:

- The Elliott, Rothenberg and Stock (ERS) test
- The KPSS test, whose null hypothesis is stationarity

summary(ur.ers(log(pbi.ts), type = "P-test", model = "constant"))
## 
## ############################################### 
## # Elliot, Rothenberg and Stock Unit Root Test # 
## ############################################### 
## 
## Test of type P-test 
## detrending of series with intercept 
## 
## Value of test-statistic is: 3.1595 
## 
## Critical values of P-test are:
##                 1pct 5pct 10pct
## critical values 1.95 3.11  4.17

Since the test statistic (3.1595) is greater than the 5% critical value (3.11), we cannot reject the null hypothesis of a unit root at the 5% level (although it would be rejected at 10%, since 3.1595 < 4.17).

summary(ur.kpss(log(pbi.ts), type = "mu", lags = "short"))
## 
## ####################### 
## # KPSS Unit Root Test # 
## ####################### 
## 
## Test is of type: mu with 3 lags. 
## 
## Value of test-statistic is: 1.4506 
## 
## Critical value for a significance level of: 
##                 10pct  5pct 2.5pct  1pct
## critical values 0.347 0.463  0.574 0.739

Since the KPSS statistic (1.4506) exceeds all of its critical values, the null hypothesis of stationarity is rejected, which is consistent with the unit-root findings above.

The UnitStat package has a single function, UnitStat(), which provides a test for analyzing the stability of a stochastic process, that is, for detecting whether the series is stationary or not. It evaluates up to four lags under assumptions with and without drift and with and without a deterministic trend.

UnitStat(pbi.ts)

## $All_Results
##     Lags  Equation type T-test phi1/phi2/phi3 Assumption Outcome     Tau Result
## 1  Lag 1           None  -1.20             na  Assumption passed non-stationary
## 2  Lag 1 Drift equation  -3.14           4.96  Assumption passed     stationary
## 3  Lag 1 Trend equation  -4.68           7.31  Assumption passed     stationary
## 4  Lag 1 Trend equation  -4.68          10.94  Assumption passed     stationary
## 5  Lag 2           None  -0.79             na  Assumption passed non-stationary
## 6  Lag 2 Drift equation  -2.49           3.83  Assumption passed non-stationary
## 7  Lag 2 Trend equation  -3.97           5.41  Assumption passed     stationary
## 8  Lag 2 Trend equation  -3.97            7.2  Assumption passed     stationary
## 9  Lag 3           None  -0.78             na  Assumption passed non-stationary
## 10 Lag 3 Drift equation  -2.64           3.05  Assumption passed non-stationary
## 11 Lag 3 Trend equation  -4.37           5.03  Assumption passed     stationary
## 12 Lag 3 Trend equation  -4.37           6.27  Assumption passed     stationary
## 13 Lag 4           None  -0.60             na  Assumption passed non-stationary
## 14 Lag 4 Drift equation  -2.42           2.41  Assumption passed non-stationary
## 15 Lag 4 Trend equation  -4.32           4.31  Assumption passed     stationary
## 16 Lag 4 Trend equation  -4.32           5.16  Assumption passed     stationary
##                        Phi Result
## 1  No deterministic drift & trend
## 2                   drift present
## 3           drift & trend present
## 4                   trend present
## 5  No deterministic drift & trend
## 6             insignificant drift
## 7           drift & trend present
## 8                   trend present
## 9  No deterministic drift & trend
## 10            insignificant drift
## 11          drift & trend present
## 12            insignificant trend
## 13 No deterministic drift & trend
## 14            insignificant drift
## 15    insignificant drift & trend
## 16            insignificant trend
## 
## $Critical_values
##  tau1  tau2  phi1  tau3  phi2  phi3 
## -1.95 -2.89  4.71 -3.45  4.88  6.49
dev.off()
## null device 
##           1

Part Two: Long Memory

Highlight (copy) from here:

## vs.test calculates the statistic of the modified V/S test
##
## x: time series
## q: number of lags included for calculation of covariances
##
## significance level: 0.01, 0.05, 0.1
## critical value: 0.2685, 0.1869, 0.1518
##
## References: Giraitis, Kokoszka und Leipus (2000), Rescaled variance
## and related tests for long memory in volatility and levels
##
vs.test <- function(x, q, alpha)  # alpha is accepted but not used; compare the statistic with the critical values above
{
  xbar <- mean(x)
  N <- length(x)
  v <- sum((cumsum(x-xbar))^2) - (sum(cumsum(x-xbar)))^2/N  # sum of squared centered partial sums
  kovarianzen <- NULL  # autocovariance sums for lags 1..q (scaled by N below)
  for (i in 1:q)
  {
    kovarianzen <- c(kovarianzen,
                     sum((x[1:(N-i)]-xbar)*(x[(1+i):N]-xbar)))
  }
  # long-run variance estimate with Bartlett weights up to lag q
  if (q > 0)
    s <- sum((x-xbar)^2)/N + sum((1-(1:q)/(q+1))*kovarianzen)*2/N
  else
    s <- sum((x-xbar)^2)/N
  vs <- v/(s*N^2)  # the V/S statistic
  method <- 'V/S Test for Long Memory'
  names(vs) <- 'V/S Statistic'
  names(q) <- 'Bandwidth q'
  structure(list(statistic = vs, parameter = q, method = method,
                 data.name=deparse(substitute(x))), class='htest')
}

Up to here, and run it in the session.

Application to the GDP series:

pbi.vs.test <- vs.test(log(pbi.ts),4, alpha=0.05)
pbi.vs.test
## 
##  V/S Test for Long Memory
## 
## data:  log(pbi.ts)
## V/S Statistic = 0.26884, Bandwidth q = 4

Since the V/S statistic is greater than 0.1869 (the 5% critical value; see the comments in the function), the null hypothesis is rejected. The null hypothesis is that there is no long memory. Therefore, we conclude that the series has long memory.
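
The comparison can also be done in code, using the critical values quoted in the function header:

vs.crit <- c("1pct" = 0.2685, "5pct" = 0.1869, "10pct" = 0.1518)
pbi.vs.test$statistic > vs.crit  # TRUE wherever the null of no long memory is rejected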

Given the presence of long memory, we use an ARFIMA model:

pbi.arfima <- arfima(log(pbi[,2])) # note that the arfima() function is applied
                                   # to a numeric vector, not to a ts object
## Note: only one starting point.  Only one mode can be found -- this is now the default behavior.
## Beginning the fits with 1 starting values.

The value of d is:

summary(pbi.arfima)$coef[[1]][1]
## [1] 0.499287

That is, the series has long memory; but since d < 0.5, it eventually reverts to its long-run trend after a shock, although very slowly.
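
As a cross-check (again, not part of the original script), the fracdiff package loaded at the top also provides the GPH log-periodogram estimator of d; its value will generally differ somewhat from the maximum-likelihood estimate reported by arfima():

fracdiff::fdGPH(log(pbi[,2]))$d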

Forecasts

pred5.pbi.arfima <- predict(pbi.arfima, n.ahead=5) # n.ahead: number of periods to forecast
pred5.pbi.arfima
## $`Mode 1`
## $`Mode 1`$`Forecasts and SDs`
##                    1        2        3        4        5
## Forecasts   8.964813 8.951220 8.935257 8.920080 8.906180
## Exact SD    0.287325 0.321541 0.339462 0.351497 0.360534
## Limiting SD 0.286878 0.320648 0.338148 0.349780 0.358427
plot(pred5.pbi.arfima)

Comparison between ARIMA and ARFIMA forecasts

auto.arima(log(pbi.ts),ic="bic")
## Series: log(pbi.ts) 
## ARIMA(0,1,0) 
## 
## sigma^2 = 0.07147:  log likelihood = -7.88
## AIC=17.76   AICc=17.81   BIC=20.13
auto.arima(log(pbi.ts),ic="aic")
## Series: log(pbi.ts) 
## ARIMA(0,1,0) 
## 
## sigma^2 = 0.07147:  log likelihood = -7.88
## AIC=17.76   AICc=17.81   BIC=20.13
arima.pbi <- arima(log(pbi.ts), 
                   order = c(0,1,0),
                   include.mean = FALSE)
aTSA::forecast(arima.pbi, lead=5)
## Forecast for univariate time series: 
##    Lead Forecast   S.E Lower Upper
## 81    1     8.94 0.267  8.42  9.47
## 82    2     8.94 0.378  8.20  9.68
## 83    3     8.94 0.463  8.04  9.85
## 84    4     8.94 0.535  7.90  9.99
## 85    5     8.94 0.598  7.77 10.12
## ------ 
## Note: confidence level = 95 %

arfima.pred <- predict(pbi.arfima, n.ahead=5)[1][[1]]$Forecast
arima.pred <- c(aTSA::forecast(arima.pbi, lead=5)[,2])
## Forecast for univariate time series: 
##    Lead Forecast   S.E Lower Upper
## 81    1     8.94 0.267  8.42  9.47
## 82    2     8.94 0.378  8.20  9.68
## 83    3     8.94 0.463  8.04  9.85
## 84    4     8.94 0.535  7.90  9.99
## 85    5     8.94 0.598  7.77 10.12
## ------ 
## Note: confidence level = 95 %

pbi.arima.pred5 <- ts(c(log(pbi[,2]),
                       as.numeric(arima.pred)),
                       start=1935,freq=1)
pbi.arfima.pred5 <- ts(c(log(pbi[,2]),
                      as.numeric(arfima.pred)),
                      start=1935,freq=1)

plot(pbi.arima.pred5,
     xlim=c(2010,2020),cex.main=0.8,
     main="Predicciones \n PBI por habitante",
     ylab="(en logs)",xlab="")
lines(pbi.arfima.pred5,col="red")
legend(2016,8.5,cex=0.7,bty="n",
       c("ARIMA","ARFIMA"),lty=1,
       col=c("black","red"))