Este es un breve cuaderno de introducción a los modelos VARMAX en Statsmodels. El modelo VARMAX se especifica genéricamente como: \[y_t = \nu + A_1 y_{t-1} + \dots + A_p y_{tp} + B x_t + \epsilon_t + M_1 \epsilon_{t-1} + \dots M_q \epsilon_{tq}\]
donde \(y_t\) es un vector de \(k_{endog} \times 1\).
library(reticulate)
#py_install("pandas")
#py_install("statsmodels")
#py_install("matplotlib")
import numpy as np
import pandas as pd
import statsmodels.api as sm
import matplotlib.pyplot as plt
dta = sm.datasets.webuse('lutkepohl2', 'http://www.stata-press.com/data/r12/')
dta.index = dta.qtr
endog = dta.loc['1960-04-01':'1978-10-01', ['dln_inv', 'dln_inc', 'dln_consump']]
dta.head(10)
## inv inc consump ... dln_inc ln_consump dln_consump
## qtr ...
## 1960-01-01 180 451 415 ... NaN 6.028278 NaN
## 1960-04-01 179 465 421 ... 0.030570 6.042633 0.014355
## 1960-07-01 185 485 434 ... 0.042111 6.073044 0.030411
## 1960-10-01 192 493 448 ... 0.016360 6.104793 0.031749
## 1961-01-01 211 509 459 ... 0.031939 6.129050 0.024257
## 1961-04-01 202 520 458 ... 0.021381 6.126869 -0.002181
## 1961-07-01 207 521 479 ... 0.001921 6.171700 0.044831
## 1961-10-01 214 540 487 ... 0.035819 6.188264 0.016563
## 1962-01-01 231 548 497 ... 0.014706 6.208590 0.020326
## 1962-04-01 229 558 510 ... 0.018084 6.234411 0.025821
##
## [10 rows x 10 columns]
La clase VARMAX en Statsmodels permite la estimación de modelos VAR, VMA y VARMA (a través del argumento de orden), opcionalmente con un término constante (a través del argumento de tendencia). También se pueden incluir regresores exógenos (como es habitual en Statsmodels, por el argumento exog), y de esta manera se puede agregar una tendencia temporal. Finalmente, la clase permite el error de medición (a través del argumento error_medición) y permite especificar una matriz de covarianza de innovación diagonal o no estructurada (mediante el argumento tipo_cov_error).
A continuación se muestra un modelo VARX (2) simple en dos variables endógenas y una serie exógena, pero sin término constante. Observe que necesitábamos permitir más iteraciones que las predeterminadas (que es maxiter = 50) para que la estimación de probabilidad converja. Esto no es inusual en los modelos VAR que tienen que estimar un gran número de parámetros, a menudo en un número relativamente pequeño de series de tiempo: este modelo, por ejemplo, estima 27 parámetros de 75 observaciones de 3 variables.
exog = endog['dln_consump']
mod = sm.tsa.VARMAX(endog[['dln_inv', 'dln_inc']], order=(2,0))
## C:\Users\MINEDU~1\AppData\Local\R-MINI~1\envs\R-RETI~1\lib\site-packages\statsmodels\tsa\base\tsa_model.py:527: ValueWarning: No frequency information was provided, so inferred frequency QS-OCT will be used.
## % freq, ValueWarning)
res = mod.fit(maxiter=1000, disp=False)
print(res.summary())
## Statespace Model Results
## ==================================================================================
## Dep. Variable: ['dln_inv', 'dln_inc'] No. Observations: 75
## Model: VAR(2) Log Likelihood 355.664
## + intercept AIC -685.327
## Date: Tue, 08 Jun 2021 BIC -655.200
## Time: 16:04:38 HQIC -673.298
## Sample: 04-01-1960
## - 10-01-1978
## Covariance Type: opg
## ===================================================================================
## Ljung-Box (L1) (Q): 0.01, 0.03 Jarque-Bera (JB): 15.90, 10.41
## Prob(Q): 0.91, 0.86 Prob(JB): 0.00, 0.01
## Heteroskedasticity (H): 0.44, 0.92 Skew: 0.07, -0.43
## Prob(H) (two-sided): 0.04, 0.83 Kurtosis: 5.25, 4.61
## Results for equation dln_inv
## ==============================================================================
## coef std err z P>|z| [0.025 0.975]
## ------------------------------------------------------------------------------
## intercept 0.0010 0.014 0.073 0.942 -0.027 0.029
## L1.dln_inv -0.2538 0.102 -2.486 0.013 -0.454 -0.054
## L1.dln_inc 0.6004 0.530 1.133 0.257 -0.438 1.639
## L2.dln_inv -0.1298 0.164 -0.792 0.428 -0.451 0.191
## L2.dln_inc 0.5684 0.454 1.251 0.211 -0.322 1.459
## Results for equation dln_inc
## ==============================================================================
## coef std err z P>|z| [0.025 0.975]
## ------------------------------------------------------------------------------
## intercept 0.0183 0.005 3.819 0.000 0.009 0.028
## L1.dln_inv 0.0528 0.044 1.199 0.230 -0.033 0.139
## L1.dln_inc -0.0240 0.154 -0.156 0.876 -0.327 0.279
## L2.dln_inv 0.0444 0.045 0.989 0.323 -0.044 0.133
## L2.dln_inc 0.0576 0.149 0.387 0.699 -0.234 0.349
## Error covariance matrix
## ============================================================================================
## coef std err z P>|z| [0.025 0.975]
## --------------------------------------------------------------------------------------------
## sqrt.var.dln_inv 0.0444 0.003 14.139 0.000 0.038 0.051
## sqrt.cov.dln_inv.dln_inc 0.0017 0.002 0.786 0.432 -0.003 0.006
## sqrt.var.dln_inc 0.0115 0.001 11.762 0.000 0.010 0.013
## ============================================================================================
##
## Warnings:
## [1] Covariance matrix calculated using the outer product of gradients (complex-step).
mod = sm.tsa.VARMAX(endog[['dln_inv', 'dln_inc']], order=(0,2), error_cov_type='diagonal')
## C:\Users\MINEDU~1\AppData\Local\R-MINI~1\envs\R-RETI~1\lib\site-packages\statsmodels\tsa\base\tsa_model.py:527: ValueWarning: No frequency information was provided, so inferred frequency QS-OCT will be used.
## % freq, ValueWarning)
res = mod.fit(maxiter=1000, disp=False)
print(res.summary())
## Statespace Model Results
## ==================================================================================
## Dep. Variable: ['dln_inv', 'dln_inc'] No. Observations: 75
## Model: VMA(2) Log Likelihood 353.887
## + intercept AIC -683.775
## Date: Tue, 08 Jun 2021 BIC -655.965
## Time: 16:04:44 HQIC -672.671
## Sample: 04-01-1960
## - 10-01-1978
## Covariance Type: opg
## ===================================================================================
## Ljung-Box (L1) (Q): 0.01, 0.06 Jarque-Bera (JB): 12.55, 13.57
## Prob(Q): 0.93, 0.81 Prob(JB): 0.00, 0.00
## Heteroskedasticity (H): 0.44, 0.81 Skew: 0.05, -0.48
## Prob(H) (two-sided): 0.04, 0.60 Kurtosis: 5.00, 4.85
## Results for equation dln_inv
## =================================================================================
## coef std err z P>|z| [0.025 0.975]
## ---------------------------------------------------------------------------------
## intercept 0.0182 0.005 3.807 0.000 0.009 0.028
## L1.e(dln_inv) -0.2614 0.106 -2.475 0.013 -0.468 -0.054
## L1.e(dln_inc) 0.5235 0.631 0.830 0.406 -0.713 1.760
## L2.e(dln_inv) 0.0321 0.149 0.216 0.829 -0.259 0.323
## L2.e(dln_inc) 0.1881 0.478 0.394 0.694 -0.748 1.125
## Results for equation dln_inc
## =================================================================================
## coef std err z P>|z| [0.025 0.975]
## ---------------------------------------------------------------------------------
## intercept 0.0207 0.002 13.065 0.000 0.018 0.024
## L1.e(dln_inv) 0.0477 0.042 1.145 0.252 -0.034 0.129
## L1.e(dln_inc) -0.0757 0.140 -0.542 0.588 -0.349 0.198
## L2.e(dln_inv) 0.0174 0.042 0.409 0.682 -0.066 0.101
## L2.e(dln_inc) 0.1255 0.153 0.823 0.411 -0.174 0.425
## Error covariance matrix
## ==================================================================================
## coef std err z P>|z| [0.025 0.975]
## ----------------------------------------------------------------------------------
## sigma2.dln_inv 0.0020 0.000 7.354 0.000 0.001 0.003
## sigma2.dln_inc 0.0001 2.33e-05 5.836 0.000 9.02e-05 0.000
## ==================================================================================
##
## Warnings:
## [1] Covariance matrix calculated using the outer product of gradients (complex-step).
Precaución: especificaciones VARMA (p, q)
Aunque el modelo permite estimar especificaciones VARMA (p, q), estos modelos no se identifican sin restricciones adicionales en las matrices de representación, que no están incorporadas. Por esta razón, se recomienda que el usuario proceda con error (y de hecho se emite una advertencia cuando se especifican estos modelos). No obstante, en algunas circunstancias pueden proporcionar información útil.
mod = sm.tsa.VARMAX(endog[['dln_inv', 'dln_inc']], order=(1,1))
## C:\Users\MINEDU~1\AppData\Local\R-MINI~1\envs\R-RETI~1\lib\site-packages\statsmodels\tsa\statespace\varmax.py:163: EstimationWarning: Estimation of VARMA(p,q) models is not generically robust, due especially to identification issues.
## EstimationWarning)
## C:\Users\MINEDU~1\AppData\Local\R-MINI~1\envs\R-RETI~1\lib\site-packages\statsmodels\tsa\base\tsa_model.py:527: ValueWarning: No frequency information was provided, so inferred frequency QS-OCT will be used.
## % freq, ValueWarning)
res = mod.fit(maxiter=1000, disp=False)
print(res.summary())
## Statespace Model Results
## ==================================================================================
## Dep. Variable: ['dln_inv', 'dln_inc'] No. Observations: 75
## Model: VARMA(1,1) Log Likelihood 354.293
## + intercept AIC -682.586
## Date: Tue, 08 Jun 2021 BIC -652.458
## Time: 16:04:49 HQIC -670.556
## Sample: 04-01-1960
## - 10-01-1978
## Covariance Type: opg
## ===================================================================================
## Ljung-Box (L1) (Q): 0.00, 0.06 Jarque-Bera (JB): 11.39, 14.04
## Prob(Q): 0.99, 0.80 Prob(JB): 0.00, 0.00
## Heteroskedasticity (H): 0.43, 0.91 Skew: 0.02, -0.45
## Prob(H) (two-sided): 0.04, 0.80 Kurtosis: 4.91, 4.92
## Results for equation dln_inv
## =================================================================================
## coef std err z P>|z| [0.025 0.975]
## ---------------------------------------------------------------------------------
## intercept 0.0101 0.063 0.160 0.873 -0.114 0.134
## L1.dln_inv -0.0057 0.685 -0.008 0.993 -1.348 1.336
## L1.dln_inc 0.3989 2.683 0.149 0.882 -4.860 5.658
## L1.e(dln_inv) -0.2432 0.694 -0.350 0.726 -1.604 1.118
## L1.e(dln_inc) 0.1183 2.939 0.040 0.968 -5.643 5.879
## Results for equation dln_inc
## =================================================================================
## coef std err z P>|z| [0.025 0.975]
## ---------------------------------------------------------------------------------
## intercept 0.0164 0.026 0.626 0.531 -0.035 0.068
## L1.dln_inv -0.0309 0.273 -0.113 0.910 -0.567 0.505
## L1.dln_inc 0.2376 1.064 0.223 0.823 -1.847 2.323
## L1.e(dln_inv) 0.0872 0.280 0.312 0.755 -0.462 0.636
## L1.e(dln_inc) -0.2380 1.100 -0.216 0.829 -2.395 1.919
## Error covariance matrix
## ============================================================================================
## coef std err z P>|z| [0.025 0.975]
## --------------------------------------------------------------------------------------------
## sqrt.var.dln_inv 0.0449 0.003 14.511 0.000 0.039 0.051
## sqrt.cov.dln_inv.dln_inc 0.0017 0.003 0.660 0.509 -0.003 0.007
## sqrt.var.dln_inc 0.0116 0.001 11.799 0.000 0.010 0.013
## ============================================================================================
##
## Warnings:
## [1] Covariance matrix calculated using the outer product of gradients (complex-step).