webuse wpi1, clear
tsline wpi
* Inflation rate
gen pi = d.ln_wpi
tsline pi
* The mean model
* ARMA(?,?)
{
* Box-Jenkins stage I
corrgram pi, lags(10)
ac pi
* Correlograms suggest ARMA(1,1), AR(2), AR(3), or AR(4) as candidates
arima pi , ar(1) ma(1)
estat ic
predict res11 , r
* AIC -756, BIC -745
arima pi , ar(1/4)
estat ic
* AIC -756, BIC -739
predict res4, r
arima pi , ar(1/3)
estat ic
* AIC -753, BIC -739
arima pi , ar(1/2)
estat ic
* AIC -753, BIC -742
predict res2, r
}
* Stage II
* IC suggest ARMA(1,1) and AR(4)
* Stage III
* WN tests
foreach name in res4 res11 {
* 1 Visualization
tsline `name', name(line`name', replace)
* 2 Correlograms and Q-test (Portmanteau)
corrgram `name', lags(10)
ac `name', name(ac`name', replace)
wntestq `name'
* 3 Normality
sfrancia `name'
hist `name', normal name(hist`name', replace)
}
* The ARMA(1,1) and AR(4) residuals do not pass the normality test
* Yet, both have well-behaved correlograms
* The high volatility during the 70s may cause these residuals not to be normal (they are fat-tailed, see the histogram)
* The ARMA(1,1) has better IC so it is the best one.
* GARCH(?,?) modeling
* Box-Jenkins Stage I
gen z = res11^2 // squared residuals
corrgram z, lags(10)
ac z
* The first AC = 0
* high ACF at lags 2, 4, 6
* high PACF at lags 2, 4
* The model could be an ARCH(4) (several others were implemented)
* B-J Stage II
* ARCH model
qui: arima z, ar(1 2 3 4) // all lags
estat ic
predict wn_z, r
predict sigma2_t, xb // predicted variance of the error
* Stage III - is the error of the ARCH equation WN?
* 1 Visualization
tsline wn_z, name(line_wnz, replace)
* This has some issues during the 70s
* 2 Correlograms and Q-test (Portmanteau)
corrgram wn_z, lags(10)
wntestq wn_z
* Portmanteau tests all do not reject the null
ac wn_z, name(ac_wnz, replace)
* Individual autocorrelations also do not reject the null
* Errors are white noise
* 3 Normality
sfrancia wn_z
hist wn_z, normal name(histwnz, replace)
* Revisiting the normality of the mean error (ARMA(1,1)) by standardizing its WN error
gen Standarized = sqrt(z/sigma2_t)
sfrancia Standarized
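* Extra check referenced in the discussion below; the lag length is an assumption matching earlier calls
corrgram Standarized, lags(10)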
Modeling stationary time series
Universidad Privada Boliviana / Universidad del Pacífico (Lima, Perú)
Modeling \(\sigma_t\) (volatility)
Introduction
Biographical note
“I took a sabbatical at LSE in 1979. Lindsey was then 2 and we rented a lovely row house in Hampstead with a back garden and a study in the back of the first floor. It was there that a new idea came.
I was interested in Milton Friedman’s conjecture that inflation uncertainty was a central cause of business cycles. Investors who did not know what prices and wages would be in the future might invest less. To test this, a time series model was needed with variances that could change over time. There were two tools that came together to solve this problem. I had done a lot of work with the Kalman Filter and recognized that a one step predictive density would be sufficient to define a likelihood function. The second tool was a test. Clive had recently proposed a test for bilinear time series models. He came by my computer one day before I left and suggested I square the residuals and then fit an autoregression. To my amazement, it was quite significant. I suspected that this test was the optimal Lagrange Multiplier test for some new type of model, but not the bilinear model. I was later to discover that it is indeed the optimal test for ARCH and it is so called today.
Lunch and tea at the LSE were very stimulating times for me. Each day I would get a little further on this new model and would talk with Sargan or Durbin or Hendry or Harvey about its properties and my proofs. David Hendry eventually named it AutoRegressive Conditional Heteroskedasticity and offered to have Frank Srba program it. We applied it to UK inflation data and the ARCH model was launched…”
Finance
Modeling uncertainty by endogenizing \(\sigma_t\) quickly found applications in investment and portfolio management technologies.
Now it was possible to introduce time-varying uncertainty into asset pricing models (Black-Scholes pricing, among others), enhance portfolio optimization (Markowitz models), and improve market risk estimation (parametric Value at Risk, etc.)
Although, to be fair, stochastic volatility models were available much earlier; they were not as parsimonious in terms of estimation and interpretation.
In economics
Countless macro models relied on homoskedasticity assumptions
Specifically, ARMA models impose constant variance on the WN
Modeling \(\sigma_t\), a.k.a. heteroskedasticity. How?
“…He [Clive Granger] came by my computer one day before I left and suggested I square the residuals and then fit an autoregression. To my amazement, it was quite significant…” R. Engle III.
Remember the linear regression is a model for the conditional expectation: \(E(Y_t| . ) = a + \sum_{j=1}^p \delta_j Y_{t-j}\)
For a zero-mean error term, denoted \(u_t\), whose volatility is suspected to be time varying, this means:
\(V(u_t| . ) = E(u_t^2| . ) = \sigma_t^2\)
By definition, this variance is allowed to be time varying, however, it must be covariance stationary.
The ARCH model
An autoregression AR(p) on the latter \(u_t\) means:
- \(E(u_t^2| \{u_{t-j}\}_{j=1,\ldots,p} ) = \sigma_t^2 = \alpha_0 + \sum_{j=1}^{p} \alpha_j u_{t-j}^2\)
So the estimating equation is obtained by dropping the expectation operator and adding a residual on the variance expectation:
- \(u_t^2 = \sigma_t^2 + \epsilon_t\)
- \(u_t^2 \, = \alpha_0 + \sum_{j=1}^{p} \alpha_j u_{t-j}^2 + \epsilon_t\)
- where \(\epsilon_t\) is a WN process.
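Engle's check from the quote above boils down to a single regression. A minimal Stata sketch (the residual variable res and the lag order 4 are illustrative assumptions, not part of these notes):
* res: residuals from an already-estimated mean equation (illustrative name)
gen u2 = res^2 // squared residuals proxy the conditional variance
reg u2 l(1/4).u2 // the AR(p) from the estimating equation above
* Jointly significant lags point to ARCH-type heteroskedasticity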
“…David Hendry eventually named it AutoRegressive Conditional Heteroskedasticity and offered to have Frank Srba program it…”
The GARCH model
So that was the ARCH model… an AR(p) on the squared residuals.
“…Tim [Bollerslev] took the ARCH model, added a moving average and created GARCH. The GARCH model is an infinite order ARCH model with a geometrically declining set of weights…”:
- \(u_t^2 \, = \alpha_0 + \sum_{j=1}^{p} \alpha_j u_{t-j}^2 + \sum_{j=1}^{q} \beta_j \epsilon_{t-j} + \epsilon_t\)
Generalized ARCH model
This is an ARMA on the squared residuals, where \(\epsilon_t\) must be WN as well. Hence, we could apply the full Box-Jenkins technology to specify a volatility model.
Covariance stationarity of \(u_t^2\) must hold: \(\sum_j \alpha_j + \sum_j \beta_j < 1\) (see the derivation below)
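Why the ARMA form holds (a short derivation in the same notation, filling in the step the quote alludes to): the GARCH variance equation is \(\sigma_t^2 = \alpha_0 + \sum_{j=1}^{p} \alpha_j u_{t-j}^2 + \sum_{j=1}^{q} \beta_j \sigma_{t-j}^2\). Define the innovation \(\epsilon_t = u_t^2 - \sigma_t^2\) and substitute \(\sigma_{t-j}^2 = u_{t-j}^2 - \epsilon_{t-j}\):
- \(u_t^2 = \alpha_0 + \sum_{j} (\alpha_j + \beta_j) u_{t-j}^2 - \sum_{j=1}^{q} \beta_j \epsilon_{t-j} + \epsilon_t\)
Matching this to the ARMA form above, the AR coefficients of the squared residuals are \(\alpha_j + \beta_j\), which is why the stationarity condition involves the sum of both sets of parameters.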
Implementation
- A well specified model for the mean (conditional expectation) must exist, i.e., its error term must be WN
- A manual (less efficient) two-step estimation may generate the residuals from the \(E(Y_t|.)\) model. This is the first step.
- In a second step, one estimates an ARMA(p,q) on the squared residuals (Box-Jenkins, etc.); see the sketch after this list.
- Most software employs a one-step full-information maximum likelihood estimator. It is more efficient and does not neglect the randomness of the first step. We need not stick to normality, since alternative ML distributions exist.
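A compact schematic of the two-step route in Stata (a sketch that mirrors the annotated code in this section; the ARMA orders are the ones selected there, and pi is the inflation series defined in that code):
* Step 1: mean equation, keep the residuals
arima pi, ar(1) ma(1)
predict u, r
* Step 2: Box-Jenkins on the squared residuals (the variance equation)
gen u2 = u^2
arima u2, ar(1/4)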
Two-step inefficient implementation
The following Stata code specifies a model for U.S. inflation. The Box-Jenkins approach suggests an ARMA(1,1).
A battery of tests on the ARMA(1,1) residuals shows that they are WN but not normal.
The correlogram of the squared residuals unveils the autocorrelations that characterize an ARMA process, i.e., a GARCH process. However, the order of the process is hard to identify.
A parsimonious ARCH(4) is retained; its residual (wn_z) is white noise, although it is not normally distributed.
Revisiting the correlogram and the normality test on \(\sqrt{\varepsilon_t^2/\sigma_t^2}\), which should have variance 1, the normality test still rejects the null: the standardization does not resolve the non-normality issue.
This may be the consequence of a structural shock to the WN distribution during the 1970s, one that goes beyond the variance.
Stata code
One-step FIML implementation
This is computationally intensive but guarantees proper statistical inference, as it estimates both equations jointly as a single multivariate process.
Since variances need to be positive, the estimation algorithms introduce natural constraints on the ARCH and GARCH parameters.
Normality should not be a concern, as alternative distributions exist.
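As an illustration of the one-step syntax, a GARCH(1,1) counterpart of the model estimated below fits in a single command (a sketch only; this specification is an assumption, while the section that follows estimates a pure ARCH(4)):
* One-step ML: AR(4)-type mean equation with a GARCH(1,1) variance equation (illustrative)
arch pi l.pi l2.pi l3.pi l4.pi, arch(1) garch(1)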
Implementation in Stata
Stata's arch command, in the regression-style syntax we use here, does not take MA(q) terms in the mean equation, so to replicate our exercise with U.S. inflation we use the second-best AR(4) process, which had non-normal WN errors.
We also implement an ARCH-LM test on the OLS regression.
webuse wpi1, clear
gen pi = d.ln_wpi
* Mean equation:
reg pi l.pi l2.pi l3.pi l4.pi
* LM test for arch effects (from the OLS estimator):
estat archlm, lags(5)
* arch estimation by ML
arch pi l.pi l2.pi l3.pi l4.pi, arch(1 2 3 4)
(1 missing value generated)
Source | SS df MS Number of obs = 119
-------------+---------------------------------- F(4, 114) = 22.97
Model | .011057246 4 .002764311 Prob > F = 0.0000
Residual | .013716608 114 .000120321 R-squared = 0.4463
-------------+---------------------------------- Adj R-squared = 0.4269
Total | .024773854 118 .000209948 Root MSE = .01097
------------------------------------------------------------------------------
pi | Coefficient Std. err. t P>|t| [95% conf. interval]
-------------+----------------------------------------------------------------
pi |
L1. | .4003877 .0927466 4.32 0.000 .2166574 .5841181
L2. | .1594063 .1000446 1.59 0.114 -.0387812 .3575938
L3. | .029242 .1010879 0.29 0.773 -.1710124 .2294963
L4. | .2081134 .0935064 2.23 0.028 .022878 .3933487
|
_cons | .0025228 .0013622 1.85 0.067 -.0001756 .0052212
------------------------------------------------------------------------------
LM test for autoregressive conditional heteroskedasticity (ARCH)
---------------------------------------------------------------------------
lags(p) | chi2 df Prob > chi2
-------------+-------------------------------------------------------------
5 | 31.860 5 0.0000
---------------------------------------------------------------------------
H0: no ARCH effects vs. H1: ARCH(p) disturbance
(setting optimization to BHHH)
Iteration 0: log likelihood = 384.91432
Iteration 1: log likelihood = 387.67617
Iteration 2: log likelihood = 389.94168
Iteration 3: log likelihood = 390.56305
Iteration 4: log likelihood = 390.62293
(switching optimization to BFGS)
Iteration 5: log likelihood = 390.97346
Iteration 6: log likelihood = 390.99555
Iteration 7: log likelihood = 391.00377
Iteration 8: log likelihood = 391.0059
Iteration 9: log likelihood = 391.00686
Iteration 10: log likelihood = 391.00697
Iteration 11: log likelihood = 391.00698
ARCH family regression
Sample: 1961q2 thru 1990q4 Number of obs = 119
Wald chi2(4) = 58.66
Log likelihood = 391.007 Prob > chi2 = 0.0000
------------------------------------------------------------------------------
| OPG
pi | Coefficient std. err. z P>|z| [95% conf. interval]
-------------+----------------------------------------------------------------
pi |
pi |
L1. | .4089091 .0999144 4.09 0.000 .2130806 .6047377
L2. | .1449355 .1099207 1.32 0.187 -.0705051 .3603762
L3. | .0734836 .1021173 0.72 0.472 -.1266627 .2736298
L4. | .0961683 .0982937 0.98 0.328 -.0964837 .2888203
|
_cons | .0016799 .0011531 1.46 0.145 -.0005801 .0039399
-------------+----------------------------------------------------------------
ARCH |
arch |
L1. | .168183 .1660098 1.01 0.311 -.1571901 .4935562
L2. | .2066973 .1582062 1.31 0.191 -.1033811 .5167757
L3. | .0369231 .1169908 0.32 0.752 -.1923747 .2662209
L4. | .4986439 .1531968 3.25 0.001 .1983837 .798904
|
_cons | .0000238 .0000132 1.81 0.070 -1.98e-06 .0000496
------------------------------------------------------------------------------
From the estimated coefficients, both AR processes (the mean and the variance equations) are stationary: each set of coefficients sums to less than 1.
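The variance-equation sum can be checked directly after estimation (a sketch; it assumes the [ARCH] equation and coefficient labels shown in the output above):
* Sum of the ARCH coefficients: should be below 1 for covariance stationarity
lincom [ARCH]L1.arch + [ARCH]L2.arch + [ARCH]L3.arch + [ARCH]L4.arch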
Let’s show the predicted residuals along with a 90% CI calculated from \(\pm 1.65\sigma_t\):
predict ht , variance
predict residual , r
gen low = - 1.65*sqrt(ht)
gen high = + 1.65*sqrt(ht)
tsline residual low high
(5 missing values generated)
- Volatility is a measure of inflation uncertainty, which was particularly high during the mid-70s in the US.
corrgram residual, lags(6)
gen errorstd = sqrt(residual^2/ht)
corrgram errorstd, lags(6)
sfrancia errorstd
-1 0 1 -1 0 1
LAG AC PAC Q Prob>Q [Autocorrelation] [Partial autocor]
-------------------------------------------------------------------------------
1 0.0029 0.0031 .00103 0.9745 | |
2 0.0238 0.0251 .07098 0.9651 | |
3 -0.0117 -0.0123 .08812 0.9932 | |
4 0.1654 0.1742 3.5152 0.4756 |- |-
5 -0.0192 -0.0222 3.5618 0.6141 | |
6 0.1304 0.1350 5.7291 0.4542 |- |-
(5 missing values generated)
-1 0 1 -1 0 1
LAG AC PAC Q Prob>Q [Autocorrelation] [Partial autocor]
-------------------------------------------------------------------------------
1 -0.0075 -0.0079 .00695 0.9336 | |
2 -0.0168 -0.0174 .04183 0.9793 | |
3 0.0095 0.0097 .05311 0.9968 | |
4 -0.0356 -0.0373 .21208 0.9948 | |
5 0.1067 0.1135 1.6493 0.8952 | |
6 -0.1293 -0.1417 3.7803 0.7064 -| -|
Shapiro–Francia W' test for normal data
Variable | Obs W' V' z Prob>z
-------------+-----------------------------------------------------
errorstd | 119 0.90002 10.507 4.702 0.00001
Exercise
Pick Bolivia’s inflation rate, using the longest quarterly time series available, and identify whether its volatility may be modelled by an ARCH or GARCH process. A hypothetical starter sketch follows the list below.
- Specify the best AR(p) process for the mean using the B-J approach (summarize your diagnostics)
- Analyze the correlogram of the squared residuals and run the ARCH-LM test
- Specify your ARCH or GARCH model, and perform diagnostics on your standardized residuals
- What are the periods of high inflation uncertainty, other than the mid-80s?
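A possible starting point (everything here is hypothetical: the file name, start date, and variable names are placeholders for whatever source you actually use):
* Hypothetical starter: adapt the import to your data source
import delimited "bolivia_inflation_q.csv", clear // placeholder file name
gen t = tq(1990q1) + _n - 1 // placeholder start date
format t %tq
tsset t
corrgram inflation, lags(10) // 'inflation' is a placeholder variable name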