Modeling non-stationary time series

Universidad Privada Boliviana / Universidad del Pacífico (Lima, Perú)

Prof. J. Dávalos (Ph.D.)

The fundamentals of non-stationary series

Most time series that look non-stationary exhibit trending behavior, e.g. economic aggregates such as GDP, infrastructure, exports, government expenditure, household consumption, etc.
GDP (US)

Others do not exhibit a clear trend but still do not look stable across different time windows; think of interest rates or certain stock prices.

FED funds rate

A very fundamental non-stationary process is an AR(1) whose coefficient is 1. This is a unit-root process.

\[ Y_t = a + \delta Y_{t-1} + \varepsilon_t \]
- Remember that we needed $|\delta| < 1$ and a long history (large $t$) so that:

$Y_t = a \sum_{j=0}^{t-1}\delta^j + \delta^t Y_0 + \sum_{j=0}^{t-1}\delta^j\varepsilon_{t-j}$
- becomes, since $\delta^t \to 0$ and $\sum_{j=0}^{t-1}\delta^j \to 1/(1-\delta)$:
$Y_t \approx a /(1-\delta) + \sum_{j=0}^{t-1}\delta^j\varepsilon_{t-j}$
- With $\delta = 1$ we do not need a large $t$:
$Y_t = a t + Y_0 + \sum_{j=0}^{t-1}\varepsilon_{t-j}$

When $\delta = 1$ this is a random walk with drift (constant term), $Y_t = a + Y_{t-1} + \varepsilon_t$:
$Y_t = a t + Y_0 + \sum_{j=0}^{t-1}\varepsilon_{t-j}$
- $at$ is called the deterministic trend
- $\sum_{j=0}^{t-1}\varepsilon_{t-j}$ is the stochastic trend
  - Given that $\delta=1$, past shocks never die out: each shock has a permanent effect on every future realization of $Y_t$
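
To see the two components at work, the decomposition above can be simulated. The following is a minimal Stata sketch, not part of the original example: the drift of 0.5, the sample size, the seed and all variable names are illustrative assumptions, and $Y_0$ is set to 0.

```stata
* Simulated random walk with drift (illustrative values, Y0 = 0)
clear
set seed 12345
set obs 200
gen t = _n
tsset t

* WN shocks, N(0,1)
gen eps = rnormal()
* Stochastic trend: running sum of all shocks up to t
gen stoch = sum(eps)
* Deterministic trend a*t with an assumed drift a = 0.5
gen determ = 0.5*t
* Random walk with drift: Y_t = a*t + Y_0 + sum of shocks
gen y = determ + stoch

* Plot the series together with its two components
tsline y determ stoch
```

The gap between `y` and `determ` at any date is exactly `stoch`, the accumulated shocks, which does not revert to zero.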

Deterministic and stochastic trend, random walk with drift

\[Y_t = a t + Y_0 + \sum_{j=0}^{t-1}\varepsilon_{t-j}\]

- $at$ is called the deterministic trend
- $\sum_{j=0}^{t-1}\varepsilon_{t-j}$ is the stochastic trend
- First-difference the series. This leads to $\Delta Y_t = a + \varepsilon_t$, a feasible statistical specification whose residual is WN. It can be estimated by OLS.
- This is a difference-stationary (DS) process
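
The first-difference result follows directly from the solved form of the random walk with drift. As a worked step in the same notation, write the expression at $t$ and at $t-1$ and subtract:

\[
\begin{aligned}
Y_t &= a t + Y_0 + \sum_{j=0}^{t-1}\varepsilon_{t-j} \\
Y_{t-1} &= a (t-1) + Y_0 + \sum_{j=0}^{t-2}\varepsilon_{t-1-j} \\
\Delta Y_t &= Y_t - Y_{t-1} = a + \varepsilon_t
\end{aligned}
\]

All shocks dated $t-1$ and earlier cancel, so only the drift and the current shock remain.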

Deterministic trend and no stochastic trend

Consider the model with deterministic trend and no stochastic trend:
- $Y_t = Y_0 + at + \varepsilon_t$
- A first difference creates a residual $\Delta\varepsilon_t$ that is not WN: $\Delta Y_t = a + u_t$, where the new residual is $u_t = \Delta\varepsilon_t=\varepsilon_{t}-\varepsilon_{t-1}$. Two consecutive residuals, $u_t$ and $u_{t-1}$, both contain $\varepsilon_{t-1}$, so they are autocorrelated by construction when the process is trend stationary (TS).
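
The size of this induced autocorrelation can be made explicit. Assuming $\varepsilon_t$ is WN with variance $\sigma^2$ (the symbol $\sigma^2$ is introduced here for illustration):

\[
\begin{aligned}
\operatorname{Var}(u_t) &= \operatorname{Var}(\varepsilon_t) + \operatorname{Var}(\varepsilon_{t-1}) = 2\sigma^2 \\
\operatorname{Cov}(u_t, u_{t-1}) &= \operatorname{Cov}(\varepsilon_t - \varepsilon_{t-1},\ \varepsilon_{t-1} - \varepsilon_{t-2}) = -\sigma^2 \\
\operatorname{Corr}(u_t, u_{t-1}) &= \frac{-\sigma^2}{2\sigma^2} = -\frac{1}{2}
\end{aligned}
\]

Over-differencing a TS series therefore leaves a first-order autocorrelation of $-0.5$ in the residual, while autocorrelations at lags 2 and beyond are zero.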

So you must detrend the series, i.e. estimate $Y_t$ as a function of a constant and a time trend: $Y_t = \beta_0 + \beta_1 t + u_t$, calculate the residual $Y_t - \hat\beta_0 - \hat\beta_1 t$ and specify a model for the residual, which should be stationary.

The trend can also be a polynomial in $t$.
This is a trend-stationary (TS) process.
Empirically, you are advised to perform both procedures and check which one delivers well-behaved WN residuals.
- WN residuals after detrending signal a trend-stationary process
- WN residuals after first differencing signal a unit-root process
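
The detrending step can be sketched in Stata as follows. This is a hedged illustration that borrows the `gdp2.dta` dataset and the variables `gdp_ln` and `tq` used elsewhere in these notes; the quadratic specification and the names `tq2`, `u_lin` and `u_quad` are assumptions made for the example.

```stata
use http://www.stata-press.com/data/r13/gdp2.dta, clear
tsset tq

* Linear detrending: regress the level on a constant and the time index
reg gdp_ln tq
predict u_lin, residuals

* Quadratic trend (assumed specification, for illustration only)
gen tq2 = tq^2
reg gdp_ln tq tq2
predict u_quad, residuals

* Correlograms of both detrended series
corrgram u_lin, lags(10)
corrgram u_quad, lags(10)
```

If the series were trend stationary, the detrended residuals should show little remaining autocorrelation; ACFs that stay close to 1 point instead towards a stochastic trend.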

Example:

Detrend the series and check the correlogram of the residual (ACF, PACF, Portmanteau test). If the series is trend stationary (TS), the correlogram must reflect a WN process. If there is a stochastic trend (difference-stationary (DS) process), then the correlogram will show evidence of autocorrelation.
First-difference (FD) the series and check its correlogram (ACF, PACF, Portmanteau test). If the series is trend stationary (TS), the residual will show evidence of autocorrelation. If it is difference stationary (DS), then the residuals will show no autocorrelation (WN).
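
The two checks can be tried on simulated data where the truth is known. The sketch below is an illustration under assumed parameter values (slope 0.1, 300 observations, arbitrary seed); the variable names are hypothetical.

```stata
clear
set seed 2024
set obs 300
gen t = _n
tsset t
gen e1 = rnormal()
gen e2 = rnormal()

* TS process: deterministic trend plus WN
gen y_ts = 1 + 0.1*t + e1
* DS process: random walk with drift (Y0 = 0)
gen y_ds = 0.1*t + sum(e2)

* Apply both procedures to both series
foreach v in y_ts y_ds {
    quietly reg `v' t
    predict dt_`v', residuals
    gen fd_`v' = D.`v'
}

* Detrended series: expect WN for y_ts, strong persistence for y_ds
corrgram dt_y_ts, lags(10)
corrgram dt_y_ds, lags(10)
* Differenced series: expect autocorrelation for y_ts, WN for y_ds
corrgram fd_y_ts, lags(10)
corrgram fd_y_ds, lags(10)
```

In each case only the procedure that matches the true process should deliver a correlogram compatible with WN.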

Nelson and Plosser (1982)

The ACFs show no statistical significance; however, we can identify some patterns.

From the level ACFs, real GNP, nominal GNP and industrial production seem to be very persistent (ACFs at lags 1 and 2 close to 1). We know they exhibit trend-like behavior.
- First differencing these 3 series reduces the ACFs notably. This is not the case for detrending, as the ACFs stay close to 1. The series therefore seem to be DS, i.e. they are likely to contain both deterministic and stochastic trends, as produced by a random walk with drift.
Unemployment also improves with FD, while detrending does not improve its ACFs, so it is likely to be a DS series as well. This is coherent with the visual inspection of any unemployment rate time series (they do not exhibit trend-like behavior).

Dickey-Fuller test

It provides a formal test to assess whether a series that seems to exhibit a trend is either:
- $DS$ : $Y_t = a + Y_{t-1} + \varepsilon_t$. This is a random walk with drift, which exhibits both a deterministic and a stochastic trend (generated by its building-block equation).
- $TS$ : $Y_t = at + Y_0 + \varepsilon_t$ (only deterministic trend)
The baseline equation is: $Y_t = \underbrace{a_0}_{Y_0} + at + \rho Y_{t-1} + \varepsilon_t$. Subtracting the lagged dependent variable from both sides:
- $\Delta Y_t = \underbrace{a_0}_{Y_0} + at + \underbrace{(\rho-1)}_{\gamma} Y_{t-1} + \varepsilon_t$, whose residual is WN.

We want to test the $DS$ hypothesis, which implies $\gamma=0$ and $\beta_1 = 0$ in:

$\Delta Y_t = \beta_0+ \underbrace{\beta_1t}_{at} + \underbrace{\gamma}_{\rho -1} Y_{t-1} + \underbrace{\epsilon_t}_{\varepsilon_t}$
If both restrictions in $H_0$: $\gamma = 0$ and $\beta_1 = 0$ hold, this implies a random walk with drift, i.e. a DS process.
The previous equation is the unrestricted model, while the model restricted by $H_0$ is:
- $\Delta Y_t = \beta_0 + \epsilon_t$
To find out whether $H_0$ holds (DS process), we can manually test the restricted against the unrestricted model using an F-statistic with Dickey-Fuller (DF) tabulated critical values. This test statistic is denoted $\phi_3$:

$\phi_3 = \frac{(SSR_{H_0} - SSR_{H_a})/r}{SSR_{H_a}/(T-k)}$

where $T$ is the number of observations used in the regression, $r$ is the number of restrictions (2) and $k$ is the number of parameters estimated in the unrestricted ($H_a$) model.

Example:

Note: This naive example IMPOSES/ASSUMES that the models’ residuals are WN. An augmented DF test relaxes this assumption by adding lags of $\Delta Y_t$ to the regression.

use http://www.stata-press.com/data/r13/gdp2.dta, clear
des 
tsset tq

** Ha: Unrestricted model
reg d.gdp_ln l.gdp_ln tq 
scalar SSRa   = e(rss)
scalar defree = e(df_r)

** Ho: Restricted model
reg d.gdp_ln 
scalar SSR0   = e(rss)

scalar phi3   = (SSR0-SSRa)*(defree)/(SSRa*2)
scalar di phi3

(Federal Reserve Economic Data, St. Louis Fed)


Contains data from http://www.stata-press.com/data/r13/gdp2.dta
 Observations:           236                  Federal Reserve Economic Data, St. Louis Fed
    Variables:             5                  24 Feb 2013 08:45
--------------------------------------------------------------------------------------------------------------------------------
Variable      Storage   Display    Value
    name         type    format    label      Variable label
--------------------------------------------------------------------------------------------------------------------------------
date            str10   %10s                  fed string date
gdp             double  %10.0g                real gross domestic product (GDP)
daten           float   %td                   numeric (daily) date
tq              float   %tq                   quarterly time variable
gdp_ln          double  %10.0g                natural log of real GDP
--------------------------------------------------------------------------------------------------------------------------------
Sorted by: tq


Time variable: tq, 1952q1 to 2010q4
        Delta: 1 quarter

      Source |       SS           df       MS      Number of obs   =       235
-------------+----------------------------------   F(2, 232)       =      1.91
       Model |  .000336844         2  .000168422   Prob > F        =    0.1500
    Residual |  .020430073       232  .000088061   R-squared       =    0.0162
-------------+----------------------------------   Adj R-squared   =    0.0077
       Total |  .020766917       234  .000088748   Root MSE        =    .00938

------------------------------------------------------------------------------
    D.gdp_ln | Coefficient  Std. err.      t    P>|t|     [95% conf. interval]
-------------+----------------------------------------------------------------
      gdp_ln |
         L1. |  -.0085604   .0143258    -0.60   0.551    -.0367856    .0196648
             |
          tq |   .0000519   .0001153     0.45   0.653    -.0001752     .000279
       _cons |   .0774632   .1144143     0.68   0.499    -.1479607     .302887
------------------------------------------------------------------------------

      Source |       SS           df       MS      Number of obs   =       235
-------------+----------------------------------   F(0, 234)       =      0.00
       Model |           0         0           .   Prob > F        =         .
    Residual |  .020766917       234  .000088748   R-squared       =    0.0000
-------------+----------------------------------   Adj R-squared   =    0.0000
       Total |  .020766917       234  .000088748   Root MSE        =    .00942

------------------------------------------------------------------------------
    D.gdp_ln | Coefficient  Std. err.      t    P>|t|     [95% conf. interval]
-------------+----------------------------------------------------------------
       _cons |   .0076518   .0006145    12.45   0.000     .0064411    .0088625
------------------------------------------------------------------------------

      phi3 =  1.9125697
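
As a check on the arithmetic, the two residual sums of squares reported above can be plugged into the $\phi_3$ formula, with $r = 2$ restrictions and $T - k = 232$ residual degrees of freedom in the unrestricted model:

\[
\phi_3 = \frac{(0.020766917 - 0.020430073)/2}{0.020430073/232} = \frac{0.000168422}{0.000088061} \approx 1.91
\]

This coincides with the $F(2, 232) = 1.91$ shown in the unrestricted regression output, because here the restricted model contains only a constant. What changes relative to a standard F test is not the statistic but the critical values, which must be taken from the DF tables.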

DF Critical value for $\phi_3$

Our test statistic does not reject the null. The evidence is therefore consistent with the logarithm of GDP following a unit-root process, i.e. it should be first-differenced rather than detrended in order to obtain a stationary process.


What if $\phi_3$ rejects the null?

The GDP trend behavior must come from either the unit root or the deterministic trend. So, let’s test $H_0: \gamma = 0$ on its own.
A rejection of the joint null by $\phi_3$ may imply that the series does not have a unit root and is thus trend stationary. So we must test $H_0: \gamma = 0$ after using $\phi_3$. If it is rejected, then there is no unit root and the trend behavior in GDP must be explained by a trend-stationary process.
The test statistic is just the t-stat of our $\hat\gamma$ (-0.6). The critical values are provided by DF ($\tau_{\tau}$ in our case: constant + time trend).

DF Critical value for $\gamma$ given alternative models

Had we rejected the null with $\phi_3$, our $\hat \gamma$ t-stat (-0.6) would still be well above (less negative than) the 10% critical value of -3.12 (model with constant and trend), thus we would not have rejected the unit root either (this is a DS series).
Note that this procedure is designed for a series with an apparent “trend” that we want to either detrend or first-difference in order to get a stationary process to be scrutinized.
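
The t-test on $\hat\gamma$ does not have to be looked up by hand. A minimal sketch, assuming the same `gdp2.dta` data as in the example: Stata's `dfuller` command with the `trend` option runs the DF regression with constant and trend and reports the test statistic next to Dickey-Fuller critical values; `lags(0)` keeps the non-augmented version and `regress` prints the underlying regression.

```stata
use http://www.stata-press.com/data/r13/gdp2.dta, clear
tsset tq

* DF regression with constant and trend, no augmentation lags;
* the regress option displays the underlying regression table
dfuller gdp_ln, trend lags(0) regress
```

The statistic reported for the lagged level should correspond to the t-stat of $\hat\gamma$ from the manual regression, and it is compared with the critical values printed by the command rather than with the usual t distribution.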

Unit-root test: DF tests with no apparent trend

Many economic time series may not exhibit a long-run trend ($\beta_1 = 0$). In that case, the Dickey-Fuller test can focus on testing for a unit root only: estimate your DF model with a constant (and no time trend) and check the $\tau_{\mu}$ table.
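
A hedged sketch of the constant-only case on simulated series without any deterministic trend: a stationary AR(1) and a random walk without drift. The AR coefficient of 0.5, the sample size, the seed and the variable names are assumptions chosen for illustration.

```stata
clear
set seed 777
set obs 300
gen t = _n
tsset t
gen e = rnormal()

* Stationary AR(1) with an assumed coefficient of 0.5, built recursively
gen y_ar = e in 1
replace y_ar = 0.5*y_ar[_n-1] + e in 2/l

* Random walk without drift: unit root, no deterministic trend
gen y_rw = sum(e)

* DF test with a constant only (the default specification of dfuller)
dfuller y_ar, lags(0)
dfuller y_rw, lags(0)
```

One would expect the unit-root null to be rejected for `y_ar` and not for `y_rw`, although any single simulated sample can deviate from that pattern.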

Augmented DF

We assumed that the residual of the DF equation was WN. This might not be the case.

General-to-specific approach:
- Start by including as many lags ($\Delta Y_{t-k}$) as reasonable and check whether the last one is significant.
- If the last lag is not significant, decrease the lag order.
- When a significant last lag is reached, check for WN residuals (correlogram, Portmanteau test).
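
The steps above can be sketched on a simulated series, so that the exercise on GDP below is left untouched. Everything here is an assumption made for illustration: the growth rate follows an AR(1) with coefficient 0.4, the maximum lag order is 3, and the retained lag order of 1 in the last step is only an example.

```stata
clear
set seed 4321
set obs 300
gen t = _n
tsset t
gen e = rnormal()

* Simulated growth rate following an AR(1) (coefficient 0.4 assumed)
gen dy = e in 1
replace dy = 0.2 + 0.4*dy[_n-1] + e in 2/l
* Level series: cumulated growth, so it contains a unit root
gen y = sum(dy)

* General to specific: start at 3 lags of D.y and go down,
* inspecting the t-stat of the longest lag in each table
forvalues p = 3(-1)1 {
    display _newline "ADF regression with `p' lag(s) of D.y"
    dfuller y, trend lags(`p') regress
}

* Once a lag order is retained (1 here, for illustration),
* check that the residuals look like WN
reg D.y L.y t LD.y
predict res, residuals
corrgram res, lags(10)
```

The lag order to keep is the largest one whose last lag is significant, provided the correlogram of the residuals then shows no remaining autocorrelation.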

Exercise 1

1. Apply the general-to-specific approach to replicate our TS vs DS test ($\phi_3$). You are expected to retrieve the following:

use http://www.stata-press.com/data/r13/gdp2.dta, clear
tsset tq

* Augmented DF
* {YOUR GENERAL TO SPECIFIC PROCEDURE HERE}
** Ha: Unrestricted model
reg d.gdp_ln l.gdp_ln tq  l1.D.gdp_ln

predict resid, r
corrgram resid , lags(10)

scalar SSRa = e(rss)
scalar defree = e(df_r)

** Ho: Restricted model
reg d.gdp_ln l1.D.gdp_ln 
scalar SSR0 = e(rss)


scalar phi3 = (SSR0-SSRa)*(defree)/(SSRa*2)
scalar di phi3

(Federal Reserve Economic Data, St. Louis Fed)


Time variable: tq, 1952q1 to 2010q4
        Delta: 1 quarter

      Source |       SS           df       MS      Number of obs   =       234
-------------+----------------------------------   F(3, 230)       =     13.46
       Model |  .003094136         3  .001031379   Prob > F        =    0.0000
    Residual |    .0176297       230  .000076651   R-squared       =    0.1493
-------------+----------------------------------   Adj R-squared   =    0.1382
       Total |  .020723836       233  .000088944   Root MSE        =    .00876

------------------------------------------------------------------------------
    D.gdp_ln | Coefficient  Std. err.      t    P>|t|     [95% conf. interval]
-------------+----------------------------------------------------------------
      gdp_ln |
         L1. |  -.0240497   .0136118    -1.77   0.079    -.0508693      .00277
             |
          tq |   .0001814   .0001097     1.65   0.099    -.0000347    .0003974
             |
      gdp_ln |
         LD. |   .3709295   .0622539     5.96   0.000     .2482686    .4935903
             |
       _cons |   .1979094   .1086162     1.82   0.070    -.0161005    .4119193
------------------------------------------------------------------------------

(2 missing values generated)

                                          -1       0       1 -1       0       1
 LAG       AC       PAC      Q     Prob>Q  [Autocorrelation]  [Partial autocor]
-------------------------------------------------------------------------------
1       -0.0330  -0.0330   .25822  0.6113          |                  |        
2        0.1154   0.1145   3.4278  0.1802          |                  |        
3       -0.0294  -0.0225   3.6344  0.3037          |                  |        
4       -0.0085  -0.0247   3.6516  0.4552          |                  |        
5       -0.1254  -0.1249   7.4438  0.1897         -|                  |        
6        0.0247   0.0155    7.591  0.2696          |                  |        
7       -0.0093   0.0218    7.612  0.3680          |                  |        
8       -0.0535  -0.0629   8.3105  0.4037          |                  |        
9        0.0734   0.0711   9.6338  0.3809          |                  |        
10       0.0497   0.0630   10.242  0.4195          |                  |        

      Source |       SS           df       MS      Number of obs   =       234
-------------+----------------------------------   F(1, 232)       =     34.89
       Model |  .002709314         1  .002709314   Prob > F        =    0.0000
    Residual |  .018014522       232  .000077649   R-squared       =    0.1307
-------------+----------------------------------   Adj R-squared   =    0.1270
       Total |  .020723836       233  .000088944   Root MSE        =    .00881

------------------------------------------------------------------------------
    D.gdp_ln | Coefficient  Std. err.      t    P>|t|     [95% conf. interval]
-------------+----------------------------------------------------------------
      gdp_ln |
         LD. |   .3611968   .0611479     5.91   0.000     .2407206     .481673
             |
       _cons |   .0049162   .0007421     6.62   0.000     .0034541    .0063784
------------------------------------------------------------------------------

      phi3 =  2.5102232
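
The hand-computed $\phi_3$ can be cross-checked with Stata's `test` command, which computes the joint F-statistic for the two restrictions directly from the unrestricted model. This is a sketch that reuses the regression shown above; the shortcut itself is a suggestion, not part of the original procedure.

```stata
use http://www.stata-press.com/data/r13/gdp2.dta, clear
tsset tq

* Unrestricted augmented model (same regression as above)
reg d.gdp_ln l.gdp_ln tq l1.D.gdp_ln

* Joint F-statistic for gamma = 0 and beta_1 = 0
test l.gdp_ln tq
```

Because both models are estimated on the same sample, the F-statistic from `test` should reproduce the $\phi_3$ value obtained from the two SSRs. The p-value printed by `test` comes from the standard F distribution and should be disregarded in favor of the DF tabulated critical values.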

Exercise 2

Given the previous answers, we should agree that there is a unit-root process behind the log of GDP. Calculate its first difference and perform the ADF test on it. Check that the FD does not exhibit a unit root, i.e. that it is a stationary time series.

Exercise 3

Identify whether Bolivia’s inflation rate is non-stationary: either trend stationary or a unit-root process. If it is not a stationary process, detrend or first-difference the series. Evaluate whether the transformed series is stationary.

ADF caveats

As in every statistical model, the baseline assumptions must hold.
Some time series may exhibit structural breaks in drifts and slopes (e.g. Bolivia’s inflation in 1985 or US macro indicators during the 1929 stock market crash). These will invalidate any conclusions based on standard assumptions.
When the ADF errors are not WN, an alternative is to correct the variance-covariance matrix of the estimated parameters by replacing $\sigma^2(X'X)^{-1}$, where $X$ is the data matrix containing the regressors of the ADF, with a robust estimator. The Phillips-Perron test does this using the Newey-West variance-covariance estimator. Still, the user must provide a guess for the number of lags.
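
A minimal sketch of the Phillips-Perron alternative in Stata, assuming the same `gdp2.dta` data used earlier in these notes. The value `lags(4)` is an arbitrary guess for the Newey-West truncation lag, chosen only to make the example concrete.

```stata
use http://www.stata-press.com/data/r13/gdp2.dta, clear
tsset tq

* Phillips-Perron test with constant and trend;
* lags(4) is an assumed Newey-West truncation lag, not a recommendation
pperron gdp_ln, trend lags(4) regress
```

It is worth re-running the test with a few different values of `lags()` to see whether the conclusion is sensitive to that choice.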