use http://www.stata-press.com/data/r13/gdp2.dta
des
li * in 1/4
Time series econometrics
Session 1 - Introduction and stationarity
Motivation
- In many fields, observational units are observed across time
- Fundamental applications in climate modeling, physics, medicine, biostatistics, demography, etc.
What is the longest time series you can imagine? The longest available?
…and it is actually white noise, at least over the few decades of gathered data!
\[E(u_t) = 0 \quad , \quad V(u_t) = k \quad , \quad Cov(u_t, u_{t-j}) = 0 \quad \forall j \neq 0\]
Perspectives
Physics
The most successful field in providing good approximations to “laws of nature”; thus, quantitative modeling there is often “structural” (deterministic). However, exploratory diagnostics can still be performed using agnostic TS techniques.
Time series
What do we have in mind?
What are the main patterns behind these series? … components
Why do we need to understand them? … main applications
Main applications 1/2
- Compact description of our data
- i.e., classical decomposition \[X_t = T_t + S_t + Y_t\]
- Interpretation
- i.e., patterns well explained by fundamentals
- Backtesting, Nowcasting, forecasting
- i.e., test strategies on past data, estimate COVID outbreaks in real time (nowcast), and predict the future
- And there is, of course, the Lucas critique…
Main applications 1/2
Nowcasting unemployment:
Unemployment explained by ‘instantaneous’ Google search activity
Main applications 2/2
- Control
- i.e. follow up economic policy goals (monetary, fiscal)
- Inference about the theoretical (population) model
- i.e., did a given policy intervention have an effect on the variable of interest?
- Simulation analysis
- i.e. Risk assessment by simulating the distribution of an outcome of interest. The probability of a riot, outbreak, natural disaster, etc.
Main applications 2/2
Simulating future climate shocks (El Niño events)
Time Series Components
- Defining and identifying the key components of any time series is the first step to its understanding.
- This is essential for statistical inference: one must account for the fact that time series observations are not independent and identically distributed.
Recall the classical linear model for a population (cross-sectional data):
\[ Y = X\beta + u\] and its OLS estimator, which is consistent (BLUE, etc.):
\[ plim_{n \rightarrow \infty} \hat \beta = \beta + \underbrace{plim_{n \rightarrow \infty} (X'X)^{-1}X'u}_0\]
This holds if and only if \(X\beta + u\) is Y’s data generating process (DGP). However, time series data may be ‘chaotic’, with few or no theoretically sound DGPs.
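A minimal simulation sketch of this consistency claim, under the assumption that the DGP really is \(Y = X\beta + u\) (single regressor, illustrative values):
# R code (illustrative): OLS estimates approach beta as n grows when the DGP is correct
set.seed(1)
beta <- 2
for (n in c(50, 500, 5000)) {
  x <- rnorm(n)
  u <- rnorm(n)                          # exogenous error, uncorrelated with x
  y <- beta * x + u                      # the assumed DGP
  print(c(n = n, beta_hat = unname(coef(lm(y ~ x))["x"])))
}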
We need an alternative and general DGP perspective, a non-structural one.
Think about the throw of a die. Its behavior can be predicted using the physics of the throw, but this is analytically and computationally challenging. This is a deterministic or structural approach: it is about identifying mathematical laws of nature given by ‘God’.
A more parsimonious approach is to exploit the die's probabilistic behavior. Given a sequence of 6 past throws \(\{X_t\}_{t=1,...,6}\), what is the most likely outcome for the seventh throw (\(X_7\))? (See the sketch below.)
- This is a stochastic approach, the one proposed by the standard time series literature
- In physics, you can think of quantum-theory as the analog, since sub-atomic particle states are not modeled deterministically, but stochastically.
This was a change of paradigm, and not everyone agreed with it… “God does not play dice” (A. Einstein).
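A toy sketch of this stochastic view (simulated throws of a fair die; purely illustrative):
# R code (illustrative): predict the next throw from the empirical distribution of past throws
set.seed(7)
throws <- sample(1:6, size = 6, replace = TRUE)   # six observed throws
table(throws) / length(throws)                    # empirical probabilities per face
# With so few throws these frequencies are noisy; as the sample grows they
# converge to the uniform law rep(1/6, 6), the natural prediction for X_7.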
Building blocks: TS decomposition
A first approach requires identifying the core components of any economic time series.
- Trend
- Cycle
- Seasonality
- Noise
Alternative approaches borrow from physics and decompose a TS in the frequency domain (Fourier and wavelet transforms).
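A minimal sketch of the classical decomposition \(X_t = T_t + S_t + Y_t\) on simulated monthly data (component forms and parameter values are illustrative):
# R code (illustrative): build a series from trend + seasonality + noise, then recover the parts
set.seed(1)
t   <- 1:120                         # ten years of monthly observations
T_t <- 0.05 * t                      # linear trend
S_t <- 2 * sin(2 * pi * t / 12)      # seasonal component, period 12
Y_t <- rnorm(120)                    # stationary noise
X   <- ts(T_t + S_t + Y_t, frequency = 12)
plot(decompose(X))                   # moving-average estimates of trend, seasonal, remainder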
Trend
tsset tq
tsline gdp
qui: graph export "gdp.svg", replaceuse http://www.stata-press.com/data/r13/unrate
des
li * in 1/4
Trend and cycles
use http://www.stata-press.com/data/r13/unrate
tsset tm // time series setup
tsline unrate
qui: graph export "unrate.svg", replaceuse http://www.stata-press.com/data/r13/mumps.dta
des
li * in 1/4
Seasonality
tsset tm
tsline mumps
qui: graph export "mumps.svg", replaceuse http://www.stata-press.com/data/r13/air2.dta
des
li * in 1/4
Seasonality and trend
tsset t
tsline air
qui: graph export "air.svg", replaceRemaining information may be considered noise… yet this may contain lots of information:
use http://www.stata-press.com/data/r13/stocks.dta
des
li * in 1/4
tsset t
tsline toyota nissan honda
qui: graph export "returns.svg", replaceSummary
- Most (macro)economic time series are built from trends and noise. This determines the scope of introductory time series econometrics (much more specific than the general statistical view).
- From there, natural extensions to cyclical and seasonal components can be introduced.
TS definition
- A TS is an ordered set, i.e., a sequence of random variables: \(\{X_t\} = \{X_1, X_2, ...,X_t\}\)
- The latter is a random vector; each element is random, or stochastic
- A realization of this vector writes: \[x_t = \{x_1, x_2, ...,x_t\}\] Each element is known. This is what we observe.
A TS model
- A TS model specifies the joint probability distribution of the sequence \(\{X_t\}\). This may be a reference distribution, not necessarily the true one.
- Example: Toyota’s stock return at any \(t\) may be assumed to follow a normal distribution, with cdf \(\Phi\): \[ R_j \sim N(\mu, \sigma^2)\]
Thus, if \(R_j\) is uncorrelated with any lag \(R_{j-k}\) (which, for jointly normally distributed variables, implies independence), then the random time series vector \(\{R_t\}\) has a multivariate normal distribution whose joint cdf factorizes:
\[\Phi(X_1, X_2, ...,X_t) = \prod_{j=1}^t \Phi(X_j)\]
White noise
The latter is the simplest case. It corresponds to how noise carrying no information may be represented, and is denoted white noise. Formally:
A given random variable at time \(t\), \(X_t\), has zero mean, constant variance and is independent of \(X_j\), \(j \neq t\). \[X_t \sim WN(0,\sigma^2)\]
Under this setup, \(X_t\) can follow any distribution as long as the mean, variance and independence assumptions hold for every \(t\); if, in addition, that distribution is the same for every \(t\), the sequence is iid.
Simulating white noise
- Example: \(X_j \sim N(0,2)\) and \(Y_j +5 \sim P(\lambda = 5)\), both iid.
# R code
t = 100
X = rnorm(t,0,2) # From a normal distribution
Y = rpois(t,5) - 5 # from a Poisson distribution
- Normally distributed WN
# R code
ts.plot(X)
summary(X)
    Min.  1st Qu.   Median     Mean  3rd Qu.     Max.
-4.19073 -1.34540 -0.26374 -0.07843  1.29029  5.07545
- Poisson distributed WN, skewed distribution
# R code
ts.plot(Y)
summary(Y)
  Min. 1st Qu.  Median    Mean 3rd Qu.    Max.
 -4.00   -2.00    0.00    0.25    2.00    6.00
Exercises
- Show that \(Y_j\) is WN.
- Consider the cumulated historical mean \(Z_t = \sum_{j=1}^t X_j/t\), where \(X_j \sim WN(0, \sigma^2)\). Is \(\{Z_t\}\) WN?
- Consider the scaled cumulated mean \(\tilde{Z}_t = k \sum_{j=1}^t X_j/t\), where \(X_j \sim WN(0, \sigma^2)\). Find \(k\) such that \(\tilde{Z}_t\) has constant variance.
- From your previous answer, now assume that \(X_j\) follows a normal distribution, \(N(0, \sigma^2)\): what is the distribution of \(\tilde{Z}_t\), given the distribution of \(Z_t\)?
Solution
- \(Y_j + 5 \sim P(\lambda=5)\)
Thus \(E(Y_j + 5) = 5\), so \(E(Y_j) = 0\). Moreover \(V(Y_j+5) = \lambda = 5\) (a property of the Poisson distribution: its expected value equals its variance), hence \(V(Y_j) = 5\).
Given that \(Y_j\) is generated as a sequence of independent random variables with zero mean and constant variance, it is WN: \(Y_j \sim WN(0,5)\).
- \(E(Z_t) = 0\) and \(V(Z_t) = V\left(\sum_{j=1}^t X_j\right)/t^2 = t\sigma^2/t^2 = \sigma^2/t\). This is not constant, as it decreases with \(t\), so \(Z_t\) cannot be WN.
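A quick simulation check of this result (illustrative; \(\sigma = 2\) chosen arbitrarily):
# R code (illustrative): the cumulated mean Z_t has variance sigma^2 / t, which shrinks with t
set.seed(123)
sigma <- 2
X <- rnorm(1000, 0, sigma)           # X_j ~ WN(0, sigma^2), normal for convenience
Z <- cumsum(X) / seq_along(X)        # Z_t = (X_1 + ... + X_t) / t
ts.plot(Z)                           # fluctuations visibly die out as t grows
sigma^2 / c(10, 100, 1000)           # theoretical V(Z_t) at t = 10, 100, 1000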
Stationarity
Having the same cumulative distribution function (cdf) irrespective of the time location \(t\) leads to very convenient analytical properties. The exercises above are meant to motivate this in the particular WN case.
Let \(F(X_1, X_2, ...,X_t)\) be the cdf of a time series denoted \(\{X_t\}\); the series is said to be strictly stationary if the cdfs of two comparable windows are the same:
\[F(X_1, X_2, ...,X_t) = F(X_{1+k}, X_{2+k}, ...,X_{t+k}) \quad ; \quad k \neq 0\]
- The moments \(E(X^k)\) for \(k = 1,2,...\) (equivalently, the moment generating function) may be used to fully characterize a cdf. Empirically, verifying that the first, second, third… moments are all time-invariant would be required for strict stationarity. This is obviously impractical.
Stationarity - weak stationarity
Instead, we can restrict our analytical investigation to first- and second-order moments only, i.e., the mean and the covariances (the variance and the covariances with other time periods). This implies three conditions.
Having identical first and second moments over two arbitrarily shifted windows makes a series weakly stationary. This brings convenience to analytical manipulations. Formally, weak stationarity writes:
\(E(X_1, X_2, ...,X_t) = E(X_{1+k}, X_{2+k}, ...,X_{t+k})\)
or \(E(X_t) = E(X_{t+k}) \quad k \neq 0\)
\(V(X_1, X_2, ...,X_t) = V(X_{1+k}, X_{2+k}, ...,X_{t+k})\)
or \(\quad V(X_t) = V(X_{t+k}) ; \quad k \neq 0\)
Covariance with other time periods -> Autocovariance and autocorrelation
\(Cov(X_{t-k},X_{t}) = Cov(X_{j-k},X_{j})\); in compact notation, denote the autocovariance on the left-hand side as \(\gamma(t, t-k)\). When it depends only on the lag \(k\), write
\(\gamma(k) = E[(X_t - \mu)(X_{t-k} - \mu)]\), and similarly in terms of autocorrelation \(\rho_{k}\):
\(\rho_{k} = \gamma(k)/\gamma(0)\)
The \(k\)-th order time dependence (autocovariance) must be identical irrespective of the window (indexed by \(t\) or \(j\)).
Weak stationarity - in a nutshell
- \(E(X_t) = \mu\) it does not depend on \(t\) (first moment)
- \(V(X_t) = \sigma^2\) it does not depend on \(t\) (second moment)
- \(\gamma(t, t-k) = \gamma(j,j-k)\) it does not depend on the time window (\(t\) or \(j\)) either (second moment), although it may depend on the size of lag \(k\). This relates to the memory of the process.
Exercises
- From the definition of the Pearson correlation coefficient (population, not sample), show that:
\(\rho_k =\gamma(k)/\gamma(0)\)
- Let \(X_t\) be WN, and \(Z_t = \theta Z_{t-1} + X_t\), where \(|\theta| < 1\). Is \(Z_t\) WN?
- Is \(Z_t\) (weakly) stationary?
Social Sciences
Much less successful in explaining social phenomena, yet still useful: structural models (“social laws”) are well complemented by agnostic TS techniques, and semi-structural models are also employed.