- What is a time series?
- Stationarity
- ACF/correlogram
- unit roots
- ADF test
- trend vs difference stationarity
- Random Walks
November 29, 2016
Time series data are not a random sample, which means we can't make valid inferences by simply relying on the classical linear regression assumptions.
We're looking at data that is generated from some process, so we want to model that process to squeeze out systematic variations in the data.
Remember our overall task: use variation in \(x\) to understand variation in \(y\). OLS regression on time series data will lead us astray if we don't account for certain possibilities.
Autocorrelation is the big problem we're facing in time series! It means our statistical inferences won't be valid until we appropriately tweak our models.
Remember, our goal is to try to understand variation in Y by looking at variation in X. Normally if we see X and Y increasing together, that means X and Y are highly correlated. But now that comovement might be simply because of a time trend.
Removing this regularity in the data means we can ask about variation in X and Y around the trend.
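As a rough illustration of the point above, here is a minimal sketch in R using made-up data. The two series are unrelated by construction, but both trend upward; the names `time_idx`, `x_detrended`, and `y_detrended`, the seed, and the trend slopes are all assumptions chosen for the example. Removing a linear trend is only one simple way to detrend.

```r
set.seed(1)                      # arbitrary seed, only for reproducibility
n <- 200
time_idx <- 1:n

# Two unrelated series that both trend upward (made-up data).
x <- 0.5 * time_idx + rnorm(n, sd = 5)
y <- 0.3 * time_idx + rnorm(n, sd = 5)

# The raw correlation is dominated by the shared time trend.
cor_raw <- cor(x, y)

# Remove a fitted linear trend from each series and keep the residuals.
x_detrended <- resid(lm(x ~ time_idx))
y_detrended <- resid(lm(y ~ time_idx))

# Correlation of the variation around the trend.
cor_detrended <- cor(x_detrended, y_detrended)

print(c(raw = cor_raw, detrended = cor_detrended))
```

The raw correlation should come out high even though `x` and `y` have nothing to do with each other, while the correlation around the trend should be close to zero.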
Another regularity is seasonal variation.
If a series is stationary:
A process is stationary if the mean, variance, and autocorrelation don't change over time.
\[E(x_t) = \mu_x\]
\[\operatorname{var}(x_t) = \sigma_x^2\]
\[\operatorname{cov}(x_t, x_{t-j}) = f(j)\]
The covariance depends only on the lag \(j\); it does not depend on \(t\).
Data with a strong trend may be trend stationary but won't be stationary without (at least) accounting for that trend.
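A quick way to see this is to compare the mean of a trending series over its first and second halves, before and after removing the trend. This is only a sketch with simulated data; the series, the seed, and the names `first_half`, `second_half`, and `x_detrended` are assumptions made for illustration.

```r
set.seed(42)                     # arbitrary seed
n <- 200
time_idx <- 1:n

# Linear trend plus stationary noise: a trend-stationary series (made-up data).
x <- 2 + 0.1 * time_idx + rnorm(n)

first_half  <- 1:(n / 2)
second_half <- (n / 2 + 1):n

# The mean of the raw series shifts over time, so it is not stationary as is.
raw_means <- c(mean(x[first_half]), mean(x[second_half]))

# After removing the fitted linear trend, the mean is stable around zero.
x_detrended <- resid(lm(x ~ time_idx))
detrended_means <- c(mean(x_detrended[first_half]), mean(x_detrended[second_half]))

print(rbind(raw = raw_means, detrended = detrended_means))
```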
The ACF (correlogram) looks at the correlation of x with lag(x,1:something), not just lag(x,1).
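The idea of correlating a series with many of its own lags can be sketched in R. The example below simulates an AR(1) series (an assumption made purely to have autocorrelated data), computes the correlation at lags 1 to 5 by hand with a hypothetical helper `lag_cor()`, and compares it with the built-in `acf()` from the `stats` package. The two will differ slightly because `acf()` uses the full-sample mean and divides by `n`.

```r
set.seed(123)                    # arbitrary seed
n <- 500

# Simulated AR(1) series with coefficient 0.8 (made-up data).
x <- as.numeric(arima.sim(model = list(ar = 0.8), n = n))

# Hypothetical helper: correlation between x_t and x_{t-j}.
lag_cor <- function(series, j) {
  m <- length(series)
  cor(series[(j + 1):m], series[1:(m - j)])
}

# Correlation with lags 1 through 5, not just lag 1.
manual <- sapply(1:5, function(j) lag_cor(x, j))

# Built-in autocorrelation function; drop the lag-0 entry (always 1).
acf_out <- acf(x, lag.max = 5, plot = FALSE)
builtin <- as.numeric(acf_out$acf)[-1]

print(round(rbind(manual = manual, builtin = builtin), 3))
```

Calling `acf(x)` with the default `plot = TRUE` draws the correlogram itself.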
Extra credit: Based on a series of coin tosses, I will give you 1 extra credit point for each H, but -1 point for each T.
Questions:
library(magrittr)  # provides the %>% pipe
rbinom(100,1,0.5) %>% ifelse(.,1,-1) %>% cumsum()  # heads = +1, tails = -1, running total
[1] 1 2 1 2 3 2 1 0 -1 0 1 2 1 0 -1 0 -1 0 -1 0 1 0 1
[24] 0 1 2 3 2 1 2 3 2 3 2 1 0 -1 0 1 0 -1 0 1 2 3 2
[47] 3 2 3 4 3 2 1 2 3 4 5 4 3 2 3 2 1 2 3 4 5 4 5
[70] 6 7 6 7 8 9 8 9 10 9 10 11 12 11 12 11 12 13 12 11 10 11 10
[93] 9 10 11 12 13 12 11 12

[1] 1 2 1 2 1 2 3 2 3 2 3 2 3 2 3 4 5 6 5 4 5 4 3
[24] 4 3 2 3 2 1 0 1 0 1 0 -1 0 1 2 1 0 1 2 3 2 3 2
[47] 3 4 5 6 5 4 3 2 3 4 3 4 3 2 3 4 3 2 1 0 1 0 -1
[70] -2 -3 -4 -5 -4 -5 -6 -5 -4 -3 -2 -3 -2 -3 -4 -3 -4 -5 -4 -3 -2 -1 -2
[93] -1 -2 -3 -4 -3 -4 -3 -2
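The coin-toss game is a random walk: each total is the previous total plus a random step of +1 or -1. One way to see why such a series is not stationary is to simulate many independent games and look at how spread out the totals are at different points in time. The sketch below uses base R only; the number of games, the seed, and the variable names are assumptions for the example.

```r
set.seed(2016)                   # arbitrary seed
n_steps <- 100                   # tosses per game
n_walks <- 2000                  # number of independent games

# Each column holds the +1 / -1 outcomes of one game.
steps <- matrix(sample(c(-1, 1), n_steps * n_walks, replace = TRUE),
                nrow = n_steps)

# Running totals: each column is one random walk (n_steps x n_walks matrix).
walks <- apply(steps, 2, cumsum)

# Variance across games after 25 and after 100 tosses.
var_at_25  <- var(walks[25, ])
var_at_100 <- var(walks[100, ])

print(c(var_at_25 = var_at_25, var_at_100 = var_at_100))
```

For this process the variance of the total after \(t\) tosses equals \(t\), so the spread keeps growing with time, which violates the constant-variance condition for stationarity.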
To test for the presence of a "unit root", try the regression:
\[\Delta x_t = \beta_0 + \beta_1 t + \beta_2 x_{t-1} + u_t\]
or, writing out the difference,
\[(x_t - x_{t-1}) = \beta_0 + \beta_1 t + \beta_2 x_{t-1} + u_t\]
Our null hypothesis (\(H_0: \beta_2 = 0\)) is that \(x_t\) is just \(x_{t-1}\) plus some random component, i.e. that the series has a unit root.
The other terms test for presence of a trend (\(\beta_1\)) or "drift" (\(\beta_0\)).
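The regression can be run directly with `lm()`. Below is a minimal sketch on a simulated random walk, which has a unit root by construction; the names `dx`, `x_lag`, and `trend` and the seed are assumptions for the example. Note that under the null the t statistic on \(\beta_2\) does not follow the usual t distribution, so it has to be compared against Dickey-Fuller critical values rather than the standard table.

```r
set.seed(7)                      # arbitrary seed
n <- 200
x <- cumsum(rnorm(n))            # simulated random walk (made-up data)

dx    <- diff(x)                 # Delta x_t = x_t - x_{t-1}, for t = 2..n
x_lag <- x[-n]                   # x_{t-1}
trend <- 2:n                     # time index t

# Delta x_t = beta_0 + beta_1 * t + beta_2 * x_{t-1} + u_t
fit   <- lm(dx ~ trend + x_lag)
coefs <- summary(fit)$coefficients

beta2_hat <- coefs["x_lag", "Estimate"]
beta2_t   <- coefs["x_lag", "t value"]   # compare to Dickey-Fuller critical values

print(coefs)
```

In practice, packaged versions of the test such as `adf.test()` in the `tseries` package or `ur.df()` in the `urca` package handle the critical values and the extra lagged differences of the augmented test.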
The ADF test tests for unit roots. The alternatives are:
- Explosive change (\(x_t\) multiplies \(x_{t-1}\) in some way)
- Convergence (the effect of \(x_{t-1}\) goes away after a while)
But if we have a unit root in our series, then we need to do something about it before we can use that series to answer other questions.
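One common way of "doing something about it" is to work with the first difference of the series instead of its level. The sketch below, again on a simulated random walk with assumed names and an arbitrary seed, compares the lag-1 autocorrelation of the level with that of the first difference. It illustrates differencing only; whether differencing or detrending is appropriate depends on whether the series is difference stationary or trend stationary.

```r
set.seed(99)                     # arbitrary seed
n <- 500
x  <- cumsum(rnorm(n))           # simulated random walk (made-up data)
dx <- diff(x)                    # first difference: x_t - x_{t-1}

# Lag-1 autocorrelation of the level and of the difference
# (element 1 of $acf is lag 0, element 2 is lag 1).
acf_level <- as.numeric(acf(x,  lag.max = 1, plot = FALSE)$acf)[2]
acf_diff  <- as.numeric(acf(dx, lag.max = 1, plot = FALSE)$acf)[2]

print(c(level = acf_level, difference = acf_diff))
```

The level should show a lag-1 autocorrelation close to 1, while the differenced series should look like uncorrelated noise.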