Univariate time-series models are a class of specifications where one attempts to model and to predict financial variables using only information contained in their own past values and possibly current and past values of an error term.
Time-series models are usually a-theoretical, implying that their construction and use are not based upon any underlying theoretical model of the behaviour of the variable.
A stationary series is one whose statistical properties and behaviour do not depend on the time at which the series is observed.
A strictly stationary process is one where, for any \(t_1, t_2, \dots, t_T \in \mathbb{Z}\), any \(k \in \mathbb{Z}\) and \(T = 1, 2, \dots\) \[F_{y_{t_1},y_{t_2},\dots,y_{t_T}}(y_{t_1}, y_{t_2}, \dots, y_{t_T})=F_{y_{{t_1}+k},y_{{t_2}+k},\dots,y_{{t_T}+k}}(y_{t_1}, y_{t_2}, \dots, y_{t_T})\] where F denotes the joint distribution function of the set of random variables.
A series is strictly stationary if the distribution of its values remains the same as time progresses, implying that the probability that y falls within a particular interval is the same now as at any time in the past or the future.
If a series satisfies the following three conditions for \(t = 1, 2, \dots, \infty\), it is said to be weakly or covariance stationary: (1) \(E(y_t)=\mu\), (2) \(\text{var}(y_t)=\sigma^2<\infty\), and (3) \(\text{cov}(y_t, y_{t-s})=\gamma_s\), a function of the lag s only.
A stationary process therefore has a constant mean, a constant variance and a constant autocovariance structure. The autocovariances determine how y is related to its previous values, and for a stationary series they depend only on the difference between \(t_1\) and \(t_2\); for example, the covariance between \(y_t\) and \(y_{t-1}\) is the same as the covariance between \(y_{t-10}\) and \(y_{t-11}\).
The moment \[E\big[(y_t-E(y_t))(y_{t-s}-E(y_{t-s}))\big]=\gamma_s, \quad s=0,1,2,\dots\]
is known as the autocovariance function. When \(s = 0\), the autocovariance at lag zero is obtained, which is the autocovariance of \(y_t\) with \(y_t\), i.e., the variance of y. These covariances, \(γ_s\), are also known as autocovariances since they are the covariances of y with its own previous values. The autocovariances are not a particularly useful measure of the relationship between y and its previous values, however, since the values of the autocovariances depend on the units of measurement of \(y_t\), and hence the values that they take have no immediate interpretation.
It is thus more convenient to use the autocorrelations, which are the autocovariances normalised by dividing by the variance \[\tau _s=\frac{\gamma_s}{\gamma_0}\;\;\;s=0,1,2,...\] The series \(\tau_s\) now has the standard property of correlation coefficients that the values are bounded to lie between ±1. In the case that s = 0, the autocorrelation at lag zero is obtained, i.e., the correlation of \(y_t\) with \(y_t\), which is of course 1. If \(\tau_s\) is plotted against s = 0,1,2, …, a graph known as the autocorrelation function (acf) or correlogram is obtained.
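As a concrete illustration, the sketch below computes the sample autocovariances and autocorrelations exactly as defined above (Python, assuming only numpy is available; the simulated series and the maximum lag are arbitrary choices, and `sample_acf` is a name introduced here for illustration).

```python
import numpy as np

def sample_acf(y, max_lag):
    """Sample autocovariances gamma_s and autocorrelations tau_s = gamma_s / gamma_0."""
    y = np.asarray(y, dtype=float)
    T = len(y)
    dev = y - y.mean()
    # gamma_s = (1/T) * sum_t (y_t - ybar)(y_{t-s} - ybar)
    gamma = np.array([np.sum(dev[s:] * dev[:T - s]) / T for s in range(max_lag + 1)])
    tau = gamma / gamma[0]
    return gamma, tau

# Example: correlogram of an arbitrary simulated series
rng = np.random.default_rng(0)
y = rng.standard_normal(200).cumsum() * 0.1 + rng.standard_normal(200)
gamma, tau = sample_acf(y, max_lag=5)
print("autocorrelations tau_0..tau_5:", np.round(tau, 3))
```

Plotting `tau` against the lag would give the correlogram described above; here it is simply printed.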
Remark:
A white noise process (for disturbance errors) is a stationary process, with no discernible (perceptible) structure:
(1) \(\mathbb E(u_t)=\mu_u, \;\;t =1,2,\dots,\infty\): the white noise process has a constant mean
(2) \(\text{var}(u_t)=\sigma^2_u < \infty\): the white noise process has a constant variance
(3) \(\gamma_s=\left\{\begin{matrix} \sigma^2_u &\text{if}\; s=0 \\ 0 & \text{if}\; s \neq 0 \end{matrix}\right.\)
Thus a white noise process has zero autocovariances except at lag zero; another way of stating this condition is that each observation is uncorrelated with all other values in the sequence.
Hence the autocorrelation function for a white noise process will be zero apart from a single peak of 1 at \(s = 0\): \[\tau_s=\frac{\gamma_s}{\gamma_0}=\left\{\begin{matrix} 1 &\text{if}\; s=0 \\ 0 & \text{if}\; s \neq 0 \end{matrix}\right.\] If \(\mu_u=0\) and the three conditions hold, the process is known as zero-mean white noise.
Furthermore, if it is assumed that \(u_t\) is distributed normally, then the sample autocorrelation coefficients are also approximately normally distributed:
\[\hat{\tau}_s\sim \mathbb N\left(0, \frac{1}{T}\right)\] where T is the sample size and \(\hat{\tau}_s\) denotes the autocorrelation coefficient at lag s estimated from a sample.
This result can be used to test, for a given lag s, \[\begin{matrix} H_0: \tau_s=0\\ H_A: \tau_s\neq 0 \end{matrix}\] with a \(95\%\) non-rejection region of \((-1.96\times\frac{1}{\sqrt{T}},\;1.96\times\frac{1}{\sqrt{T}})\)
If the sample autocorrelation coefficient \(\tau_s\) falls outside this region for a given value of s, then the null hypothesis that the true value of the coefficient at that lag s is zero is rejected.
Alternatively, we could calculate the test statistic \(TS=\hat{\tau}_s\times\sqrt T\) and compare it with the standard normal distribution.
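A minimal sketch of this individual-coefficient test (Python/numpy; the simulated white-noise series, sample size and number of lags are illustrative choices):

```python
import numpy as np

rng = np.random.default_rng(1)
T = 100
u = rng.standard_normal(T)                  # simulated white noise

# sample autocorrelation coefficients at lags 1..5
dev = u - u.mean()
gamma0 = np.sum(dev * dev) / T
tau_hat = np.array([np.sum(dev[s:] * dev[:T - s]) / T / gamma0 for s in range(1, 6)])

bound = 1.96 / np.sqrt(T)                   # 95% non-rejection region is (-bound, +bound)
for s, t in enumerate(tau_hat, start=1):
    # equivalently, TS = tau_hat * sqrt(T) could be compared with +/-1.96
    print(f"lag {s}: tau_hat = {t:+.3f}, reject H0: tau_s = 0? {abs(t) > bound}")
```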
It is also often of interest to test the joint null hypothesis that the first m autocorrelation coefficients are simultaneously zero. The Box–Pierce Q-statistic does this: \[Q=T\sum^{m}_{k=1}\hat{\tau}_k^2\] which is asymptotically distributed as \(\chi^2_m\) under the null hypothesis. As for any joint hypothesis test, only one autocorrelation coefficient needs to be statistically significant for the test to result in a rejection. However, the Box–Pierce test has poor small sample properties, implying that it leads to the wrong decision too frequently for small samples. A variant of the Box–Pierce test with better small sample properties has been developed; the modified statistic is known as the Ljung–Box (1978) statistic.
\[TS = Q^*=T(T+2)\sum^{m}_{k=1}\frac{\hat{\tau}_k^2}{T-k}\sim\chi^2_m\] It should be clear from the form of the statistic that asymptotically (that is, as the sample size T increases towards infinity), the \((T + 2)\) and \((T - k)\) terms in the Ljung–Box formulation will cancel out, so that the statistic is equivalent to the Box–Pierce test. This statistic is very useful as a portmanteau (general) test of linear dependence in time series.
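The following sketch computes both Q and Q* directly from the formulas above (Python, assuming numpy and scipy are available; the function name and simulated input are my own illustrations). Users of statsmodels could obtain similar output from its `acorr_ljungbox` routine, but the manual version makes the construction explicit.

```python
import numpy as np
from scipy import stats

def portmanteau_tests(y, m):
    """Box-Pierce Q and Ljung-Box Q* for the first m sample autocorrelations."""
    y = np.asarray(y, dtype=float)
    T = len(y)
    dev = y - y.mean()
    gamma0 = np.sum(dev * dev) / T
    tau = np.array([np.sum(dev[k:] * dev[:T - k]) / T / gamma0 for k in range(1, m + 1)])

    Q = T * np.sum(tau ** 2)                                             # Box-Pierce
    Q_star = T * (T + 2) * np.sum(tau ** 2 / (T - np.arange(1, m + 1)))  # Ljung-Box
    # p-values from the chi-squared distribution with m degrees of freedom
    return Q, stats.chi2.sf(Q, df=m), Q_star, stats.chi2.sf(Q_star, df=m)

rng = np.random.default_rng(2)
print(portmanteau_tests(rng.standard_normal(100), m=5))
```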
Suppose that a researcher had estimated the first five autocorrelation coefficients using a series of length 100 observations, and found them to be
| Lag | 1 | 2 | 3 | 4 | 5 |
|---|---|---|---|---|---|
| Autocorrelation coefficient | 0.207 | -0.013 | 0.086 | 0.005 | -0.022 |
Test each of the individual correlation coefficients for significance, and test all five jointly using the Box–Pierce and Ljung–Box tests.
Constructing the 95% non-rejection region \((-1.96\times\frac{1}{\sqrt T}, 1.96\times\frac{1}{\sqrt T})\), where T = 100 in this case, the decision rule is to reject the null hypothesis that a given coefficient is zero whenever the coefficient lies outside the range (−0.196, 0.196). For this example, it would be concluded that only the first autocorrelation coefficient is significantly different from zero at the 5% level.
Box–Pierce and Ljung–Box tests: turning to the joint tests, the null hypothesis is that all of the first five autocorrelation coefficients are jointly zero, i.e.
\[H_0: \tau_{1}=0, \tau_{2}=0, \tau_{3}=0, \tau_{4}=0, \tau_{5}=0\] The test statistics for the Box–Pierce and Ljung–Box tests are given, respectively, as
\[Q=T\sum^{m}_{k=1}\hat{\tau}_k^2=100\times\left(0.207^2 + (-0.013)^2+ 0.086^2 + 0.005^2 + (-0.022)^2\right)=5.09\] \[Q^*=T(T+2)\sum^{m}_{k=1}\frac{\hat{\tau}_k^2}{T-k}=100\times102\times\left(\frac{0.207^2}{100-1} + \frac{(-0.013)^2}{100-2}+ \frac{0.086^2}{100-3} + \frac{0.005^2}{100-4} + \frac{(-0.022)^2}{100-5}\right)=5.26\] Both statistics are well below the 5% critical value of a \(\chi^2_5\) distribution (approximately 11.07), so the joint null hypothesis that the first five autocorrelation coefficients are zero is not rejected.
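Plugging the five estimated coefficients into the formulas reproduces these figures; a quick arithmetic check in Python (using only the numbers given above):

```python
import numpy as np

tau_hat = np.array([0.207, -0.013, 0.086, 0.005, -0.022])
T, m = 100, 5
k = np.arange(1, m + 1)

Q = T * np.sum(tau_hat ** 2)                            # Box-Pierce
Q_star = T * (T + 2) * np.sum(tau_hat ** 2 / (T - k))   # Ljung-Box
print(round(Q, 2), round(Q_star, 2))                    # approx. 5.09 and 5.26
```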
The partial autocorrelation function (PACF) measures the correlation between \(y_t\) and \(y_{t-k}\) after removing the effects of the intermediate lags; it is useful for telling the difference between an AR process and an ARMA process.
In the case of an AR(p), there are direct connections between \(y_t\) and \(y_{t-s}\) only for s ≤ p. So for an AR(p), the theoretical PACF will be zero after lag p.
In the case of an MA(q), if it is invertible (the roots of the characteristic equation \(\theta(z)=0\) lie outside the unit circle), it can be written as an AR(∞), so there are direct connections between \(y_t\) and all of its previous values. For an MA(q), the theoretical PACF will therefore be geometrically declining.
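A sketch of this contrast (Python, assuming statsmodels is available; `ArmaProcess` and `pacf` are statsmodels utilities, and the coefficient values are arbitrary illustrations): the sample PACF of a simulated AR(2) should cut off after lag 2, while that of an invertible MA(1) should decline geometrically in magnitude.

```python
import numpy as np
from statsmodels.tsa.arima_process import ArmaProcess
from statsmodels.tsa.stattools import pacf

np.random.seed(3)

# ArmaProcess takes the lag polynomials phi(L) and theta(L), including the leading 1
ar2 = ArmaProcess(ar=[1, -0.5, -0.2], ma=[1])   # y_t = 0.5 y_{t-1} + 0.2 y_{t-2} + u_t
ma1 = ArmaProcess(ar=[1], ma=[1, 0.7])          # y_t = u_t + 0.7 u_{t-1}

y_ar = ar2.generate_sample(nsample=2000)
y_ma = ma1.generate_sample(nsample=2000)

# first entry is lag 0 (always 1)
print("AR(2) sample PACF:", np.round(pacf(y_ar, nlags=5), 2))   # roughly zero beyond lag 2
print("MA(1) sample PACF:", np.round(pacf(y_ma, nlags=5), 2))   # declines gradually
```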
By combining the AR(p) and MA(q) models, an ARMA(p, q) model is obtained. Such a model states that the current value of some series y depends linearly on its own previous values plus a combination of current and previous values of a white noise error term. The model could be written
as follows. For \(\phi(L)=1-\phi_1L-\phi_2L^2-\dots-\phi_pL^p\)
and \(\theta(L)= 1+\theta_1L+\theta_2L^2+\dots+ \theta_qL^q\), we have \[\phi(L)y_t=\mu+\theta(L)u_t\] or, equivalently, \[y_t=\mu + \phi_1y_{t-1}+\phi_2y_{t-2}+\dots+\phi_py_{t-p}+\theta_1u_{t-1}+\theta_2u_{t-2}+\dots+ \theta_qu_{t-q}+u_t\]
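As an illustration (Python, assuming statsmodels is installed; the parameter values 0.2, 0.6 and 0.3 are arbitrary), an ARMA(1,1) series can be simulated from this equation and then estimated as an ARIMA(1, 0, 1):

```python
import numpy as np
from statsmodels.tsa.arima.model import ARIMA

np.random.seed(42)
T = 500
u = np.random.standard_normal(T)

# simulate y_t = 0.2 + 0.6 y_{t-1} + u_t + 0.3 u_{t-1}
y = np.zeros(T)
for t in range(1, T):
    y[t] = 0.2 + 0.6 * y[t - 1] + u[t] + 0.3 * u[t - 1]

# an ARMA(1,1) is an ARIMA model with p=1, d=0, q=1
res = ARIMA(y, order=(1, 0, 1)).fit()
print(res.params)   # estimated constant, AR(1), MA(1) and error-variance parameters
```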
Stationary series: \(y_t\) is stationary when \[\left\{\begin{matrix} E(y_t)=\mu\\ \text{var}(y_t)=\sigma^2<\infty\\ \text{cov}(y_t, y_{t-s})=\gamma_s \end{matrix}\right.\] i.e. it has a constant mean, a constant variance, and autocovariances that depend only on the lag s.
In the classical linear regression model (CLRM), a significant estimated coefficient is interpreted as evidence that x genuinely affects y. This interpretation can break down when the series are non-stationary. Non-stationarity may arise from:
\(\Rightarrow\) a trend
\(\Rightarrow\) seasonality
\(\Rightarrow\) a random walk
If y and x are independent non-stationary series and y is regressed on x, a spurious regression can result: in roughly 16% of cases \(R^2\geq 0.5\), and the t-ratio \(TS=\frac{\hat{\beta}}{SE(\hat{\beta})}\) satisfies \(|TS|>2\) in roughly 98% of cases, even though there is no true relationship between the two series. The simulation sketch below illustrates this.
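A minimal Monte Carlo sketch of the spurious regression problem (Python/numpy; the sample size and number of replications are arbitrary, and the exact shares depend on T, but both will be far larger than the nominal 5%):

```python
import numpy as np

rng = np.random.default_rng(0)
T, reps = 200, 2000
high_r2 = 0
big_t = 0

for _ in range(reps):
    # two independent random walks (non-stationary, unrelated by construction)
    y = rng.standard_normal(T).cumsum()
    x = rng.standard_normal(T).cumsum()

    # OLS of y on a constant and x
    X = np.column_stack([np.ones(T), x])
    beta = np.linalg.lstsq(X, y, rcond=None)[0]
    resid = y - X @ beta
    s2 = resid @ resid / (T - 2)
    se_slope = np.sqrt(s2 * np.linalg.inv(X.T @ X)[1, 1])
    r2 = 1 - (resid @ resid) / ((y - y.mean()) @ (y - y.mean()))

    high_r2 += r2 >= 0.5
    big_t += abs(beta[1] / se_slope) > 2

print("share with R^2 >= 0.5:", high_r2 / reps)
print("share with |t| > 2:   ", big_t / reps)
```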
There are two types of non-stationarity:
The random walk with drift: \(y_t=\mu+y_{t-1}+u_t\)
The deterministic trend process: \(y_t=\alpha+\beta t+u_t\)
Example: check whether the AR(1) process \(y_t=y_{t-1}+u_t\) is stationary.
In lag operator form, \((1-L)y_t=u_t\), so the characteristic equation is
\(\phi(z)=0\Leftrightarrow 1-z=0 \Leftrightarrow z=1\)
\(\Rightarrow\) The process is non-stationary, since the root \(z=1\) lies on, not outside, the unit circle.
Taking first differences, \(\Delta y_t=y_t-y_{t-1}=u_t\), gives a stationary series.
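The same root check can be automated for a general AR(p) lag polynomial: compute the roots of \(\phi(z)=0\) and verify that all of them lie strictly outside the unit circle (a sketch in Python/numpy; the helper name and example coefficients are my own).

```python
import numpy as np

def ar_is_stationary(phi):
    """Check stationarity of y_t = phi_1 y_{t-1} + ... + phi_p y_{t-p} + u_t.

    Characteristic equation: 1 - phi_1 z - ... - phi_p z^p = 0.
    Stationarity requires all roots to lie strictly outside the unit circle.
    """
    coeffs = np.r_[[-c for c in phi[::-1]], 1.0]   # highest power first, for np.roots
    roots = np.roots(coeffs)
    return roots, bool(np.all(np.abs(roots) > 1))

print(ar_is_stationary([1.0]))   # random walk: root z = 1 -> non-stationary
print(ar_is_stationary([0.5]))   # root z = 2 -> stationary
```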
Consider the AR(1) process \(y_t=\phi y_{t-1}+u_t\). Substituting repeatedly for the lagged values gives
\[y_t=\phi y_{t-1}+u_t= \phi(\phi y_{t-2}+u_{t-1})+u_t=\phi^2y_{t-2}+\phi u_{t-1}+ u_t = \dots = \phi^t y_0+\sum_{i=0}^{t-1}\phi^i u_{t-i}\]
Case 1: Stationary, \(|\phi| <1\): the weights \(\phi^i\) on past shocks decline geometrically, so shocks die away over time.
Case 2: Non-stationary, \(\phi \geq 1\): for \(\phi=1\), \(y_t=y_0+\sum_{i=1}^{t} u_i\), so every past shock has a permanent effect (and for \(\phi>1\) the process is explosive).
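The back-substitution makes the difference between the two cases visible in simulation (Python/numpy sketch; the coefficient values, sample size and seed are arbitrary): for \(|\phi|<1\) the weight \(\phi^i\) on a shock i periods ago dies away, while for \(\phi=1\) every past shock keeps weight one.

```python
import numpy as np

rng = np.random.default_rng(7)
T = 500
u = rng.standard_normal(T)

def ar1_path(phi, u):
    y = np.zeros(len(u))
    for t in range(1, len(u)):
        y[t] = phi * y[t - 1] + u[t]
    return y

y_stat = ar1_path(0.5, u)   # Case 1: |phi| < 1, shocks die away
y_rw = ar1_path(1.0, u)     # Case 2: phi = 1, y_t = y_0 + sum of all past shocks

print("weight on a shock 10 periods ago:", 0.5 ** 10, "vs", 1.0 ** 10)

# sample variance of the first n observations: typically stabilises for the
# stationary case but keeps growing for the random walk
for n in (100, 200, 300, 400, 500):
    print(n, round(float(np.var(y_stat[:n])), 2), round(float(np.var(y_rw[:n])), 2))
```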
Order of integration: a series \(y_t\) is said to be integrated of order d, written \(y_t\sim I(d)\), if it must be differenced d times to become stationary, i.e. \(\Delta^d y_t\sim I(0)\).
In particular, \[y_t \sim I(1)\; \text{if}\;\left\{\begin{matrix} y_t \text{ is not stationary}\\ \Delta y_t\sim I(0), \text{ i.e. stationary} \end{matrix}\right.\]
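A short sketch of the idea (Python/numpy; the series are constructed so that their orders of integration are known by design): differencing an I(1) series once, or an I(2) series twice, recovers the underlying stationary white noise.

```python
import numpy as np

rng = np.random.default_rng(11)
u = rng.standard_normal(1000)   # white noise, I(0)

y1 = np.cumsum(u)               # random walk, I(1)
y2 = np.cumsum(y1)              # integrated of order 2, I(2)

# differencing d times recovers a stationary series
print(np.allclose(np.diff(y1), u[1:]))        # delta y1   = u_t -> I(0)
print(np.allclose(np.diff(y2, n=2), u[2:]))   # delta^2 y2 = u_t -> I(0)
```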