Backshift Notation and Polynomial Roots

Backshift Notation

There are two main notations which are important to learn in time series. We have been writing out equations in the following manner,

\[X_t = \varepsilon_t + \theta_1 \varepsilon_{t-1} + \theta_2 \varepsilon_{t-2} + \cdots + \theta_q \varepsilon_{t-q}.\]

This can be shortened into a more compact format which also allows for other steps to be taken. Let us begin with the backshift operator \(B\), which is used to shorten this repetitive indexing of previous values at \(t-1, t-2, \cdots\). Beginning with the simple case of \(BX_t = X_{t-1}\), it extends to \(B^2X_t=X_{t-2}\), and in general \(B^kX_t = X_{t-k}\).

So far we have been writing equations with the assumption that the mean is 0, but we will include the \(\mu\) term as we move forward. We can begin with a simple \(MA(2)\) case. \[X_t - \mu = \varepsilon_t + \theta_1 \varepsilon_{t-1}+\theta_2 \varepsilon_{t-2}\] This can be written as, \[X_t - \mu = \varepsilon_t + \theta_1B\varepsilon_t + \theta_2B^2\varepsilon_t\]

\[X_t - \mu = (1 + \theta_1B + \theta_2B^2)\varepsilon_t\]

\[X_t - \mu = \theta(B)\varepsilon_t, \text{ where } \theta(B)=(1 + \theta_1B + \theta_2B^2)\]
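
Below is a minimal sketch in Python (the variable names, seed, and coefficient values are my own, chosen purely for illustration) showing how the compact form \(X_t - \mu = \theta(B)\varepsilon_t\) can be simulated by applying the polynomial \(\theta(B)\) to white noise.

```python
import numpy as np

# Illustrative sketch: simulate an MA(2) series X_t - mu = theta(B) eps_t
# by applying theta(B) = 1 + theta1*B + theta2*B^2 to white noise.
rng = np.random.default_rng(42)
n = 500
mu, theta1, theta2 = 10.0, 0.6, -0.3      # example values only

eps = rng.normal(size=n)                  # white noise eps_t
theta_poly = np.array([1.0, theta1, theta2])  # coefficients of theta(B)

# Convolving eps with [1, theta1, theta2] gives
# eps_t + theta1*eps_{t-1} + theta2*eps_{t-2} (earlier eps treated as 0).
x = mu + np.convolve(eps, theta_poly)[:n]

print(x[:5])
```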

With an \(AR(2)\) model, it can be slightly more complicated to write out due to the extra mean terms. I will first show a quick shorthand for writing out \(\phi_0\), then I will show the simpler way of writing the backshift notation.

\[X_t - \mu = \phi_1(X_{t-1} -\mu)+ \phi_2 (X_{t-2} - \mu) + \varepsilon_t\] \[X_t = \mu + \phi_1X_{t-1} - \phi_1\mu+ \phi_2X_{t-2} - \phi_2\mu + \varepsilon_t\] \[X_t = \phi_0 + \phi_1X_{t-1} + \phi_2X_{t-2} + \varepsilon_t, \text{ where } \phi_0 = \mu(1 - \phi_1 - \phi_2)\] Next I will show how it can be done using backshift notation, but first we can pretend that it is mean 0 for the time being.

\[X_t = \phi_1X_{t-1} + \phi_2X_{t-2}+\varepsilon_t\] \[X_t = \phi_1BX_t + \phi_2B^2X_t + \varepsilon_t\] \[\varepsilon_t = X_t - \phi_1BX_t - \phi_2B^2X_t\] \[\varepsilon_t = \phi(B)X_t, \text{ where } \phi(B) = (1 - \phi_1B - \phi_2B^2)\] Then we can simply change the \(X_t\) to \((X_t - \mu)\) to get, \[\varepsilon_t = \phi(B)(X_t - \mu)\]
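
Here is a small sketch (again with illustrative values of my own) that simulates an \(AR(2)\) with mean \(\mu\), checks that the intercept shorthand \(\phi_0 = \mu(1 - \phi_1 - \phi_2)\) from above gives the same series, and confirms numerically that applying \(\phi(B)\) to \((X_t - \mu)\) recovers the noise \(\varepsilon_t\).

```python
import numpy as np

# Illustrative sketch: simulate an AR(2) with mean mu, verify the phi0 shorthand,
# and confirm that phi(B)(X_t - mu) recovers the noise eps_t.
rng = np.random.default_rng(0)
n = 500
mu, phi1, phi2 = 10.0, 0.5, 0.3           # example values only
phi0 = mu * (1 - phi1 - phi2)             # intercept form of the mean

eps = rng.normal(size=n)
x = np.full(n, mu)                        # simple starting values at the mean
for t in range(2, n):
    x[t] = mu + phi1 * (x[t-1] - mu) + phi2 * (x[t-2] - mu) + eps[t]

# The intercept form X_t = phi0 + phi1*X_{t-1} + phi2*X_{t-2} + eps_t is identical.
x_alt = np.full(n, mu)
for t in range(2, n):
    x_alt[t] = phi0 + phi1 * x_alt[t-1] + phi2 * x_alt[t-2] + eps[t]
print(np.allclose(x, x_alt))              # True

# phi(B)(X_t - mu) = (X_t - mu) - phi1*(X_{t-1} - mu) - phi2*(X_{t-2} - mu) = eps_t
xc = x - mu
eps_hat = xc[2:] - phi1 * xc[1:-1] - phi2 * xc[:-2]
print(np.allclose(eps_hat, eps[2:]))      # True
```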

Polynomial Roots

The above changes in notation are particularly important when it comes to causality and invertibility. Causality is when we can write a series \(X_t\) as an \(MA(\infty)\) process, and invertibility is when we can write a series \(X_t\) as an \(AR(\infty)\) process. All \(MA(q)\) processes can be written as an \(MA(\infty)\) process and all \(AR(p)\) processes can be written as an \(AR(\infty)\) process. The trick is quite simple: if, for example, we have an \(MA(1)\) with \(X_t = \varepsilon_t + \theta_1 \varepsilon_{t-1}\), we can simply say that \(\theta_2 = \theta_3 = \cdots = 0\). The same pattern would apply to an \(AR(1)\), etc.

However, we may have situations where we have an \(MA(q)\) model that we’d like to write as an invertible series, or vice versa with an \(AR(p)\) model as a causal series. The reason is that the causal \(MA(\infty)\) form makes finding standard errors for prediction limits simpler, while the invertible \(AR(\infty)\) form makes forecasting easier.
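
As a concrete sketch of what such a rewrite looks like, consider an invertible \(MA(1)\), \(X_t = (1 + \theta B)\varepsilon_t\) with \(|\theta| < 1\). Expanding \((1 + \theta B)^{-1}\) as a geometric series gives the \(AR(\infty)\) weights on \(X_{t-j}\); the short Python sketch below (with a \(\theta\) value of my own choosing) computes the first few of them.

```python
import numpy as np

# For X_t = (1 + theta*B) eps_t with |theta| < 1,
#   eps_t = (1 + theta*B)^{-1} X_t = sum_{j>=0} (-theta)^j X_{t-j},
# so X_t = sum_{j>=1} -(-theta)^j X_{t-j} + eps_t.
theta = 0.5                               # illustrative value only
j = np.arange(1, 11)
pi_weights = -(-theta) ** j               # AR(infinity) coefficients on X_{t-j}

print(np.round(pi_weights, 4))
# [ 0.5  -0.25  0.125  -0.0625 ... ] -- the weights decay geometrically
```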

Let us look at an example of an \(MA(3)\), we have shown that this can be written as, \[X_t = \varepsilon_t + \theta_1 \varepsilon_{t-1} + \theta_2 \varepsilon_{t-2} + \theta_3 \varepsilon_{t-3}.\] Then the polynomial \(\theta(z)\) can be written as,

\[\theta(z) = 1 + \theta_1 z + \theta_2 z^2 + \theta_3 z^3\]

In the case of an \(AR(p)\), the polynomial would be written as, \[\phi(z) = 1 - \phi_1 z - \cdots - \phi_p z^p.\]

The rule is that the absolute values of the roots of the polynomials must be greater than 1, i.e., the roots must lie outside the unit circle: roots of \(\phi(z)\) outside the unit circle give causality, and roots of \(\theta(z)\) outside the unit circle give invertibility. In cases where we are looking at an \(MA(1), MA(2), AR(1),\) or \(AR(2)\) process, it is possible to solve this equation relatively easily. However, for longer processes we would want to use a computer to determine the results. It is possible to do on paper as well, but it is a lengthier process.
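
Below is a minimal sketch of such a computer check (the coefficients are my own, purely for illustration), using numpy to find the roots of an \(MA(3)\) polynomial \(\theta(z)\) and test whether they all lie outside the unit circle.

```python
import numpy as np

# Check the root condition for an MA(3) polynomial
# theta(z) = 1 + theta1*z + theta2*z^2 + theta3*z^3.
theta1, theta2, theta3 = 0.5, -0.2, 0.1   # example values only

# np.roots expects coefficients from the highest power down to the constant term.
roots = np.roots([theta3, theta2, theta1, 1.0])

print(roots)
print(np.abs(roots))                      # moduli of the roots
print(np.all(np.abs(roots) > 1))          # True -> invertible (roots outside unit circle)
```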

NOTE: For \(AR(1)\) and \(MA(1)\), it is required that \(|\phi| < 1\) or \(|\theta| < 1\). With an \(AR(2)\) or \(MA(2)\) it is required that \(-1<\frac{\phi_1}{(1-\phi_2)}<1\) with \(|\phi_2|<1\), or \(-1<\frac{-\theta_1}{(1-(-\theta_2))}<1\) with \(|\theta_2|<1\). The mathematical way to solve for larger values of \(p\) and \(q\) requires using \(\theta(z)\) and \(\phi(z)\). It is interesting but I will not elaborate further.

ARMA(p,q)

So far we have been focusing only on separate \(AR(p)\) and \(MA(q)\) processes, however the same can be applied to an \(ARMA(p,q)\). For example, if we have an \(ARMA(2,2)\) written as, \[X_t = \phi_1 X_{t-1} + \phi_2 X_{t-2} + \varepsilon_t + \theta_1 \varepsilon_{t-1} + \theta_2 \varepsilon_{t-2},\] we simply need to check the roots of the \(AR\) polynomial \(\phi(z) = 1 - \phi_1 z - \phi_2 z^2\) along with the roots of the \(MA\) polynomial \(\theta(z) = 1 + \theta_1 z + \theta_2 z^2\), as if they were separate \(AR(2)\) and \(MA(2)\) processes.
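
As a sketch of how this check might be automated (assuming the statsmodels package is available, and with coefficients I chose only for illustration), its ArmaProcess class reports the roots of \(\phi(z)\) and \(\theta(z)\) and whether the process is stationary and invertible.

```python
import numpy as np
from statsmodels.tsa.arima_process import ArmaProcess

# Illustrative check of an ARMA(2,2): the AR side is passed as the phi(z)
# coefficients [1, -phi1, -phi2] and the MA side as the theta(z) coefficients
# [1, theta1, theta2].
phi1, phi2 = 0.5, 0.3                     # example values only
theta1, theta2 = 0.4, -0.2

proc = ArmaProcess(ar=np.array([1.0, -phi1, -phi2]),
                   ma=np.array([1.0, theta1, theta2]))

print(proc.arroots)                       # roots of phi(z)
print(proc.maroots)                       # roots of theta(z)
print(proc.isstationary)                  # True if all AR roots lie outside the unit circle
print(proc.isinvertible)                  # True if all MA roots lie outside the unit circle
```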