Last lecture:
ARMA modeling procedures: Model selection, estimation, testing and forecasting based on stationary data.
This lecture:
Trends, integrated series, cycles, ARIMA models
11/05/2018
Trends: long-term changes in the mean level. (Trends can be either deterministic or stochastic. Stochastic trends are not necessarily monotonic.)
Periodic signals: cycle and seasonal components. (For periodicities approaching the length of the time series, it becomes difficult to discriminate these from stochastic trends.)
Irregular component - random or chaotic noisy residuals left over after removing all trends and periodic components.
For example, an observed time series can be decomposed as
\[ \mbox{Series} = \mbox{ARMA term} + \mbox{trend} + \mbox{cycle and season} + \mbox{irregular component}.\]
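# Packages: data download, time-series classes, dates, forecasting, coefficient tests
library(Quandl); library(xts); library(lubridate); library(forecast); library(lmtest)
# Monthly Belgian unemployment series from the ECB database, starting January 2000
Bel = Quandl("ECB/STS_M_BE_N_UNEH_LTT000_4_000", type = "xts", collapse = "monthly", start_date="2000-01-31")
plot(Bel, main="Belgium Unemployment Level")
Acf(Bel, lag.max=10)   # sample autocorrelations up to lag 10
plot(stl(Bel, s.window="periodic"))   # seasonal, trend and remainder components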
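The additive decomposition can also be tried without downloading any data. The following is a minimal sketch on a simulated monthly series; the trend slope, seasonal amplitude and noise level are arbitrary assumptions, not estimates from the Belgian data. It uses base R's `stl()` and checks that the three estimated components add back up to the series.

```r
set.seed(1)
n <- 120                                   # 10 years of monthly data (assumed)
trend    <- 0.05 * (1:n)                   # deterministic linear trend
seasonal <- 2 * sin(2 * pi * (1:n) / 12)   # annual cycle with period 12
noise    <- rnorm(n, sd = 0.5)             # irregular component
x <- ts(trend + seasonal + noise, frequency = 12, start = c(2000, 1))

fit   <- stl(x, s.window = "periodic")     # seasonal-trend decomposition by loess
parts <- fit$time.series                   # columns: seasonal, trend, remainder
recon <- rowSums(parts)                    # sum of the three components
max_gap <- max(abs(recon - as.numeric(x))) # numerically zero
print(max_gap)
# plot(fit) shows the series and its three components in stacked panels
```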
A stochastic trend: Recall that in an AR(1) with \(\phi=1\), \[Y_{t}=Y_{0}+\sum_{j=1}^{t}\varepsilon_{j},\quad \mbox{Var}\Big(\sum_{j=1}^{t}\varepsilon_{j}\Big)=t\sigma^{2},\] so the variance grows linearly with \(t\).
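The formula \(t\sigma^{2}\) can be checked by simulation. The sketch below assumes Gaussian shocks with \(\sigma=1\) and \(Y_0=0\); the number of paths and the dates inspected are arbitrary choices. It compares the cross-sectional variance of many simulated random walks with the theoretical value.

```r
set.seed(2018)
n_paths <- 5000; n_steps <- 100; sigma <- 1
# each column is one random-walk path started at Y_0 = 0
shocks <- matrix(rnorm(n_steps * n_paths, sd = sigma), nrow = n_steps)
paths  <- apply(shocks, 2, cumsum)
# variance across paths at each date t, to compare with t * sigma^2
var_by_t <- apply(paths, 1, var)
dates <- c(10, 50, 100)
check <- cbind(t = dates,
               simulated = var_by_t[dates],
               theory    = dates * sigma^2)
print(round(check, 2))
```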
Recall the AR(p) model \(\phi(\mathbb{L})Y_t = \varepsilon_t\), whose characteristic equation is \(\phi(z)=0\).
For stationarity, the roots of the characteristic equation (i.e., the roots of the polynomial \(\phi(z)\)) must all exceed unity in absolute value, that is, lie outside the unit circle.
Unit root: \(\phi(z)\) contains the factor \(1-z\) (equivalently \(z-1\)), so \(z=1\) is a root.
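The root condition can be checked numerically with base R's `polyroot()`. In this minimal sketch the helper names `ar_roots` and `is_stationary` are hypothetical, and the coefficient values are chosen only for illustration.

```r
# Roots of phi(z) = 1 - phi_1 z - ... - phi_p z^p (hypothetical helper)
ar_roots <- function(phi) polyroot(c(1, -phi))
# Stationary if every root lies strictly outside the unit circle
is_stationary <- function(phi, tol = 1e-8) all(Mod(ar_roots(phi)) > 1 + tol)

Mod(ar_roots(0.5))          # AR(1) with phi = 0.5: single root at z = 2
is_stationary(0.5)
Mod(ar_roots(1))            # random walk: unit root at z = 1
is_stationary(1)
is_stationary(c(0.5, 0.3))  # an AR(2) with both roots outside the unit circle
```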
The random walk with drift \[ Y_t = \beta + Y_{t-1} + \varepsilon_t.\]
Repeated substitution reveals the nature of two trends \[ Y_t = \beta + \beta + Y_{t-2} + \varepsilon_{t-1} + \varepsilon_t \] \[ = \beta t + Y_{0} + \sum_{i=1}^{t-1}\varepsilon_{t-i} + \varepsilon_t \]
The term \(Y_0 + \beta t\) is the deterministic trend.
The term \(\sum_{i=1}^{t-1}\varepsilon_{t-i}\) is the stochastic trend.
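The two trends affect different moments of the series. Assuming the \(\varepsilon_t\) are uncorrelated with mean zero and variance \(\sigma^{2}\), and treating \(Y_0\) as fixed, taking the expectation and variance of the expression for \(Y_t\) gives

\[ \mbox{E}(Y_t) = Y_0 + \beta t, \qquad \mbox{Var}(Y_t) = \mbox{Var}\Big(\sum_{i=1}^{t-1}\varepsilon_{t-i} + \varepsilon_t\Big) = t\sigma^{2}. \]

The deterministic trend moves the mean, while the stochastic trend makes the variance grow with \(t\).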
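set.seed(2018); e = as.ts(rnorm(250)); y = rep(0,251)   # y[1] is the starting value Y_0 = 0
for(t in 1:250){y[t+1]=y[t]+e[t]}   # random walk: Y_t = Y_{t-1} + e_t
# or try arima.sim(list(order = c(0,1,0)), n = 250)
y = as.ts(y); plot(y)
Acf(y, lag.max=10)   # autocorrelations of the level
plot(diff(y, differences =1))   # the first difference is the white noise e
Acf(diff(y, differences =1))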
set.seed(2018); e = as.ts(rnorm(250)); y = rep(0,251)   # y[1] is the starting value Y_0 = 0
for(t in 1:250){y[t+1]= 1 + y[t]+e[t]}   # random walk with drift beta = 1
y = as.ts(y); plot(y)
Differentiation describes the change over an infinitesimal (infinitely small) step.
Differentiation of \(y(t)\) with respect to time: \[ \frac{dy(t)}{dt} = \lim_{h\rightarrow 0}\frac{y(t)- y(t-h)}{h} \]
The equation \(\frac{dY(t)}{dt} = \varepsilon_t\) is a Stochastic Differential Equation (SDE). The discrete version of the differential operator \(\frac{d}{dt}\), applied to \(Y_t\) with a time step of one, contains a unit root: \[\frac{Y_t - Y_{t-1}}{t-(t-1)} = Y_t - Y_{t-1} = (1- \mathbb{L}) Y_t.\]
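In R the operator \(1-\mathbb{L}\) is `diff()`, and `cumsum()` undoes it. A small sketch on a simulated random walk (seed and length are arbitrary) shows that differencing returns exactly the shocks that were accumulated.

```r
set.seed(2018)
eps <- rnorm(250)                      # white-noise shocks
y   <- c(0, cumsum(eps))               # random walk with Y_0 = 0 (length 251)

dy_manual <- y[-1] - y[-length(y)]     # Y_t - Y_{t-1} written out
dy_diff   <- diff(y, differences = 1)  # the same via diff()
all.equal(dy_manual, dy_diff)
all.equal(dy_diff, eps)                # differencing recovers the shocks
```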
Stationarity is embedded in the non-stationary series.
A process (when time is continuous) is globally non-stationary, but it is stationary over an infinitesimal change of time.
A series (when time is discrete) is globally non-stationary, but it is stationary over an incremental unit of time.
AR(2): \(Y_t = 0.5Y_{t-1} + 0.5Y_{t-2} +\varepsilon_t\) is non-stationary. Its characteristic equation \[\phi(z) = 1 - \frac{1}{2}z - \frac{1}{2}z^2 = - \frac{1}{2}(z^2 + z - 2)= - \frac{1}{2}(z - 1)(z + 2)=0\] gives \(z=-2\) or \(z=1\). In lag-operator form this is \[- \frac{1}{2}(\mathbb{L}^2 + \mathbb{L} - 2)Y_t = - \frac{1}{2}(\mathbb{L} - 1)(\mathbb{L} + 2)Y_t = \varepsilon_t,\] and since \(-(\mathbb{L}-1)Y_t = (1-\mathbb{L})Y_t = \nabla Y_t\), \[ \frac{1}{2}(\mathbb{L} + 2) \nabla Y_{t} = \varepsilon_t.\]
Writing \(X_t = \nabla Y_{t}\), we obtain a stationary AR(1) for \(X_t\): \[ X_t = -\frac{1}{2} X_{t-1} + \varepsilon_t.\]
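This AR(2) example can be checked by simulation. The sketch below assumes Gaussian shocks and zero starting values; the sample size is arbitrary. It simulates the non-stationary AR(2), takes first differences, and fits an AR(1) without a mean to the differenced series, whose coefficient should be near \(-1/2\).

```r
set.seed(2018)
n   <- 2000
eps <- rnorm(n)
y   <- numeric(n)                        # Y_1 = Y_2 = 0 as starting values (assumed)
for (t in 3:n) y[t] <- 0.5 * y[t - 1] + 0.5 * y[t - 2] + eps[t]

x   <- diff(y)                           # X_t = (1 - L) Y_t
fit <- arima(x, order = c(1, 0, 0), include.mean = FALSE)
coef(fit)                                # ar1 should be close to -0.5

root_mod <- sort(Mod(polyroot(c(1, -0.5, -0.5))))
root_mod                                 # moduli of the roots of phi(z): 1 and 2
```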
Difference stationary: For an ARMA process \(\phi(\mathbb{L})Y_{t}=\theta(\mathbb{L})\varepsilon_{t}\), if \(\phi(z)=0\) has one root equal to \(1\) (a unit root) and the others outside the unit circle, this ARMA process is a difference stationary process.
If \(Y_{t}\) is difference stationary then we say that \(Y_{t}\) is integrated of order \(1\) and we denote \(Y_{t}\sim I(1)\).
If \(Y_{t}\sim I(1)\), then \(\phi^{*}(\mathbb{L}) \nabla Y_{t}=\theta(\mathbb{L})\varepsilon_{t}\) is stationary where \(\phi^{*}(\mathbb{L}) (1-\mathbb{L}) = \phi(\mathbb{L})\).
If the ARMA(p+1,q) process \(\phi(\mathbb{L})Y_{t}=\theta(\mathbb{L})\varepsilon_t\) is difference stationary, then \(\phi(\mathbb{L})\) can be factored as \[\phi(\mathbb{L})=(1-\mathbb{L})\phi^{*}(\mathbb{L})\] where \(\phi^{*}(\mathbb{L})=0\) has \(p\) roots outside the unit circle.
In this case, \(\nabla Y_{t}\) has the stationary ARMA(p,q) representation: \[\nabla Y_{t}=\phi^{*}(\mathbb{L})^{-1}\theta(\mathbb{L})\varepsilon_{t}=\Psi^{*}(\mathbb{L})\varepsilon_{t}\] where \(\Psi^{*}(\mathbb{L})=\sum_{k=0}^{\infty}\psi_{k}^{*}\mathbb{L}^{k}\) is from the Wold decomposition.
This ARMA(p+1,q) is in fact ARIMA(p,1,q).
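The equivalence between an ARIMA(p,1,q) for \(Y_t\) and an ARMA(p,q) for \(\nabla Y_t\) can be illustrated with base R's `arima.sim()` and `arima()`. In this sketch the AR coefficient 0.5 and the sample size are assumptions made for illustration; the two fits should give nearly the same estimate.

```r
set.seed(2018)
# simulate an ARIMA(1,1,0) with AR coefficient 0.5 (value chosen for illustration)
y <- arima.sim(list(order = c(1, 1, 0), ar = 0.5), n = 500)

# difference inside the model
fit_arima <- arima(y, order = c(1, 1, 0))
# difference first, then fit a zero-mean AR(1)
fit_diff  <- arima(diff(y), order = c(1, 0, 0), include.mean = FALSE)

est <- c(arima = coef(fit_arima)[["ar1"]], on_diff = coef(fit_diff)[["ar1"]])
print(est)
```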
The unemployment rate, as an economic indicator, is usually reported as a seasonally adjusted series.
Seasonal fluctuations are due to events such as changes in weather, harvests, major holidays, and school schedules.
For example, at the end of the academic year, school and university leavers are seeking work.
ARIMA \((p,d,q) (P,D,Q)_m\)
\((p,d,q)\) for non-seasonal part
\((P,D,Q)_m\) for seasonal part, \(m\) for number of periods per cycle, e.g. quarterly data (\(m=4\)) or monthly data (\(m=12\)) with an annual cycle.
ARIMA \((1,1,1) (1,1,1)_4\) is \[ (1- \phi \mathbb{L})(1- \Phi \mathbb{L}^4) (1 - \mathbb{L}) (1 - \mathbb{L}^4) Y_t = (1 + \theta \mathbb{L})(1+ \Theta \mathbb{L}^4) \varepsilon_t.\]
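The differencing part of this model, \((1-\mathbb{L})(1-\mathbb{L}^4) = 1 - \mathbb{L} - \mathbb{L}^4 + \mathbb{L}^5\), corresponds to two nested calls to `diff()`. A small sketch on an arbitrary simulated series checks that the nested differences agree with the expanded polynomial applied directly.

```r
set.seed(2018)
y <- cumsum(rnorm(40))                    # any series will do; a random walk here

# (1 - L)(1 - L^4) Y_t via a seasonal difference followed by a first difference
d_nested <- diff(diff(y, lag = 4), lag = 1)

# the expanded polynomial 1 - L - L^4 + L^5 applied directly, for t = 6, ..., 40
t_idx    <- 6:length(y)
d_manual <- y[t_idx] - y[t_idx - 1] - y[t_idx - 4] + y[t_idx - 5]

all.equal(d_nested, d_manual)
```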
Bel.diff = diff(Bel, differences =1)   # first difference of the series
plot(Bel.diff)
Acf(Bel.diff)
# ARIMA(1,1,1)(1,1,1)_12 fitted by maximum likelihood
Bel.arima = arima(Bel,order=c(1,1,1), method="ML",
seasonal = list(order = c(1,1,1), period = 12))
coeftest(Bel.arima)
##
## z test of coefficients:
##
##       Estimate Std. Error  z value  Pr(>|z|)
## ar1   0.382884   0.079161   4.8367 1.320e-06 ***
## ma1   0.371303   0.063652   5.8333 5.434e-09 ***
## sar1  0.189544   0.072697   2.6073  0.009126 **
## sma1 -0.999994   0.079037 -12.6522 < 2.2e-16 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
plot(residuals(Bel.arima))
Acf(residuals(Bel.arima))
plot(forecast(Bel.arima, h=5), ylab="", xlab="Year")
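The same fit, diagnose and forecast steps can be run offline on a built-in dataset. The sketch below is not the Belgian model: it substitutes `AirPassengers` for the downloaded series and uses the simpler ARIMA\((0,1,1)(0,1,1)_{12}\) specification, both of which are assumptions made so that the example is self-contained. It adds a Ljung-Box test on the residuals via base R's `Box.test()` and produces forecasts with `predict()`.

```r
# AirPassengers (datasets package) is a stand-in so the example runs offline
y   <- log(AirPassengers)
fit <- arima(y, order = c(0, 1, 1),
             seasonal = list(order = c(0, 1, 1), period = 12), method = "ML")

# Ljung-Box test on the residuals; fitdf = number of estimated ARMA coefficients
lb <- Box.test(residuals(fit), lag = 24, type = "Ljung-Box", fitdf = 2)
lb$p.value

# 12-step-ahead forecasts on the log scale, with standard errors
fc <- predict(fit, n.ahead = 12)
cbind(forecast = fc$pred, se = fc$se)
```

A large p-value from the Ljung-Box test is consistent with residuals that behave like white noise, which is what the residual ACF plot checks visually.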
Extend the stationary ARMA model.
ARMA(p,q) combined with an integrated process I(1) gives ARIMA(p,1,q). The series is stationary after differencing.
Seasonal factors enter as additional AR and MA terms.