Applied Time Series

12/10/2018

Outline

Last lecture:

Granger causality
Vectorial ARIMA, with seasonalities
Bayesian method

This lecture:

State space model (Dynamic linear model)
Kalman filter

State Space Model (SSM)

So far, we are in the ARMA type framework: AR, MA, VAR, etc
In fact, this (Box-Jenkins) type of specifications can be fully covered in a more flexible framework - State space models (SSM).
SSM can incorporate time series predictors in a more straightforward manner: on-line parameter and state estimation.
Online parameter/state estimation is a recursive algorithm. To estimate the parameter values at a time step, recursive algorithms use the current measurements and previous parameter estimates. It is more efficient in terms of memory usage.
The cost: adapt yourself to the probabilistic taste.

SSM and Kalman filter

SSM was developped for engineering controls, such as robotics or GPS monitoring.
In the control context, the state variables define the dynamics of the system.
Only some of these state variables can be measured directly, and these measurements are subject to noise.
The objective of the Kalman filter is to infer values for all the state variables from the noisy measurements. The estimated values of the state variables are then used to control the system.
In economics, the states are often the unknown coefficients. However, now we can allow these unknown coefficients to change over the time.

A Nutshell of Control

A general control problem in SSM: let the state vector of the system be \(\theta_t\) and a control vector be \(u_t\): \[ \begin{align} y_t =& F_t \theta_t + v_t \\ \theta_t =& G_t \theta_{t-1} + H_t u_{t} + w_{t} \end{align} \] where \(u_t\) can be regulated by the controller in order to obtain a desired level of the state \(\theta_t\). \(H_t\) is a known matrix. \(v_t\) and \(w_t\) are stochastic errors.
The goal of a control problem is to drive the state from the initial value \(\theta_0\) to a target value \(\theta^{*}\) in a finite time \(T\), setting appropriately the control variables \(u_1, \cdots, u_T\).

SSM without Control

Consider SSM: \[ \begin{align} y_t =& F_t \theta_t + v_t \\ \theta_t =& G_t \theta_{t-1} + w_{t} \end{align} \] where \(v_t \sim \mathcal{N} ( 0 ,\mathbf{V}_t )\), \(w_t \sim \mathcal{N} ( 0 ,\mathbf{W}_t )\) and \(\theta_t \sim \mathcal{N} ( m_t ,\mathbf{C}_t )\)
Alternatively, the uncontrollable “control” term can be thought of mixing with the error \(w_t\).

Example

A linear growth model: \[ \begin{align} y_t =& \mu_t + \beta_{t} p_t + v_t \\ \mu_t =& \mu_{t-1} + w_{1,t}\\ \beta_t =& \beta_{t-1} + w_{2,t} \end{align} \] where \(y_t\) is the individual sales process, \(p_t\) is the price, and \(\mu_t\) is a general level of sales in the sector.
The state space form: \[\theta_t = \left(\begin{array}{c} \mu_{t}\\ \beta_{t} \end{array}\right), \, w_t = \left(\begin{array}{c} w_{1,t}\\ w_{2,t} \end{array}\right), \, F_t^{T} = \left(\begin{array}{c} 1\\ p_{t} \end{array}\right), \, G_t = \left(\begin{array}{cc} 1 & 0\\ 0 & 1 \end{array}\right)\]

Example: hidden state estimate

A random walk plus noise model \[ \begin{align} y_t =& \theta_t + v_t \\ \theta_t =& \theta_{t-1} + w_{t}\\ \end{align} \]

set.seed(2018); W = 0.5; V = 10.0; n = 100
theta0 = rnorm(1,0,1)
w  = rnorm(n,0,sqrt(W))
v  = rnorm(n,0,sqrt(V))
theta  = y = rep(0,n)
theta[1] = theta0   + w[1]
y[1] = theta[1] + v[1]
for (t in 2:n){
  theta[t] = theta[t-1] + w[t]
  y[t] = theta[t] + v[t]
}

Example: hidden state estimate

ts.plot(y, col="blue"); lines(theta, col="red")

Example: hidden state estimate

Use the package dlm (dynamic linear model). The matrices \(\mathbf{F}\), \(\mathbf{V}\), \(\mathbf{G}\), \(\mathbf{W}\), \(\mathbf{C}_0\), \(m_0\) are stored as objects FF, V, GG, W, C0, and m0 in the package. We assume \(\mathbf{V}=10\) and \(\mathbf{W} =0.5\).

library(dlm)
buildMod = dlm(FF = 1, V = 10, GG = 1, W = 0.5, m0 = 0, C0 = 1)
filter = dlmFilter(y, buildMod)

Example: hidden state estimate

ts.plot(theta, col="red"); lines(dropFirst(filter$m), col="black")

Example: hidden state estimate

ts.plot(y, col="blue"); lines(dropFirst(filter$m), col="black")

Example: hidden state estimate

qqnorm(residuals(filter, sd=FALSE))

Example: hidden state estimate

tsdiag(filter)

Kalman Filter

Filtering for SSM is making the best estimate of the current values of the state from the record of observations.
It follows the Bayes’ theorem: \[p(\theta_t | \mathbf{y}_{1:t-1}, y_t) = \frac{p(y_t|\theta_t) p(\theta_t | \mathbf{y}_{1:t-1})}{p(y_t | \mathbf{y}_{1:t-1})} \\ p(\theta_t | \mathbf{y}_{1:t-1}, y_{t}) \propto p(y_t|\theta_t) p(\theta_t | \mathbf{y}_{1:t-1}) \] where \(\mathbf{y}_{1:t-1}= (y_1,\dots,y_{t-1})\) represent the data up until time \(t-1\).
The posterior belief of the state at \(t\), given data up to \(t\), is proportional to the product of the likelihood, and the prior belief of the state at \(t\), given data up to \(t-1\).

Kalman Filter

If the the prior distribution and the likelihood are both normal, the Bayes’ theorem has an analytic form. \[p(\theta_t | y_t,\mathbf{y}_{1:t-1}) = \mathcal{N}(m_t ,C_t)\] where \[m_t = m_{t-1} + \frac{C_{t-1}}{C_{t-1} +W}(y_t - m_{t-1}), C_t =\left(\frac{1}{W}+\frac{1}{C_{t-1}}\right)^{-1}\] is a recursive expression. \(m_t\) is obtained by correcting the previous estimate \(m_{t-1}\). The procedure of computing these values is knowns as the Kalman filter.

A Glimpse

\(y_t = \theta + \varepsilon_t, \varepsilon_t \sim \mathcal{N}(0,W)\), the prior is \(\theta \sim \mathcal{N}(m_0,C_0)\). \[ \scriptsize \begin{eqnarray*} p(\theta | \mathbf{y}_{1:t}) & \propto & \prod_{i=1}^{t} p(y_i|\mathbf{y}_{1:i-1},\theta) p(\theta) \\ & = & \prod_{i=1}^{t} \frac{1}{\sqrt{2\pi W}} \exp \left\{ -\frac{1}{2W}(y_i-\theta)^2 \right\} \frac{1}{\sqrt{2\pi C_0}}\exp\left\{-\frac{1}{2C_0}(\theta-m_0)^{2}\right\} \\ & \propto & \exp \left\{ - \frac{1}{2W} \left(\sum_{i=1}^{t}y_i^2 - 2\theta \sum^{t}_{i=1}y_i + t\theta^2 \right)- \frac{1}{2 C_0}(\theta^2 - 2\theta m_0 + m_0^2 ) \right\} \\ & \propto & \exp \left\{ - \frac{1}{2 W C_0} \left((t C_0 +W)\theta^2 - 2(t C_0 \bar{y} + W m_0 )\theta \right) \right\} \end{eqnarray*} \]

A Glimpse

The last expression \[\scriptsize \exp \left\{ - \frac{1}{2W C_0} \left((t C_0 +W)\theta^2 - 2(t C_0 \bar{y} + W m_0 )\theta \right) \right\}\] is proportional to \[\scriptsize \exp \left\{ - \frac{1}{2 W C_t}(\theta - m_t)^2 \right\}\] where \(m_t = \frac{tC_0 \bar{y} + W m_0}{tC_0 +W}, C_t = \frac{W C_0}{tC_0 +W}\).
This can be written as a recursive form \[\scriptsize m_t = m_{t-1} + \frac{C_{t-1}}{C_{t-1} +W}(y_t - m_{t-1}), C_t =\left(\frac{1}{W}+\frac{1}{C_{t-1}}\right)^{-1}\]

Kalman Filter

The general procedure is summarized as follows:

Posteriror at \(t-1\): \((\theta_{t-1} | \mathbf{y}_{1:(t-1)} ) \sim \mathcal{N}(m_{t-1},C_{t-1})\)
Prior at \(t\): \((\theta_{t} | \mathbf{y}_{1:t-1} ) \sim \mathcal{N}(m_{t-1},R_{t})\)
Marginal likelihood: \((y_{t} | \mathbf{y}_{1:(t-1)} ) \sim \mathcal{N}(m_{t-1},Q_{t})\)
Posteriro at \(t\): \((\theta_{t} | \mathbf{y}_{1:t} ) \sim \mathcal{N}(m_{t},C_{t})\)

where \(R_t = G_t C_{t-1} G_t^{T} + W_t\), \(Q_t = F^T_t R_t F_t + V_t\).

Code

y = y; R1 = 1; theta  = rep(0,n); R  = rep(0,n);
m = rep(0,n);  C = rep(0,n); 
# time t=1
theta[1] = 0; R[1] = 10; Q = R[1]+V
A = R[1]/Q;  m[1] = theta[1]+A*(y[1]-theta[1]); C[1] = R[1]-Q*A^2
# Filtering
for (t in 2:n){
    theta[t] = m[t-1];    R[t] = C[t-1] + W
    Q = R[t]+V; A = R[t]/Q
    m[t] = (1- A)*m[t-1] +A*y[t]; C[t] = A*V
}

Code

ts.plot(dropFirst(filter$m), col="black"); lines(m, lty=3)

Forecast

Predictions start from the posterior estimate of the state, obtained from the Kalman filter, on the time \(t\) on which the forecast is made.
One-step-ahead forecasts follows: \[ \mathbb{E}[y_{t+1} | \mathbf{y}_{1:t}] = F^{T}_{t+1} m_{t+1} \\ \mbox{Var}[y_{t+1} | \mathbf{y_{1:t}}] = F^{T}_{t+1}R_{t+1}F_{t+1} + V_{t+1} = Q_{t+1}\\ y_{t+1} \sim \mathcal{N}(\mathbb{E}[y_{t+1} | \mathbf{y}_{1:t}],\mbox{Var}[y_{t+1} | \mathbf{y_{1:t}}]) \]
This recursive scheme can be extended to \(k\)-steps ahead forecast.

Forecast

mforecast = dlmForecast(filter, nAhead=1);
mforecast$f

##          [,1]
## [1,] 1.831276

mforecast$Q

## [[1]]
##      [,1]
## [1,] 12.5

Smoothing

One of the attractive features of state space models is that estimation and forecasting can be applied sequentially, as new data become available.
However, when we obtain the observations up to time \(T\), we want to retrospectively reconstruct the behavior of the system.
In this case, one can use a backward-recursive algorithm to compute the conditional distributions of \(\theta_t\) given \(\mathbf{y}_{1:T}\), for any \(t < T\), starting from the filtering distribution \(p(\theta_T|\mathbf{y}_{1:T})\) and estimating backward all the states’ history.

Smoothing

The smoothing scheme is to look back to the system: retain the essential information and dump the unnecessary information.
For one step back, the procedure follows: \[p(\theta_{t-1}|\mathbf{y}_{1:t}) = \int p(\theta_{t-1} | \theta_t, \mathbf{y}_{1:t}) p(\theta_t | \mathbf{y}_{1:t}) d \theta_t\]

smooth = dlmSmooth(y, buildMod);

Smoothing

ts.plot(dropFirst(filter$m), col="black"); 
lines(dropFirst(smooth$s), col="grey")

Remark

To estimate the state vector we compute the conditional densities \(p(\theta_s | \mathbf{y}_{1:t})\).
We distinguish between problems of filtering (when \(s = t\)), forecasting (\(s > t\)) and smoothing (\(s < t\)).
It is worth underlining the difference between filtering and smoothing. In the filtering problem, the data are supposed to arrive sequentially in time. While in smoothing, it is a retrospective analysis where all the observations are collected.

Summary

An application of the probabilistic view
A recursive filter
Forecasting and smoothing

Last lecture (optional): review, start at 9:30am