In this document we simulate a stochastic volatility hidden Markov model and demonstrate our implementation of three particle filtering algorithms: sequential importance sampling (SIS), sequential importance resampling (SIR) and a resample move particle filter (RMPF).

After simulating data for the hidden Markov model, we apply the three filtering algorithms, assess their performance and discuss some of the associated theory.

Introduction

We consider a hidden Markov model for stochastic volatility. It consists of a Markov chain \(X_n\), modeling the volatility of an asset, and the observed returns \(Y_n\) of that asset. The model is defined as follows:

\[\begin{align} X_n &= \rho X_{n-1} + \sigma V_n, &V_n \sim N(0,1)\\ Y_n &= \tau \exp \left( \frac{X_n}{2} \right) W_n, &W_n \sim N(0,1) \end{align}\]

for constants \(\rho, \sigma, \tau \in \mathbb{R}\) with \(|\rho| < 1\). We assume that \(X_0 \sim N \left(0, \frac{\sigma^2}{1-\rho^2} \right)\), the stationary distribution of the chain. The noise terms \(V_n, W_n\) are independent standard Gaussians.

Our task is to infer the unobservable (hidden) state \(X_n\) from the history of observations \(Y_0, \dots, Y_n\). More precisely, the filtering problem consists of estimating the conditional distribution \[ \mathbb{P}(X_n \in \cdot \mid Y_0, \dots, Y_n). \] Equivalently, we can estimate the expectation \[ \mathbb{E}(\varphi(x_n) \mid y_0, \dots, y_n) = \int \varphi(x_n) \, p(d x_n \mid y_0, \dots, y_n) \] for any (sufficiently regular) function \(\varphi\); for instance, \(\varphi(x) = x\) yields the filtering mean.

Simulating the hidden Markov model

We begin by simulating the hidden states \(X_n\) and observations \(Y_n\). Later, we will infer the state \(X_n\) using only the observations \(Y\) and the parameters of the model. We choose the parameters

\[ \rho = 0.9, \quad \sigma = 1.05, \quad \tau = 0.7. \] Running a simulation of the hidden and observable states yields the following trajectories:

We store the resulting time series in the vectors X (hidden states) and Y (observed states), respectively.
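A minimal simulation sketch is given below. The names T_len and the seed are illustrative, and params is assumed here to be a list with entries rho, sigma and tau; the exact structure expected by particlefilters.R may differ.

set.seed(1)                                    # illustrative seed
T_len  <- 500                                  # illustrative number of time steps
params <- list(rho = 0.9, sigma = 1.05, tau = 0.7)

X <- numeric(T_len)
Y <- numeric(T_len)
X[1] <- rnorm(1, mean = 0, sd = params$sigma / sqrt(1 - params$rho^2))  # stationary initial state
Y[1] <- params$tau * exp(X[1] / 2) * rnorm(1)
for (n in 2:T_len) {
  X[n] <- params$rho * X[n - 1] + params$sigma * rnorm(1)   # hidden AR(1) volatility
  Y[n] <- params$tau * exp(X[n] / 2) * rnorm(1)             # observed return
}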

Sequential Importance (Re-)Sampling

We implement sequential importance sampling (SIS), sequential importance resampling (SIR) and a resample move particle filter (RMPF). The algorithms are sequential versions of importance sampling and can be used for online inference in hidden Markov models.

For each time step \(n\), we simulate a population of \(P\) particles \(X^{(i)}_n, i = 1, \dots, P\), together with associated weights \(W^{(i)}_n\) (not to be confused with the observation noise \(W_n\)). The weights satisfy \(\sum_{i=1}^P W^{(i)}_n = 1\). To estimate the expectation of a function \(\varphi(x_n)\), given that we have observed \(y_0, \dots, y_n\), we can use the following formula:

\[\begin{align} \mathbb{E}(\varphi(x_n) \mid y_0, \dots, y_n) &= \int \varphi(x_n) \, p(d x_n \mid y_0, \dots, y_n) \\ &\approx \sum_{i=1}^P \varphi(X_n^{(i)}) W^{(i)}_n. \end{align}\]
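As a concrete illustration of this estimate (particles_n and weights_n are hypothetical names for the particle values and normalised weights at time \(n\)):

phi      <- function(x) x                          # identity: estimates the filtering mean
estimate <- sum(phi(particles_n) * weights_n)      # weighted particle approximation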

A detailed description of the SIR/SIS algorithms can be found in the Wikipedia article on particle filters. For an explanation of the resample move particle filter, see the Advanced Computational Methods in Statistics course by Nikolas Kantas.

The function SequentialImportanceSampling accepts the observed data Y, the model parameters params, and several hyperparameters. All three algorithms use the bootstrap proposal. The MCMC (Markov chain Monte Carlo) algorithm used for the resample move is random walk Metropolis–Hastings. The implementation of the algorithms is contained in the particlefilters.R file, which is also part of this repository.

N <- 700 # number of particles
# Plain sequential importance sampling (no resampling)
SIS <- SequentialImportanceSampling(data = Y,
                                    params = params,
                                    num_particles = N)
# Sequential importance resampling (resample at each step)
SIR <- SequentialImportanceSampling(data = Y,
                                    params = params,
                                    num_particles = N,
                                    resample = TRUE)
# Resample move particle filter (resampling followed by MCMC moves)
RMPF <- SequentialImportanceSampling(data = Y,
                                     params = params,
                                     num_particles = N,
                                     resample = TRUE,
                                     move = TRUE,
                                     rsmpl_steps = 5,
                                     stepSize = 4)

We compute the root mean squared error of each algorithm's state estimate against the true hidden states:

                          SIS        SIR        RMPF
Root mean squared error   2.335739   1.118346   1.055472
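For reference, the RMSE could be computed along the following lines, assuming (hypothetically) that each run exposes its estimated filtering means in a field filtering_means; the actual return structure of SequentialImportanceSampling may differ.

rmse <- function(run) sqrt(mean((run$filtering_means - X)^2))  # filtering_means is a hypothetical field
c(SIS = rmse(SIS), SIR = rmse(SIR), RMPF = rmse(RMPF))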

Fine-tuning of the MCMC step size

The step size of the MCMC move needs to be fine-tuned to ensure that the acceptance ratio of the Metropolis–Hastings algorithm lies in a good range (between \(0.2\) and \(0.4\)). Here we adjusted the step size manually for the given model parameters; in practice, conducting a hyperparameter search might be advantageous. For the current run we obtained a mean acceptance ratio of 0.24 with variance 0.00027.
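One hedged way to automate this tuning is a simple grid search over candidate step sizes, assuming (hypothetically) that the RMPF run reports its per-step acceptance ratios in a field acceptance_ratios:

step_sizes <- c(1, 2, 4, 8)                       # illustrative candidate step sizes
mean_acc <- sapply(step_sizes, function(s) {
  run <- SequentialImportanceSampling(data = Y, params = params,
                                      num_particles = N, resample = TRUE,
                                      move = TRUE, rsmpl_steps = 5, stepSize = s)
  mean(run$acceptance_ratios)                     # acceptance_ratios is a hypothetical field
})
step_sizes[which.min(abs(mean_acc - 0.3))]        # step size closest to a target ratio of ~0.3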

Weight degeneracy

We observe that the SIS algorithm tracks the true state well for a few (~30) time steps, after which its estimate departs significantly from the true hidden state. This may be explained by weight degeneracy: as \(n\) grows, the weights \(W^{(i)}_n\) of most particles tend to \(0\), while a few particles have weights very close to \(1\). This leaves only a handful of particles that actually contribute to the estimate. The SIR algorithm addresses this issue by resampling particles according to their weights, an idea inspired by genetic algorithms.
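For reference, here is a minimal sketch of a multinomial resampling step, one common choice (not necessarily the scheme used in particlefilters.R); weights and particles are illustrative names for the normalised weights and particle values at the current time step.

ancestors <- sample(seq_along(weights), size = length(weights),
                    replace = TRUE, prob = weights)     # draw indices proportionally to weight
particles <- particles[ancestors]                       # duplicate heavy particles, drop light ones
weights   <- rep(1 / length(weights), length(weights))  # reset to uniform weights after resampling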

We can assess the effect of weight degeneracy using the effective sample size (ESS). It is defined in terms of the weights at time \(n\) as \[ \text{ESS}_n = \left( \sum_{i=1}^P \left(W^{(i)}_n\right)^2 \right)^{-1} \in [1,P]. \] An effective sample size of \(P\) (the total number of particles used in the simulation) indicates that the algorithm performs as well as perfect Monte Carlo sampling, while an effective sample size close to \(1\) suggests poor performance. Below we plot \(\text{ESS}_n\) for all three algorithms.
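The ESS is cheap to compute from the normalised weights; for example, for a \(P \times T\) matrix W of weights with one column per time step (an illustrative layout):

ess <- 1 / colSums(W^2)   # ESS_n = 1 / sum_i (W_n^(i))^2, one value per time step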

The mean effective sample size for each of the algorithms is

                                       SIS   SIR   RMPF
Mean effective sample size (rounded)     6   466    464
Number of particles in simulation      700   700    700

Path degeneracy

Resampling in the SIR algorithm introduces the issue of path degeneracy: repeated resampling duplicates some particles and discards others, so the ancestral paths of the surviving particles coalesce. As a consequence, marginals of the form \(p(x_{n-L:n} \mid y_0, \dots, y_n)\) for small \(L\) are well approximated at the expense of losing particle diversity at earlier times. To counteract path degeneracy, the RMPF algorithm moves each particle \(X^{(i)}\) using a few iterations of a Markov chain Monte Carlo (MCMC) algorithm targeting the correct density, which re-introduces particle diversity.
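As an illustration of such a move (a sketch, not necessarily identical to the implementation in particlefilters.R), a single random walk Metropolis–Hastings update of a particle's latest state, holding its ancestor \(x_{n-1}\) fixed, can target the density proportional to \(p(y_n \mid x_n)\, p(x_n \mid x_{n-1})\):

log_target <- function(x_n, x_prev, y_n, params) {
  dnorm(y_n, mean = 0, sd = params$tau * exp(x_n / 2), log = TRUE) +      # log p(y_n | x_n)
    dnorm(x_n, mean = params$rho * x_prev, sd = params$sigma, log = TRUE) # log p(x_n | x_{n-1})
}

rw_move <- function(x_n, x_prev, y_n, params, stepSize) {
  proposal  <- x_n + stepSize * rnorm(1)                         # symmetric random walk proposal
  log_alpha <- log_target(proposal, x_prev, y_n, params) -
               log_target(x_n, x_prev, y_n, params)
  if (log(runif(1)) < log_alpha) proposal else x_n               # accept or keep the current value
}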

Monte Carlo variance

Finally, we compute the Monte Carlo variance of the estimated filtering mean for each of the three algorithms: we simulate multiple independent runs of each algorithm and compute the variance of the estimated mean across runs, as a function of time.
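A hedged sketch of this computation for, say, the SIR filter, again assuming the hypothetical filtering_means field:

n_runs <- 50                                        # illustrative number of independent runs
means  <- replicate(n_runs, {
  run <- SequentialImportanceSampling(data = Y, params = params,
                                      num_particles = N, resample = TRUE)
  run$filtering_means                               # hypothetical field: estimated mean per time step
})
mc_var <- apply(means, 1, var)                      # Monte Carlo variance at each time step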