Introduction

We frequently look at assets or economic series that have different regimes or members. If we model the asset or series as a single entity, we miss some of the important underlying information. For example, we might want to model GDP growth or the performance of the stock market as having two regimes: boom and recession. A mixture model is one way to capture this.

Mixture model

A mixture model combines separate distributions, one for each regime. In our case, there is one distribution for the boom and one distribution for the recession. If we assume that each of these can be approximated by a normal distribution, we would describe the two regimes with two means and two standard deviations, together with the proportion of observations that belongs to each regime.
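
Written out, a two-regime normal mixture has the density below. The symbols are a notational choice made here to match the lambda, mu and sigma arguments used with mixtools later on: \lambda is the share of observations that comes from the first regime, and \phi is the normal density.

$$
f(x) = \lambda\,\phi(x;\mu_1,\sigma_1) + (1-\lambda)\,\phi(x;\mu_2,\sigma_2),
\qquad
\phi(x;\mu,\sigma) = \frac{1}{\sigma\sqrt{2\pi}}\exp\!\left(-\frac{(x-\mu)^2}{2\sigma^2}\right)
$$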

Exercise

For the return on the S&P 500 and for the rate of GDP growth, decide whether the mean and the standard deviation are higher in a boom or in a recession. Why?

Mixtools package

There is a package called mixtools that will fit mixture models in R. You can find details of the package here:

Mixtools

The package uses maximum likelihood: it chooses the parameter values under which the observed data are most probable. For some problems (such as ordinary least squares) there is an analytical solution that can be used to find the most likely coefficients; for many other problems, mixture models among them, the maximum likelihood can only be found by numerical methods. The EM in normalmixEM stands for expectation-maximization, the iterative algorithm the function uses.
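
As a minimal sketch of the numerical route, the code below fits a single normal distribution to simulated data by minimising the negative log-likelihood with optim, and compares the result with the closed-form estimates. The data are simulated, the true values (mean 10, standard deviation 2) are arbitrary, and names such as neg_loglik are illustrative only; this is not how mixtools works internally.

```r
# Simulated data; the true mean 10 and sd 2 are arbitrary choices for illustration
set.seed(1)
x <- rnorm(500, mean = 10, sd = 2)

# Negative log-likelihood of a single normal distribution.
# The sd is optimised on the log scale so that it always stays positive.
neg_loglik <- function(par, data) {
  -sum(dnorm(data, mean = par[1], sd = exp(par[2]), log = TRUE))
}

# Numerical search from a deliberately rough starting point
fit <- optim(par = c(5, 0), fn = neg_loglik, data = x)

numerical <- c(mean = fit$par[1], sd = exp(fit$par[2]))

# Closed-form maximum likelihood estimates for comparison
analytical <- c(mean = mean(x), sd = sqrt(mean((x - mean(x))^2)))

round(rbind(numerical, analytical), 3)
```

The two rows should agree closely. For a mixture there is no closed-form row to compare against, which is why an iterative method is needed.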

There is more mathematical detail about maximum likelihood here:

StatQuest Maximum Likelihood

An example

library(mixtools)
data(faithful)
hist(faithful$waiting, main = "Time between Old Faithful eruptions", xlab = "minutes")

Now use the normalmixEM function to estimate the parameters of the two-component mixture. The key arguments are starting values for the estimation: lambda is the initial share or proportion of each component, mu holds the initial means and sigma is the initial standard deviation.

wait <- normalmixEM(faithful$waiting, lambda = 0.5, mu = c(55, 80), sigma = 5)
## number of iterations= 9
wait[c('lambda', 'mu', 'sigma')]
## $lambda
## [1] 0.3608498 0.6391502
## 
## $mu
## [1] 54.61364 80.09031
## 
## $sigma
## [1] 5.869089 5.869089
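
The two sigma estimates above are identical. As far as the mixtools documentation describes it, a single starting value for sigma makes normalmixEM fit one common standard deviation for both components. A minimal sketch of a fit that lets each component keep its own standard deviation is shown below; the object names wait2 and regime are illustrative, and the starting values are simply the ones used above.

```r
library(mixtools)
data(faithful)

# One starting sd per component, so each component gets its own estimate
wait2 <- normalmixEM(faithful$waiting, lambda = 0.5,
                     mu = c(55, 80), sigma = c(5, 5))
wait2[c('lambda', 'mu', 'sigma')]

# posterior is a matrix with one row per observation and one column per
# component; pick the most probable component for each waiting time
regime <- apply(wait2$posterior, 1, which.max)
table(regime)
```

Separate standard deviations matter for the boom and recession case, where the two regimes may differ in volatility as well as in mean.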

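We can look at the estimated parameters of lambda, mu and sigma and plot the fitted mixture against the actual data. To do this we evaluate each estimated normal density with dnorm, weight it by its estimated share lambda, and add the two together.

hist(faithful$waiting, probability = TRUE, main = "Time between Old Faithful eruptions", xlab = "minutes")
x <- seq(min(faithful$waiting), max(faithful$waiting), length.out = 200)  # grid over the observed range
mix <- wait$lambda[1] * dnorm(x, wait$mu[1], wait$sigma[1]) + wait$lambda[2] * dnorm(x, wait$mu[2], wait$sigma[2])
lines(x, mix, col = 'red')  # fitted mixture density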

Exercise

  • Try to repeat this exercise for the returns on the S&P 500. You can use normal distributions and try to extract the means and standard deviations for a two-regime model. Do you get what you expect?