Mixture Models
- Mixture models arise naturally when measurements of individuals within a population can be considered to come from different underlying distributions
- For example, measurements on males and females
- For \(y = (y_1,...,y_n)\), the \(M\)-component mixture distribution is:
- \(f_m(y_i|\theta_m)\) is the distribution of \(y_i\) for the \(m\) component model
- Each distribution often from the same parametric family
- \(\lambda_m\) is the proportion of \(y_i\) from component \(m\)
- \(\sum^M_{m=1}\lambda_m = 1\)
\[ f(y_i|\theta,\lambda) = \sum^M_{m=1}\lambda_mf_m(y_i|\theta_m) \]
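To make this concrete, here is a minimal sketch that evaluates the mixture density for a two-component Gaussian mixture; the weights `lam`, means `mu`, and standard deviations `sd` are illustrative assumptions, not values from these notes.

```python
import numpy as np
from scipy.stats import norm

# Illustrative two-component Gaussian mixture (assumed parameter values)
lam = np.array([0.4, 0.6])   # mixture weights lambda_m, sum to 1
mu = np.array([-2.0, 3.0])   # component means
sd = np.array([1.0, 1.5])    # component standard deviations

def mixture_pdf(y):
    """Evaluate f(y|theta,lambda) = sum_m lambda_m f_m(y|theta_m)."""
    y = np.atleast_1d(y)
    comp = norm.pdf(y[:, None], loc=mu, scale=sd)  # shape (len(y), M)
    return comp @ lam                              # weighted sum over components

print(mixture_pdf([-2.0, 0.0, 3.0]))
```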
Fitting Mixture Models
- Consider an indicator variable:
\[ z_{im} = \begin{cases}1,\quad\text{if } y_i \text{ is drawn from the } m\text{th component}\\0,\quad\text{otherwise}\end{cases}\]
- Distribution of this indicator variable \(z_i = (z_{i1},...,z_{iM})\):
- This distribution is such that \(P(z_{im}=1) = \lambda_m\)
\[ z_i|\lambda\sim Multinomial(1;\lambda_1,...,\lambda_M),\qquad \pi(z_i|\lambda) = \prod^M_{m=1}\lambda_m^{z_{im}}\]
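The indicator formulation also gives a direct recipe for simulating mixture data: draw each label \(z_i\) with probabilities \(\lambda\), then draw \(y_i\) from the selected component. A minimal sketch, reusing the assumed Gaussian components above:

```python
import numpy as np

rng = np.random.default_rng(0)
lam = np.array([0.4, 0.6])   # assumed weights, as above
mu = np.array([-2.0, 3.0])
sd = np.array([1.0, 1.5])

n = 500
# z_i ~ Multinomial(1; lambda): labels[i] = m encodes z_im = 1
labels = rng.choice(len(lam), size=n, p=lam)
# Given z_i, draw y_i from the selected component
y = rng.normal(loc=mu[labels], scale=sd[labels])
```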
Mixture Joint Distribution
- Distribution of \(y_i\):
\[ f(y_i|z_i,\theta) = \prod^M_{m=1}f_m(y_i|\theta_m)^{z_{im}}\]
- Therefore the likelihood:
\[ L(y|z,\theta) = \prod^n_{i=1} f(y_i|z_i,\theta) = \prod^n_{i=1}\left[\prod^M_{m=1}f_m(y_i|\theta_m)^{z_{im}}\right]\]
- Distribution of \(z|\lambda\):
\[ \pi(z|\lambda) = \prod^n_{i=1}\prod^M_{m=1}\lambda_m^{z_{im}}\]
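For the Gaussian sketch above, the complete-data log-likelihood \(\log L(y|z,\theta)+\log\pi(z|\lambda)\) reduces to a sum over observations of the assigned component's log-density plus the log-weight; the hard-label encoding `labels[i] = m` for \(z_{im}=1\) is an implementation choice:

```python
import numpy as np
from scipy.stats import norm

def complete_data_loglik(y, labels, mu, sd, lam):
    """log L(y|z,theta) + log pi(z|lambda) for Gaussian components,
    with labels[i] = m encoding z_im = 1."""
    y, labels = np.asarray(y), np.asarray(labels)
    # Each observation contributes log f_m(y_i|theta_m) + log lambda_m
    return np.sum(norm.logpdf(y, loc=mu[labels], scale=sd[labels])
                  + np.log(lam[labels]))
```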
Mixture Prior Distribution
- Prior for \(\theta,\lambda\) typically assumes independence:
\[ \pi(\theta,\lambda)=\pi(\theta)\pi(\lambda)\]
- Natural prior for \(\lambda\) is \(Dirichlet(\alpha_1,...,\alpha_M)\)
- Conjugate with multinomial
- Multivariate generalisation of the beta: marginally, \(\lambda_m\sim Beta(\alpha_m,\sum_{j\neq m}\alpha_j)\)
\[ \pi(\lambda)\propto\prod^M_{m=1}\lambda_m^{\alpha_m-1}\]
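NumPy can sample this prior directly; the sketch below also checks the beta-marginal property empirically. The symmetric \(Dirichlet(1,1,1)\) choice is an assumption for illustration:

```python
import numpy as np

rng = np.random.default_rng(0)
alpha = np.ones(3)                            # assumed symmetric Dirichlet(1,1,1)
lam_draws = rng.dirichlet(alpha, size=10000)  # each row sums to 1

# Marginal check: lambda_1 ~ Beta(alpha_1, sum_{j != 1} alpha_j), mean 1/3 here
print(lam_draws[:, 0].mean(), alpha[0] / alpha.sum())
```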
Mixture Posterior Distribution
- The mixture posterior distribution is then:
- Commonly \(\pi(\theta)=\prod_m\pi_m(\theta_m)\)
\[ \pi(\theta,z,\lambda|y)\propto L(y|z,\theta)\pi(z|\lambda)\pi(\lambda)\pi(\theta)\]
\[ \propto \prod^n_{i=1}\left[\prod^M_{m=1}[\lambda_mf_m(y_i|\theta_m)]^{z_{im}}\right]\left[\prod^M_{m=1}\lambda_m^{\alpha_m-1}\right]\pi(\theta)\]
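Combining the pieces, the unnormalised log-posterior is the complete-data log-likelihood plus the log prior kernels; a sketch reusing `complete_data_loglik` from above, with assumed independent \(N(\mu_0,\tau_0^2)\) priors playing the role of \(\pi_m(\theta_m)\):

```python
import numpy as np
from scipy.stats import norm

def log_posterior(y, labels, mu, sd, lam, alpha, mu0=0.0, tau0=10.0):
    """Unnormalised log pi(theta, z, lambda | y) for the Gaussian sketch."""
    log_lik = complete_data_loglik(y, labels, mu, sd, lam)      # L(y|z,theta) pi(z|lambda)
    log_prior = np.sum((alpha - 1) * np.log(lam))               # Dirichlet kernel pi(lambda)
    log_prior += np.sum(norm.logpdf(mu, loc=mu0, scale=tau0))   # assumed pi(theta)
    return log_lik + log_prior
```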
MCMC Sampler
The indicator variables make the full conditional distributions easy to derive, so we can use a Gibbs sampler with the following updates (a combined sketch follows the list):
- Update \(\theta_m\):
\[ \pi(\theta_m|y,z,\lambda,\theta_{-m}) \propto \prod^n_{i=1}f_m(y_i|\theta_m)^{z_{im}}\pi_m(\theta_m|\theta_{-m})\]
- Update \(\lambda\)
\[ \pi(\lambda|y,\theta,z) \propto \prod^M_{m=1}\lambda_m^{\sum_i z_{im}+\alpha_m - 1}\]
- i.e. \(\lambda|y,\theta,z \sim Dirichlet(\alpha_1+\sum_i z_{i1},...,\alpha_M+\sum_i z_{iM})\)
- Update \(z_i\):
\[ \pi(z_i|y,\theta,\lambda,z_{-i}) \propto \prod^M_{m=1}[\lambda_mf_m(y_i|\theta_m)]^{z_{im}}\]
- \(P(z_{im}=1|\lambda,\theta) \propto \lambda_mf_m(y_i|\theta_m)=p_{im}\), with normalised probabilities \(\tilde p_{im}=p_{im}/\sum_{m'} p_{im'}\)
- Therefore, the full conditional is
\[ z_i\sim Multinomial(1;\tilde{p}_{i1},...,\tilde{p}_{iM})\]
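Putting the three updates together, below is a minimal Gibbs sampler sketch for a one-dimensional Gaussian mixture. To keep the \(\theta_m\) update conjugate it assumes a known common standard deviation `sigma`, independent \(N(\mu_0,\tau_0^2)\) priors on the component means, and a symmetric \(Dirichlet(\alpha_0)\) prior on the weights; these specific modelling choices are assumptions, not part of the notes.

```python
import numpy as np
from scipy.stats import norm

def gibbs_mixture(y, M, n_iter=2000, alpha0=1.0, mu0=0.0, tau0=10.0,
                  sigma=1.0, seed=0):
    """Gibbs sampler sketch: Gaussian mixture with known common sd sigma,
    N(mu0, tau0^2) priors on the means, Dirichlet(alpha0) prior on weights."""
    rng = np.random.default_rng(seed)
    y = np.asarray(y, dtype=float)
    n = len(y)
    mu = rng.normal(mu0, tau0, size=M)   # initialise theta from the prior
    lam = np.full(M, 1.0 / M)            # initialise lambda uniformly
    mu_out = np.empty((n_iter, M))
    lam_out = np.empty((n_iter, M))
    z_out = np.empty((n_iter, n), dtype=int)

    for t in range(n_iter):
        # --- update z_i: P(z_im = 1 | ...) propto lam_m f_m(y_i|theta_m) ---
        logp = np.log(lam) + norm.logpdf(y[:, None], loc=mu, scale=sigma)
        p = np.exp(logp - logp.max(axis=1, keepdims=True))
        p /= p.sum(axis=1, keepdims=True)
        z = np.array([rng.choice(M, p=pi) for pi in p])

        # --- update lambda | z ~ Dirichlet(alpha0 + counts) ---
        counts = np.bincount(z, minlength=M)
        lam = rng.dirichlet(alpha0 + counts)

        # --- update mu_m | y, z: conjugate Normal posterior per component ---
        for m in range(M):
            ym = y[z == m]
            prec = 1.0 / tau0**2 + len(ym) / sigma**2
            mean = (mu0 / tau0**2 + ym.sum() / sigma**2) / prec
            mu[m] = rng.normal(mean, np.sqrt(1.0 / prec))

        mu_out[t], lam_out[t], z_out[t] = mu, lam, z
    return mu_out, lam_out, z_out
```

Sampling \(z_i\) on the log scale with a max-shift before exponentiating avoids numerical underflow when the components are well separated.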
Inference - Predictive Distribution
- The predictive distribution for a new observation \(y\), given observed data \(x\), is:
\[ f(y|x) = \int\sum^M_{m=1}\lambda_mf_m(y|\theta_m)\pi(\theta,z,\lambda|x)d\theta d\lambda dz\]
- For classification:
- Can approximate through Monte Carlo, using posterior draws \((\theta^{(j)},\lambda^{(j)})\), \(j=1,...,N\) (see the sketch after this list)
\[ P(z_{(x)m}=1|x,y)\propto\int\lambda_m f_m(y|\theta_m)\pi(\theta,z,\lambda|x)d\theta d\lambda dz\approx \frac{1}{N}\sum^N_{j=1}\lambda^{(j)}_m f_m(y|\theta^{(j)}_m)\]
- or, for an observed \(y_i\), can consider the proportion of MCMC draws in which it is allocated to component \(m\)
\[ \mathbb{E}_\pi[I(z_{im}=1)]\approx\frac{1}{N}\sum^N_{j=1}z_{im}^{(j)}\]
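Both approximations are cheap to compute from stored draws. A sketch, using the (hypothetical) `mu_out` and `lam_out` arrays of shape `(draws, M)` returned by the Gibbs sampler above:

```python
import numpy as np
from scipy.stats import norm

def predictive_density(y_new, mu_out, lam_out, sigma=1.0, burn=500):
    """Monte Carlo estimate of f(y_new|x): average of
    sum_m lambda_m^(j) f_m(y_new|theta_m^(j)) over post-burn-in draws."""
    mu, lam = mu_out[burn:], lam_out[burn:]
    dens = lam * norm.pdf(y_new, loc=mu, scale=sigma)  # shape (draws, M)
    return dens.sum(axis=1).mean()

def allocation_probs(y_new, mu_out, lam_out, sigma=1.0, burn=500):
    """Normalised Monte Carlo estimates of the component membership
    probabilities for a new observation y_new."""
    p = (lam_out[burn:] * norm.pdf(y_new, loc=mu_out[burn:], scale=sigma)).mean(axis=0)
    return p / p.sum()
```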
Non-Identifiability of Mixture Models
- Relabeling components by arbitrarily permuting index set \(1,...,M\) for \((\lambda_m,\theta_m,z_{im})\) does not change mixture pdf value
- Caused by the assumed exchangeability of the components
- All \(\theta_m\) have the same marginal posterior distribution due to label switching
- Can resolve this issue through prior information
- Constrain the parameter space to a subregion containing exactly one of the equivalent permutations, so the model is uniquely identified
- Order weights:
\[ \lambda_1 > ... > \lambda_M\]
- Or order parameters:
\[ \theta_1 > ... > \theta_M\]
- Or use an informative prior identifying components with specific sub populations
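A simple post-hoc version of the parameter-ordering constraint is to permute the components within each MCMC draw so the means are sorted (ascending here); a sketch on the hypothetical sampler output above, noting that the stored \(z\) draws would need the matching relabelling applied as well:

```python
import numpy as np

def relabel_by_means(mu_out, lam_out):
    """Within each draw, permute components so mu_1 < ... < mu_M,
    applying the same permutation to the weights."""
    order = np.argsort(mu_out, axis=1)
    return (np.take_along_axis(mu_out, order, axis=1),
            np.take_along_axis(lam_out, order, axis=1))
```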