August 2, 2017
Motivation
The ZINB-WaVE model
Comparison with existing approaches
Wagner, Regev, Yosef (2016)
The mean of the NB distribution is \(\mu\) and its variance is: \[ \sigma^2 = \mu + \frac{\mu^2}{\theta}. \] In particular, the NB distribution boils down to a Poisson distribution when \(\theta = +\infty\).
For any \(\pi\in[0,1]\), the PMF of the zero-inflated negative binomial (ZINB) distribution is given by
\[ f(y;\mu,\theta, \pi) = \pi \delta_0(y) + (1 - \pi) f_N(y;\mu,\theta), \quad \forall y\in\mathbb{N}, \]
where \(\delta_0(\cdot)\) is the Dirac function.
Here, \(\pi\) can be interpreted as the probability that a \(0\) is observed instead of the actual count, resulting in an inflation of zeros compared to the NB distribution, hence the name ZINB.
Given \(n\) samples and \(J\) genes, let \(Y_{ij}\) denote the count of gene \(j\) (for \(j=1,\ldots,J\)) for sample \(i\) (for \(i=1,\ldots,n\)).
ZINB-Wave preprint: doi.org/10.1101/125112
zinbwave package: bioconductor.org/packages/zinbwave
These slides: rpubs.com/daviderisso
To estimate the parameters \((\alpha, \beta, \gamma, W, \theta)\), we follow a penalized maximum likelihood approach, to reduce overfitting and improve the numerical stability of the optimization problem in the setting of many parameters.
\[ \max_{\beta,\gamma,W,\alpha,\zeta} \left\{\ell(\beta,\gamma,W,\alpha,\zeta) - \text{Pen}(\beta,\gamma,W,\alpha,\zeta) \right\}\,, \]
where \(\zeta = \log \theta\) and
\[ \text{Pen}(\beta,\gamma,W,\alpha,\zeta) = \frac{\epsilon_{\beta}}{2} \|\beta^0\|^2 + \frac{\epsilon_{\gamma}}{2} \|\gamma^0\|^2 + \frac{\epsilon_{W}}{2}\|W\|^2 + \frac{\epsilon_{\alpha}}{2}\|\alpha\|^2 + \frac{\epsilon_{\zeta}}{2}\text{Var}(\zeta) \,. \]
The penalized likelihood is not concave, making its maximization computationally challenging.
We instead find a local maximum, starting from a smart initialization and iterating a numerical optimization scheme until local convergence.
\[ \hat{\zeta} \leftarrow \arg\max_{\zeta} \left\{\ell(\hat{\beta}, \hat{\gamma}, \hat{W}, \hat{\alpha}, \zeta) - \frac{\epsilon_\zeta}{2} \text{Var}(\zeta) \right\} \,. \]
\[ \left(\hat{\gamma}, \hat{W} \right) \leftarrow \arg\max_{\left(\gamma, W\right)} \left\{\ell(\hat{\beta}, \gamma, W, \hat{\alpha}, \hat{\zeta}) - \frac{\epsilon_\gamma}{2} \|\gamma^0\|^2 - \frac{\epsilon_W}{2} \|W\|^2 \right\}\,. \]
\[ \left(\hat{\beta}, \hat{\alpha} \right) \leftarrow \arg\max_{\left(\beta, \alpha \right)} \left\{\ell(\beta, \hat{\gamma},\hat{W}, \alpha, \hat{\zeta}) - \frac{\epsilon_\beta}{2} \|\beta^0\|^2 - \frac{\epsilon_\alpha}{2} \|\alpha\|^2\right\}\,. \]
\[ \left(\hat{W}, \hat{\alpha} \right) \leftarrow \arg\min_{(W,\alpha)\,:\,W\alpha = \hat{W}\hat{\alpha}} \frac{1}{2} \left( \epsilon_W \|W\|^2 + \epsilon_\alpha \|\alpha\|^2 \right)\,. \]