June 27, 2017
Motivation
The ZINB-WaVE model
Estimation procedure
Comparison with existing approaches
Wagner, Regev, Yosef (2016)
For any \(\mu\geq 0\) and \(\theta>0\), the probability mass function (PMF) of the negative binomial (NB) distribution is
\[ f_N(y;\mu,\theta) = \frac{\Gamma(y+\theta)}{\Gamma(y+1)\Gamma(\theta)} \left(\frac{\theta}{\theta+\mu}\right)^{\theta} \left(\frac{\mu}{\mu + \theta}\right)^y, \quad \forall y\in\mathbb{N}. \]
The mean of the NB distribution is \(\mu\) and its variance is: \[ \sigma^2 = \mu + \frac{\mu^2}{\theta}. \] In particular, the NB distribution boils down to a Poisson distribution when \(\theta = +\infty\).
For any \(\pi\in[0,1]\), the PMF of the zero-inflated negative binomial (ZINB) distribution is given by
\[ f(y;\mu,\theta, \pi) = \pi \delta_0(y) + (1 - \pi) f_N(y;\mu,\theta), \quad \forall y\in\mathbb{N}, \]
where \(\delta_0(\cdot)\) is the Dirac function.
Here, \(\pi\) can be interpreted as the probability that a \(0\) is observed instead of the actual count, resulting in an inflation of zeros compared to the NB distribution, hence the name ZINB.
Given \(n\) samples and \(J\) genes, let \(Y_{ij}\) denote the count of gene \(j\) (for \(j=1,\ldots,J\)) for sample \(i\) (for \(i=1,\ldots,n\)).

To estimate the parameters \((\alpha, \beta, \gamma, W, \theta)\), we follow a penalized maximum likelihood approach, to reduce overfitting and improve the numerical stability of the optimization problem in the setting of many parameters.
\[ \max_{\beta,\gamma,W,\alpha,\zeta} \left\{\ell(\beta,\gamma,W,\alpha,\zeta) - \text{Pen}(\beta,\gamma,W,\alpha,\zeta) \right\}\,, \]
where \(\zeta = \log \theta\) and
\[ \text{Pen}(\beta,\gamma,W,\alpha,\zeta) = \frac{\epsilon_{\beta}}{2} \|\beta^0\|^2 + \frac{\epsilon_{\gamma}}{2} \|\gamma^0\|^2 + \frac{\epsilon_{W}}{2}\|W\|^2 + \frac{\epsilon_{\alpha}}{2}\|\alpha\|^2 + \frac{\epsilon_{\zeta}}{2}\text{Var}(\zeta) \,. \]
The penalized likelihood is not concave, making its maximization computationally challenging.
We instead find a local maximum, starting from a smart initialization and iterating a numerical optimization scheme until local convergence.
\[ \hat{\zeta} \leftarrow \arg\max_{\zeta} \left\{\ell(\hat{\beta}, \hat{\gamma}, \hat{W}, \hat{\alpha}, \zeta) - \frac{\epsilon_\zeta}{2} \text{Var}(\zeta) \right\} \,. \]
\[ \left(\hat{\gamma}, \hat{W} \right) \leftarrow \arg\max_{\left(\gamma, W\right)} \left\{\ell(\hat{\beta}, \gamma, W, \hat{\alpha}, \hat{\zeta}) - \frac{\epsilon_\gamma}{2} \|\gamma^0\|^2 - \frac{\epsilon_W}{2} \|W\|^2 \right\}\,. \]
\[ \left(\hat{\beta}, \hat{\alpha} \right) \leftarrow \arg\max_{\left(\beta, \alpha \right)} \left\{\ell(\beta, \hat{\gamma},\hat{W}, \alpha, \hat{\zeta}) - \frac{\epsilon_\beta}{2} \|\beta^0\|^2 - \frac{\epsilon_\alpha}{2} \|\alpha\|^2\right\}\,. \]
\[ \left(\hat{W}, \hat{\alpha} \right) \leftarrow \arg\min_{(W,\alpha)\,:\,W\alpha = \hat{W}\hat{\alpha}} \frac{1}{2} \left( \epsilon_W \|W\|^2 + \epsilon_\alpha \|\alpha\|^2 \right)\,. \]



qcpca <- prcomp(qc, scale. = TRUE)
qcmod <- model.matrix(~qcpca$x[,1:2]) head(qcmod)
## (Intercept) qcpca$x[, 1:2]PC1 qcpca$x[, 1:2]PC2 ## 1 1 -2.0656901 0.53766716 ## 2 1 1.8332237 -0.11599891 ## 3 1 1.2944881 -0.03171802 ## 4 1 0.5874294 0.87552522 ## 5 1 -2.0671591 1.12292102 ## 6 1 -2.1200918 0.94022293
zinbres <- zinbFit(filtered, K = 3, X = qcmod)
ZINB-Wave preprint: doi.org/10.1101/125112
zinbwave package: bioconductor.org/packages/zinbwave
These slides: rpubs.com/daviderisso