5.8 - Multinomial and bivariate normal
Introduction
- In this section we will study two “well-known” joint distributions
- Multinomial: multivariate extension of binomial
- Bivariate normal: bivariate extension of normal
Multinomial experiment
A multinomial experiment satisfies these properties:
- Experiment consists of \(n\) identical, independent trials.
- Outcome of each trial falls into one of \(k\) mutually exclusive classes/cells/buckets.
- Probability that each outcome falls into cell \(i\) is \(p_i\), \(i = 1,2,...,k\) with \(\sum_{i=1}^k p_i = 1\).
- Random variables of interest are \(Y_1\), \(Y_2\),…,\(Y_k\), with \(Y_i\) the number of trials for which the outcome falls into cell \(i\).
Multinomial pmf
The joint pmf of \((Y_1,Y_2,...,Y_k)\) is:
\[p(y_1,y_2,...,y_k) = P(Y_1 = y_1,Y_2=y_2,...,Y_k=y_k) = \frac{n!}{y_1!y_2!...y_k!}p_1^{y_1}p_2^{y_2}...p_k^{y_k},\]
with joint support:
\[y_i \in \{0,1,...,n\}; \sum_{i=1}^k y_i = n\]
Multinomial properties
- Marginally, \(Y_i \sim BIN(n, p_i)\)
- \(\Rightarrow E(Y_i) = np_i\)
- \(\Rightarrow Var(Y_i) = np_i(1-p_i)\)
- \(Cov(Y_i, Y_j) = -np_ip_j\)
Genetics example
- Suppose fruit flies contain a gene for body color
- \(A\): dominant allele = black
- \(a\): recessive allele = brown
- Flies inherit one allele from father and one from mother
- \(AA\), \(Aa\) = black; \(aa\) = brown
- Suppose \(A\) alleles occur with probability \(p\) and \(a\) with probability \(1-p\)
- If 60 fruit flies are sampled, let \((Y_1,Y_2,Y_3)\) be the number of flies with genotypes \((AA,Aa,aa)\) respectively.
\[(Y_1,Y_2,Y_3) \sim MULT(n=60, p^2, 2p(1-p), (1-p)^2)\]
- Suppose the \(A\) allele occurs 60% of the time. Find \(p(20,30,10)\).
Solving genetics example
\[(Y_1,Y_2,Y_3) \sim MULT(n=60, p^2, p(1-p), (1-p)^2)\]
- \(p=0.6\)
- Find \(p(20,30,10)\).
\[p(20,30,10) = \frac{60!}{20!30!10!} (0.36)^{20} (0.48)^{30}(0.16)^{10}\]
dmultinom(x = c(20,30,10), prob = c(.36, .48, .16))
Simulating \(N\) fruit fly samples
library(tidyverse)
(thousand_fly_samples <- rmultinom(1000, size = 60, prob = c(.36, .48, .16))
%>% t()
%>% data.frame()
%>% setNames(c('Y1','Y2','Y3'))
) %>% head
Y1 Y2 Y3
1 19 36 5
2 19 34 7
3 25 27 8
4 26 27 7
5 18 31 11
6 22 33 5
cov(thousand_fly_samples) %>% round(3)
Y1 Y2 Y3
Y1 13.490 -10.632 -2.859
Y2 -10.632 15.516 -4.885
Y3 -2.859 -4.885 7.744
library(GGally)
ggpairs(thousand_fly_samples,
lower = list(continuous = wrap("points",
alpha = 0.6,
position = position_jitter(width = 0.2,
height = 0.2)
)
)
) +
theme_classic(base_size = 1)
- Which \((Y_i, Y_j)\) pairs are most negatively correlated, and why?
Bivariate normal
- If \((X,Y) \sim BVN(\mu_X,\mu_Y,\sigma_X^2,\sigma_Y^2,\rho)\), then:
\[\small f(x,y) = \frac{1}{2\pi\sigma_X\sigma_Y\sqrt{1-\rho^2}} e^{-\frac{1}{2(1-\rho^2)}\left[\left(\frac{x-\mu_X}{\sigma_X}\right)^2 + \left(\frac{y-\mu_Y}{\sigma_Y}\right)^2 - \frac{2\rho(x-\mu_X)(y-\mu_Y)}{\sigma_X\sigma_Y}\right]},x, y \in \mathbb{R}\]
- Bivariate standard normal:
\[f(x,y) = \frac{1}{2\pi\sqrt{1-\rho^2}} e^{-\frac{x^2-2\rho xy + y^2}{2(1-\rho^2)}}, x, y \in \mathbb{R}\]
Properties of bivariate normal
- \(\rho = Cor(X,Y) = 0\) if and only if \(X\) and \(Y\) are independent!
- \(X\) and \(Y\) are marginally normally distributed
- For standard bivariate normal:
\[Y|X=x\sim N(\rho x, 1-\rho^2)\] \[X|Y=y \sim N(\rho y, 1-\rho^2)\]
- For full bivariate normal:
\[Y|X=x \sim N\left(\mu_Y + \rho\frac{\sigma_Y}{\sigma_X} (x-\mu_X), \sigma_Y^2 (1-\rho^2)\right)\]
Simulating bivariate normal data
- Suppose we want to simulate \(N\) realizations of \((X,Y)\sim BVN(\mu_X, \mu_Y, \sigma^2_X,\sigma^2_Y, \rho)\) for a grid of parameter values.
- Can use the following relationships. Given \(Z_1 \sim N(0,1)\) and \(Z_2 \sim N(0,1)\), with \(Z_1\) and \(Z_2\) independent:
\[X_{std} = Z_1\] \[Y_{std} = \rho Z_1 + \sqrt{1-\rho^2}Z_2\] \[X = \sigma_X \cdot X_{std}+ \mu_X\] \[Y = \sigma_Y \cdot Y_{std}+ \mu_Y\]
Simulating over a grid
- We can use these relationships to simulate over parameter grid using familiar techniques to get a data set like this:
# A tibble: 6 × 10
sigx sigy rho .trial z1 z2 x_std y_std x y
<dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
1 0.5 0.5 -0.8 1 -0.776 1.52 -0.776 1.54 4.61 5.77
2 0.5 0.5 -0.8 2 0.187 -2.52 0.187 -1.66 5.09 4.17
3 0.5 0.5 -0.8 3 -0.911 -0.727 -0.911 0.292 4.54 5.15
4 0.5 0.5 -0.8 4 -0.0697 1.82 -0.0697 1.15 4.97 5.57
5 0.5 0.5 -0.8 5 -1.39 -0.988 -1.39 0.520 4.30 5.26
6 0.5 0.5 -0.8 6 0.208 0.0933 0.208 -0.111 5.10 4.94
Plotting over a grid
- Given data shown, we can create the following:
library(ggh4x)
ggplot(data = bvnorm_sim) +
geom_point(aes(x = x, y = y), alpha = 0.5, size = .5) +
geom_point(aes(x = 5, y = 5), col='red') +
facet_nested(rho~sigx+sigy, labeller = label_both) +
theme_bw()