5.1 - bivariate discrete distributions
Motivation
Consider the data set below, examples of which you have hopefully seen many times:
age smoker married gestation bwt
1 30 N married 38 3540
2 30 N married 39 3090
3 23 Y single 34 2522
4 32 N married 38 3690
5 26 N married 41 3062
6 31 N married 40 3856
7 36 N married 41 4451
8 27 N single 42 3685
9 22 Y single 38 3118
10 34 N married 38 3152
This is an example of multivariate data
\(Y_1\) : Normal?
\(Y_2\) , \(Y_3\) : Bernoulli?
\(Y_4\) , \(Y_5\) : Inverted gamma?
In this example, each observation is a length\(-p\) vector, with \(p=5\)
Multivariate relationships
The “one-by-one” distributions are marginal distributions
In statistics of course, we usually care not just about univariate distributions, but about multivariate distributions
These multivariate distributions are often called joint distributions
Multivariate discrete
Let \(\{Y_1,...,Y_p\}\in \mathbb{R}^p\) be a \(p-\) vector of discrete random variables.
Then the joint probability mass function for \(\{Y_1,...,Y_p\}\) is:
\[p(y_1,y_2,...,y_p) = P(Y_1 = y_1, Y_2 = y_2,...,Y_p = y_p), y_1 \in \mathbb{R}, ..., y_p \in \mathbb{R}\]
Bivariate discrete
To study mathematical properties we will for the most part stay simple, and let \(p=2\)
We will use \(X \equiv Y_1\) and \(Y\equiv Y_2\) , for simplicity of notation.
Joint pmf properties (bivariate case) :
\(p(x,y) \geq 0\) for any \(x,y \in \mathbb{R}\) .
\(\sum_{x}\sum_{y}p(x,y) = 1\) (sum is over all \(x,y\) values such that \(p(x,y) > 0\) .)
Joint CDF
The joint cumulative distribution function (CDF) is defined as a straightforward extension of the univariate case.
For any random variables \(X\) and \(Y\) , the joint bivariate distribution function \(F\) is:
\[F(x,y) = P(X\leq x, Y \leq y); x,y \in \mathbb{R}\]
Properties:
\(\lim_{x\rightarrow -\infty}F(x,y) =\lim_{y\rightarrow -\infty}F(x,y) = 0\)
\(\lim_{x\rightarrow \infty}\lim_{y\rightarrow \infty}F(x,y) =1\)
Bivariate non-decreasing property (cumulative probability over a rectangular region is \(\geq 0\) ): If \(x < x^*\) and \(y < y^*\) , then \(F(x^*, y^*) - F(x^*,y)-F(x,y^*)+F(x,y)= P(x < X \leq x^*, y < Y \leq y^*) \geq 0\)
Visual of Property 3
Bivariate non-decreasing property: If \(x < x^*\) and \(y < y^*\) , then \(F(x^*, y^*) - F(x^*,y)-F(x,y^*)+F(x,y)= P(x < X \leq x^*, y < Y \leq y^*) \geq 0\)
Marginal
If \(X\) and \(Y\) have joint pmf \(p(x,y)\) , then the marginal (aka unconditional ) pmfs are given by:
\[p_X(x) = P(X=x) = \sum_{all\ y} p(x,y)\]
\[p_Y(y) = P(Y=y) = \sum_{all\ x} p(x,y)\]
It’s the subscript on these functions that’s important!
\(p_X(y)\) makes sense!
Just “the marginal probability that \(X\) takes on specific value \(y\) ”
Plug value \(y\) into function \(p_X\) .
Conditional
If \(X\) and \(Y\) have joint pmf \(p(x,y)\) and marginal pmfs \(p_X\) and \(p_Y\) , respectively, then the conditional discrete probability function is:
\[p_{X|Y}(x|y) = P(X=x|Y=y) = \frac{p(x,y)}{p_Y(y)} =\frac{P(X=x, Y=y)}{P(Y=y)},\] we analogously define \(p_{Y|X}(y|x) = p(x,y)/p_X(x)\) .
Note from here we can find a joint by combining conditional and marginal:
\[p(x,y) = p_{X|Y}(x|y) p_Y(y)\] \[p(x,y) = p_{Y|X}(y|x) p_X(x)\]
Example
Suppose \(X\) and \(Y\) have joint pmf given by:
\[p(x,y) = \left\{\begin{array}
{ll}
\frac{1}{30}(x+y) & x=0,1,2; y=0,1,2,3 \\
0 & otherwise\\
\end{array}\right.\]
0
0/30
1/30
2/30
1
1/30
2/30
3/30
2
2/30
3/30
4/30
3
3/30
4/30
5/30
Verify that this is a valid joint pmf
\[\sum_{x=0}^2 \sum_{y=0}^3\frac{1}{30}(x+y)\] \[= \sum_{x=0}^2 \left(\frac{x+0}{30} + \frac{x+1}{30} + \frac{x+2}{30} + \frac{x+3}{30}\right)\]
\[=\left(\frac{0+0}{30} + \frac{0+1}{30} + \frac{0+2}{30} + \frac{0+3}{30}\right)\] \[+\left(\frac{1+0}{30} + \frac{1+1}{30} + \frac{1+2}{30} + \frac{1+3}{30}\right)\] \[+\left(\frac{2+0}{30} + \frac{2+1}{30} + \frac{2+2}{30} + \frac{2+3}{30}\right)\]
Joint CDF
\(F(1,1) = \frac{0+0}{30} + \frac{1+0}{30} + \frac{0+1}{30} + \frac{1+1}{30}\)
Marginal of \(X\)
\[p_X(0) = \sum_{y=0}^3 p(0,y) =\left(\frac{0+0}{30} + \frac{0+1}{30} + \frac{0+2}{30} + \frac{0+3}{30}\right) =\frac{6}{30}\] \[p_X(1) = \sum_{y=0}^3 p(1,y) =\left(\frac{1+0}{30} + \frac{1+1}{30} + \frac{1+2}{30} + \frac{1+3}{30}\right) =\frac{10}{30}\] \[p_X(2) = \sum_{y=0}^3 p(2,y) =\left(\frac{2+0}{30} + \frac{2+1}{30} + \frac{2+2}{30} + \frac{2+3}{30}\right) =\frac{14}{30}\]
\(P(X=x)\)
6/30
10/30
14/30
Conditional \(p_{Y|X}\)
\(P(X=x)\)
6/30
10/30
14/30
0
0/30
1/30
2/30
1
1/30
2/30
3/30
2
2/30
3/30
4/30
3
3/30
4/30
5/30
\(p_{Y|0} = P(Y=y|X=0)\) :
0
0/30/(6/30) = 0/6
1
1/30/(6/30) = 1/6
2
2/30/(6/30) = 2/6
3
3/30/(6/30) = 3/6
\(p_{Y|1} = P(Y=y|X=1)\) :
0
1/30/(10/30) = 1/10
1
2/30/(10/30) = 2/10
2
3/30/(10/30) = 3/10
3
4/30/(10/30) = 4/10
Marginal CDFs from joint CDF
If \(F(x,y)\) is a joint CDF, then:
\[F_X(x) = P(X\leq x) =\lim_{y\rightarrow \infty} F(x,y)\] \[F_Y(y) = P(Y\leq y)= \lim_{x\rightarrow \infty} F(x,y)\]
Should make sense: \(P(X\leq x) = P(X \leq x, Y<\infty)\)
Marginal CDFs for example
0
0/30
1/30
2/30
1
1/30
2/30
3/30
2
2/30
3/30
4/30
3
3/30
4/30
5/30
\[F_X(0) = \frac{0+1+2+3}{30} = \frac{6}{30}\]
\[F_X(1) = \frac{6}{30}+\frac{1+2+3+4}{30} = \frac{16}{30}\]
\[F_X(2) = \frac{16}{30}+\frac{2+3+4+5}{30} = 1\]
\[F_X(x)= \begin{cases} 0& x < 0 \\
\frac{6}{30} & 0 \leq x < 1\\
\frac{16}{30} & 1 \leq x < 2\\
1 & x \geq 2
\end{cases}\]
\[F_Y(y)= \begin{cases} 0& y < 0 \\
\frac{3}{30} & 0 \leq y < 1\\
\frac{9}{30} & 1 \leq y < 2\\
\frac{18}{30} & 2 \leq y < 3\\
1 & y \geq 3
\end{cases}\]
Dart example
A bag has 6 boxes.
Three boxes have 3 darts,
Two of them have 4 darts,
One box has 5 darts.
A player picks a box at random, then shoot all the darts in the box at a target.
Suppose the player can hit the target 60% of the time.
Let \(X\) be the number of darts in the box he picks, and \(Y\) the number of times the player hits the target.
Find the joint pmf and conditional \(p_{X|Y}\) .
Joint pmf
Note we have a marginal and a conditional:
\[p_X(x) = \frac{6-x}{6}, x = 3, 4, 5\]
\[Y|X=x \sim BIN(x, 0.6)\]
Getting R to work for us:
library (purrrfect)
library (tidyverse)
(parameters (~ x, ~ y,
c (3 ,4 ,5 ), 0 : 5
)
)
# A tibble: 18 × 2
x y
<dbl> <int>
1 3 0
2 3 1
3 3 2
4 3 3
5 3 4
6 3 5
7 4 0
8 4 1
9 4 2
10 4 3
11 4 4
12 4 5
13 5 0
14 5 1
15 5 2
16 5 3
17 5 4
18 5 5
Joint pmf
Note we have a marginal and a conditional:
\[p_X(x) = \frac{6-x}{6}, x = 3, 4, 5\]
\[Y|X=x \sim BIN(x, 0.6)\]
Getting R to work for us:
library (purrrfect)
library (tidyverse)
(parameters (~ x, ~ y,
c (3 ,4 ,5 ), 0 : 5
)
%>% mutate (p_X = (6 - x)/ 6 ,
p_Y_given_X = dbinom (y, x, 0.6 ),
p_XY = p_X* p_Y_given_X
)
)
# A tibble: 18 × 5
x y p_X p_Y_given_X p_XY
<dbl> <int> <dbl> <dbl> <dbl>
1 3 0 0.5 0.064 0.032
2 3 1 0.5 0.288 0.144
3 3 2 0.5 0.432 0.216
4 3 3 0.5 0.216 0.108
5 3 4 0.5 0 0
6 3 5 0.5 0 0
7 4 0 0.333 0.0256 0.00853
8 4 1 0.333 0.154 0.0512
9 4 2 0.333 0.346 0.115
10 4 3 0.333 0.346 0.115
11 4 4 0.333 0.130 0.0432
12 4 5 0.333 0 0
13 5 0 0.167 0.0102 0.00171
14 5 1 0.167 0.0768 0.0128
15 5 2 0.167 0.230 0.0384
16 5 3 0.167 0.346 0.0576
17 5 4 0.167 0.259 0.0432
18 5 5 0.167 0.0778 0.0130
Joint pmf
Note we have a marginal and a conditional:
\[p_X(x) = \frac{6-x}{6}, x = 3, 4, 5\]
\[Y|X=x \sim BIN(x, 0.6)\]
Getting R to work for us:
library (purrrfect)
library (tidyverse)
(parameters (~ x, ~ y,
c (3 ,4 ,5 ), 0 : 5
)
%>% mutate (p_X = (6 - x)/ 6 ,
p_Y_given_X = dbinom (y, x, 0.6 ),
p_XY = p_X* p_Y_given_X
)
%>% pivot_wider (names_from = y,
values_from = p_XY,
names_prefix = 'y=' ,
id_cols = x)
)
# A tibble: 3 × 7
x `y=0` `y=1` `y=2` `y=3` `y=4` `y=5`
<dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
1 3 0.032 0.144 0.216 0.108 0 0
2 4 0.00853 0.0512 0.115 0.115 0.0432 0
3 5 0.00171 0.0128 0.0384 0.0576 0.0432 0.0130
Marginal \(p_Y\) and conditional \(p_{X|Y}\)
\[p_Y(y) = \sum_{x=3}^5 p(x,y)\]
\[p_{X|Y}(x|y) = \frac{p(x,y)}{p_Y(y)}\]
Conditional must sum to 1 for fixed \(y\) :
(parameters (~ x, ~ y,
c (3 ,4 ,5 ), 0 : 5
)
%>% mutate (p_X = (6 - x)/ 6 ,
p_Y_given_X = dbinom (y, x, 0.6 ),
p_XY = p_X* p_Y_given_X
)
%>% mutate (p_Y = sum (p_XY), .by = y)
%>% mutate (p_X_given_Y = p_XY/ p_Y)
)
# A tibble: 18 × 7
x y p_X p_Y_given_X p_XY p_Y p_X_given_Y
<dbl> <int> <dbl> <dbl> <dbl> <dbl> <dbl>
1 3 0 0.5 0.064 0.032 0.0422 0.758
2 3 1 0.5 0.288 0.144 0.208 0.692
3 3 2 0.5 0.432 0.216 0.370 0.584
4 3 3 0.5 0.216 0.108 0.281 0.385
5 3 4 0.5 0 0 0.0864 0
6 3 5 0.5 0 0 0.0130 0
7 4 0 0.333 0.0256 0.00853 0.0422 0.202
8 4 1 0.333 0.154 0.0512 0.208 0.246
9 4 2 0.333 0.346 0.115 0.370 0.312
10 4 3 0.333 0.346 0.115 0.281 0.410
11 4 4 0.333 0.130 0.0432 0.0864 0.500
12 4 5 0.333 0 0 0.0130 0
13 5 0 0.167 0.0102 0.00171 0.0422 0.0404
14 5 1 0.167 0.0768 0.0128 0.208 0.0615
15 5 2 0.167 0.230 0.0384 0.370 0.104
16 5 3 0.167 0.346 0.0576 0.281 0.205
17 5 4 0.167 0.259 0.0432 0.0864 0.500
18 5 5 0.167 0.0778 0.0130 0.0130 1
Marginal \(p_Y\) and conditional \(p_{X|Y}\)
\[p_Y(y) = \sum_{x=3}^5 p(x,y)\]
\[p_{X|Y}(x|y) = \frac{p(x,y)}{p_Y(y)}\]
Conditional must sum to 1 for fixed \(y\) :
(parameters (~ x, ~ y,
c (3 ,4 ,5 ), 0 : 5
)
%>% mutate (p_X = (6 - x)/ 6 ,
p_Y_given_X = dbinom (y, x, 0.6 ),
p_XY = p_X* p_Y_given_X
)
%>% mutate (p_Y = sum (p_XY), .by = y)
%>% mutate (p_X_given_Y = p_XY/ p_Y)
%>% pivot_wider (names_from = x,
values_from = p_X_given_Y,
names_prefix = 'x=' ,
id_cols = y)
)
# A tibble: 6 × 4
y `x=3` `x=4` `x=5`
<int> <dbl> <dbl> <dbl>
1 0 0.758 0.202 0.0404
2 1 0.692 0.246 0.0615
3 2 0.584 0.312 0.104
4 3 0.385 0.410 0.205
5 4 0 0.500 0.500
6 5 0 0 1
Simulating
Simulate bivariate random variables often involves first simulating one from the marginal, then the other from the conditional.
Here, simulate \(X\) , then simulate \(Y|X=x\) .
N <- 10000
bag <- c (3 ,3 ,3 ,4 ,4 ,5 )
one_draw <- \() sample (bag, 1 )
(dart_sim <- replicate_int (N, one_draw (), .as = X)
%>% mutate (Y = map_int (X, \(x) rbinom (1 , size = x, prob = 0.6 )))
) %>% head
# A tibble: 6 × 3
.trial X Y
<dbl> <int> <int>
1 1 3 2
2 2 3 3
3 3 3 2
4 4 5 3
5 5 3 1
6 6 3 2
(dart_sim
%>% summarize (cnt = n (),
.by = c (X,Y))
%>% mutate (joint_prob = cnt/ N,
prob_X_given_Y = cnt/ sum (cnt), .by = Y)
)
# A tibble: 15 × 5
X Y cnt joint_prob prob_X_given_Y
<int> <int> <int> <dbl> <dbl>
1 3 2 2068 0.207 0.576
2 3 3 1092 0.109 0.386
3 5 3 563 0.0563 0.199
4 3 1 1445 0.144 0.685
5 5 2 376 0.0376 0.105
6 4 3 1176 0.118 0.415
7 4 1 537 0.0537 0.255
8 4 2 1145 0.114 0.319
9 5 1 128 0.0128 0.0607
10 4 0 86 0.0086 0.199
11 3 0 332 0.0332 0.769
12 5 5 145 0.0145 1
13 4 4 451 0.0451 0.505
14 5 4 442 0.0442 0.495
15 5 0 14 0.0014 0.0324
Plot of the simulated joint distribution
ggplot (data = dart_sim) +
geom_jitter (aes (x = X, y = Y), width= .1 , height= .3 , size = .4 , alpha = .8 ) +
scale_x_continuous (breaks = 3 : 5 ) +
scale_y_continuous (breaks = 0 : 5 ) +
theme_classic (base_size = 20 )