When dealing with jointly distributed random variables, we are sometimes interested in the probability distribution of each individual random variable on its own.
Let \(X\) and \(Y\) be a pair of jointly distributed random variables. In this context, the probability distribution function of the random variable \(X\) is called its marginal distribution function and can be found from the joint probability distribution by summing over the values of \(Y\).
\[p(x) = \sum_y p(x,y)\] Similarly, the marginal distribution of \(Y\) can be found from the joint probability distribution by summing over the values of \(X\). This is just the probability distribution of \(Y\).
\[p(y) = \sum_x p(x,y)\]
Suppose \(X\) and \(Y\) have the following joint probability distribution
| | \(x=1\) | \(x=2\) | \(x=3\) |
|---|---|---|---|
| \(y=3\) | \(0.1\) | \(0.15\) | \(0.3\) |
| \(y=4\) | \(0.2\) | \(0.05\) | \(0.2\) |
Then the marginal distribution of \(X\) is
| \(x\) | \(1\) | \(2\) | \(3\) |
|---|---|---|---|
| \(p(x)\) | \(0.1+0.2=0.3\) | \(0.15+0.05=0.2\) | \(0.3+0.2=0.5\) |
And the marginal distribution of \(Y\) is
| \(y\) | \(3\) | \(4\) |
|---|---|---|
| \(p(y)\) | \(0.1+0.15+0.3=0.55\) | \(0.2+0.05+0.2=0.45\) |
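If you want to verify these sums programmatically, here is a minimal Python sketch; the dictionary `joint` simply encodes the table above, and the variable names are only for illustration.

```python
# The joint probabilities from the example table above, keyed by (x, y).
joint = {
    (1, 3): 0.10, (2, 3): 0.15, (3, 3): 0.30,
    (1, 4): 0.20, (2, 4): 0.05, (3, 4): 0.20,
}

# Marginal of X: for each x, sum p(x, y) over all values of y.
p_x = {}
for (x, y), p in joint.items():
    p_x[x] = p_x.get(x, 0) + p

# Marginal of Y: for each y, sum p(x, y) over all values of x.
p_y = {}
for (x, y), p in joint.items():
    p_y[y] = p_y.get(y, 0) + p

print(p_x)  # {1: 0.3, 2: 0.2, 3: 0.5}, up to floating-point rounding
print(p_y)  # {3: 0.55, 4: 0.45}, up to floating-point rounding
```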
To find the mean of \(X\) from the joint distribution of \(X\) and \(Y\)
\[\mu_x = E(X) = \sum_x xp(x) = \sum_x x \sum_y p(x,y) =\sum_x \sum_y xp(x,y)\]
Continuing with our example above, we can find the mean of \(X\) from the marginal distribution \(p(x)\)
\[\mu_x = E(X) = \sum_x x p(x) = 1p(1) + 2p(2) + 3p(3) = 1(0.3) + 2(0.2) + 3(0.5) = 2.2\]
or from the joint probability distribution \(p(x,y)\)
\[\mu_x = E(X) = \sum_{x=1}^3 \sum_{y=3}^4 xp(x,y) = 1p(1,3) + 1p(1,4) + 2p(2,3) + 2p(2,4) + 3p(3,3) + 3p(3,4)\\=1(0.1)+1(0.2)+2(0.15)+2(0.05)+3(0.3)+3(0.2)=2.2\]
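The same check works in code; this sketch computes \(E(X)\) both from the marginal \(p(x)\) and directly from the joint table (again, the names are purely illustrative).

```python
# E(X) computed two ways, using the joint table from the example above.
joint = {
    (1, 3): 0.10, (2, 3): 0.15, (3, 3): 0.30,
    (1, 4): 0.20, (2, 4): 0.05, (3, 4): 0.20,
}
p_x = {1: 0.3, 2: 0.2, 3: 0.5}  # marginal of X found above

# From the marginal: E(X) = sum over x of x * p(x).
mean_from_marginal = sum(x * p for x, p in p_x.items())

# From the joint: E(X) = sum over x and y of x * p(x, y).
mean_from_joint = sum(x * p for (x, y), p in joint.items())

print(mean_from_marginal, mean_from_joint)  # both 2.2, up to rounding
```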
To find the variance of \(X\) from the joint distribution of \(X\) and \(Y\)
\[\sigma_x^2 = \sum_x (x-\mu_x)^2p(x) = \sum_x (x-\mu_x)^2 \sum_y p(x,y) =\sum_x \sum_y (x-\mu_x)^2p(x,y)\] Continuing with our example above, we can find the variance of \(X\) from the marginal distribution \(p(x)\)
\[\sigma_x^2 = \sum_x (x-\mu_x)^2 p(x) = (1-2.2)^2p(1) + (2-2.2)^2p(2) + (3-2.2)^2p(3) \\= (1-2.2)^2(0.3) + (2-2.2)^2(0.2) + (3-2.2)^2(0.5) = 0.76\]
or from the joint probability distribution \(p(x,y)\)
\[\sigma_x^2= \sum_{x=1}^3 \sum_{y=3}^4 (x-\mu_x)^2p(x,y) \\= (1-2.2)^2p(1,3) + (1-2.2)^2p(1,4) + (2-2.2)^2p(2,3) + (2-2.2)^2p(2,4) \\+ (3-2.2)^2p(3,3) + (3-2.2)^2p(3,4)\\=(1-2.2)^2(0.1)+(1-2.2)^2(0.2)+(2-2.2)^2(0.15)+(2-2.2)^2(0.05)\\+(3-2.2)^2(0.3)+(3-2.2)^2(0.2)=0.76\]
To find the standard deviation of \(X\) from the joint distribution of \(X\) and \(Y\)
\[\sigma_x = \sqrt{\sigma_x^2}\]
Continuing our example, we find that the standard deviation of \(X\) is \(\sqrt{0.76} = 0.87\).
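As a sketch, the variance and standard deviation of \(X\) can be checked the same way, reusing the joint table from above.

```python
import math

# Variance and standard deviation of X from the joint table above.
joint = {
    (1, 3): 0.10, (2, 3): 0.15, (3, 3): 0.30,
    (1, 4): 0.20, (2, 4): 0.05, (3, 4): 0.20,
}
mu_x = sum(x * p for (x, y), p in joint.items())  # 2.2

# Var(X) = sum over x and y of (x - mu_x)^2 * p(x, y).
var_x = sum((x - mu_x) ** 2 * p for (x, y), p in joint.items())
sd_x = math.sqrt(var_x)

print(var_x, sd_x)  # roughly 0.76 and 0.87
```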
Example: The joint probability distribution for \(X\) and \(Y\) is
\[f(x,y) = \frac{2}{9}xy \text{ where } x=0.5,1 \text{ and } y=1,2\]
Evaluating \(f(x,y)\) at each point of the support gives
| | \(x=0.5\) | \(x=1\) |
|---|---|---|
| \(y=1\) | \(\frac{2}{9} \cdot 0.5 \cdot 1=\frac{1}{9}\) | \(\frac{2}{9}\cdot 1 \cdot 1=\frac{2}{9}\) |
| \(y=2\) | \(\frac{2}{9} \cdot 0.5 \cdot 2=\frac{2}{9}\) | \(\frac{2}{9} \cdot 1 \cdot 2=\frac{4}{9}\) |
The marginal distribution of \(Y\) is
| \(y\) | \(1\) | \(2\) |
|---|---|---|
| \(p(y)\) | \(\frac{3}{9}\) | \(\frac{6}{9}\) |
From the marginal distribution, the mean of \(Y\) is
\[\mu_y=\sum_y yp(y) = 1 \cdot \frac{3}{9} + 2 \cdot \frac{6}{9} = \frac{15}{9}\]
or from the joint distribution
\[\mu_y = \sum_{y=1}^{2} \sum_{x=0.5}^1 y p(x,y) = 1 \cdot \sum_{x=0.5}^{1} \frac{2}{9} x \cdot 1 + 2 \cdot \sum_{x=0.5}^{1} \frac{2}{9} x \cdot 2 \\=1 \cdot (\frac{2}{9} \cdot 0.5 \cdot 1 + \frac{2}{9} \cdot 1 \cdot 1) + 2 \cdot (\frac{2}{9} \cdot 0.5 \cdot 2 + \frac{2}{9} \cdot 1 \cdot 2) = \frac{15}{9} \]
To find the standard deviation of \(Y\) we must first find the variance. The easiest way to find this is from the marginal distribution of \(Y\)
\[\sigma_y^2 = \sum_{y=1}^{2} (y-\frac{15}{9})^2 p(y) = (1 - \frac{15}{9})^2 \cdot \frac{3}{9} + (2 - \frac{15}{9})^2 \cdot \frac{6}{9}=0.2222\] so the standard deviation of \(Y\) is \(\sqrt{0.2222} = 0.47\).
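For a programmatic check of this example, here is a short Python sketch that builds the joint probabilities from the formula \(f(x,y)=\frac{2}{9}xy\) and recomputes the marginal, mean, and standard deviation of \(Y\); the function and variable names are just for illustration.

```python
import math

# The joint pmf f(x, y) = (2/9) * x * y on the support x in {0.5, 1}, y in {1, 2}.
def f(x, y):
    return (2 / 9) * x * y

xs = [0.5, 1]
ys = [1, 2]

# Marginal of Y: p(y) = sum over x of f(x, y).
p_y = {y: sum(f(x, y) for x in xs) for y in ys}

mu_y = sum(y * p for y, p in p_y.items())
var_y = sum((y - mu_y) ** 2 * p for y, p in p_y.items())

print(p_y)                     # {1: 0.333..., 2: 0.666...}
print(mu_y, math.sqrt(var_y))  # roughly 1.67 and 0.47
```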
Can you find the marginal distribution of \(X\) along with its mean and standard deviation?
The continuous case is almost the same as the discrete case, with integrals replacing the sums. For example, if \(f(x,y)\) is the joint probability density function, then the marginal density of \(Y\) is \[f(y)=\int f(x,y)\,dx\]
However, this is beyond the scope of our course.
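For the curious, the integral version can still be approximated numerically; the sketch below uses a made-up joint density \(f(x,y)=x+y\) on the unit square together with `scipy.integrate.quad` (both purely illustrative) to recover the marginal density of \(Y\) at one point.

```python
from scipy.integrate import quad

# A made-up joint density: f(x, y) = x + y on the unit square [0, 1] x [0, 1].
def joint_density(x, y):
    return x + y

# Marginal density of Y at a point: f(y) = integral over x of f(x, y) dx.
def marginal_density_y(y):
    value, _ = quad(lambda x: joint_density(x, y), 0, 1)
    return value

print(marginal_density_y(0.25))  # analytically y + 1/2 = 0.75
```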