Introduction

When dealing with jointly distributed random variables, we are sometimes interested in the probability distribution function for the individual random variables.

Marginal Distributions

Let \(X\) and \(Y\) be a pair of jointly distributed random variables. In this context, the probability distribution function of \(X\) is called its marginal distribution function. It can be found from the joint probability distribution by summing over the values of \(Y\).

\[p(x) = \sum_y p(x,y)\]

Similarly, the marginal distribution of \(Y\), which is simply the probability distribution of \(Y\) on its own, can be found from the joint probability distribution by summing over the values of \(X\).

\[p(y) = \sum_x p(x,y)\]

For example, suppose \(X\) and \(Y\) have the following joint probability distribution

\[\begin{array}{c|ccc}
 & x=1 & x=2 & x=3 \\ \hline
y=3 & 0.1 & 0.15 & 0.3 \\
y=4 & 0.2 & 0.05 & 0.2
\end{array}\]

Then the marginal distribution of \(X\) is

\[\begin{array}{c|ccc}
x & 1 & 2 & 3 \\ \hline
p(x) & 0.1+0.2=0.3 & 0.15+0.05=0.2 & 0.3+0.2=0.5
\end{array}\]

And the marginal distribution of \(Y\) is

\[\begin{array}{c|cc}
y & 3 & 4 \\ \hline
p(y) & 0.1+0.15+0.3=0.55 & 0.2+0.05+0.2=0.45
\end{array}\]
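The row and column sums above can also be checked with a short script. This is only a minimal sketch: the dictionary `joint` and the helper `marginal` are names chosen for illustration and are not part of the course material.

```python
# Joint distribution from the example above, keyed by (x, y).
joint = {
    (1, 3): 0.10, (2, 3): 0.15, (3, 3): 0.30,
    (1, 4): 0.20, (2, 4): 0.05, (3, 4): 0.20,
}

def marginal(joint, index):
    """Sum the joint probabilities over the other variable.

    index=0 returns the marginal of X, index=1 the marginal of Y.
    """
    result = {}
    for pair, prob in joint.items():
        value = pair[index]
        result[value] = result.get(value, 0.0) + prob
    return result

p_x = marginal(joint, 0)  # sum over y for each x
p_y = marginal(joint, 1)  # sum over x for each y

for x in sorted(p_x):
    print(f"p(x={x}) = {p_x[x]:.2f}")
for y in sorted(p_y):
    print(f"p(y={y}) = {p_y[y]:.2f}")
```

Because the probabilities are stored as floating-point numbers, the sums are formatted to two decimal places rather than compared exactly.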

Find the Mean

The mean of \(X\) can be found from the joint distribution of \(X\) and \(Y\) by summing \(x\,p(x,y)\) over all pairs of values

\[\mu_x = E(X) = \sum_x xp(x) = \sum_x x \sum_y p(x,y) =\sum_x \sum_y xp(x,y)\]

Continuing with our example above, we can find the mean of \(X\) from the marginal distribution \(p(x)\)

\[\mu_x = E(X) = \sum_x x p(x) = 1p(1) + 2p(2) + 3p(3) = 1(0.3) + 2(0.2) + 3(0.5) = 2.2\]

or from the joint probability distribution \(p(x,y)\)

\[\mu_x = E(X) = \sum_{x=1}^3 \sum_{y=3}^4 xp(x,y) = 1p(1,3) + 1p(1,4) + 2p(2,3) + 2p(2,4) + 3p(3,3) + 3p(3,4)\\=1(0.1)+1(0.2)+2(0.15)+2(0.05)+3(0.3)+3(0.2)=2.2\]
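As a quick check that both routes give the same mean, the two sums can be sketched in code. The variable names here (`joint`, `p_x`, `mean_from_marginal`, `mean_from_joint`) are illustrative assumptions, not notation from the course.

```python
# Joint distribution from the example above, keyed by (x, y).
joint = {
    (1, 3): 0.10, (2, 3): 0.15, (3, 3): 0.30,
    (1, 4): 0.20, (2, 4): 0.05, (3, 4): 0.20,
}

# Route 1: build the marginal p(x) first, then sum x * p(x).
p_x = {}
for (x, _y), prob in joint.items():
    p_x[x] = p_x.get(x, 0.0) + prob
mean_from_marginal = sum(x * prob for x, prob in p_x.items())

# Route 2: sum x * p(x, y) directly over every (x, y) pair.
mean_from_joint = sum(x * prob for (x, _y), prob in joint.items())

print(f"mean from marginal: {mean_from_marginal:.2f}")
print(f"mean from joint:    {mean_from_joint:.2f}")
```

Both routes add up the same terms, only grouped differently, which is why the results agree.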

Find the Variance

The variance of \(X\) can be found from the joint distribution of \(X\) and \(Y\) in the same way

\[\sigma_x^2 = \sum_x (x-\mu_x)^2p(x) = \sum_x (x-\mu_x)^2 \sum_y p(x,y) =\sum_x \sum_y (x-\mu_x)^2p(x,y)\]

Continuing with our example above, we can find the variance of \(X\) from the marginal distribution \(p(x)\)

\[\sigma_x^2 = \sum_x (x-\mu_x)^2 p(x) = (1-2.2)^2p(1) + (2-2.2)^2p(2) + (3-2.2)^2p(3) \\= (1-2.2)^2(0.3) + (2-2.2)^2(0.2) + (3-2.2)^2(0.5) = 0.76\]

or from the joint probability distribution \(p(x,y)\)

\[\sigma_x^2= \sum_{x=1}^3 \sum_{y=3}^4 (x-\mu_x)^2p(x,y) \\= (1-2.2)^2p(1,3) + (1-2.2)^2p(1,4) + (2-2.2)^2p(2,3) + (2-2.2)^2p(2,4) \\+ (3-2.2)^2p(3,3) + (3-2.2)^2p(3,4)\\=(1-2.2)^2(0.1)+(1-2.2)^2(0.2)+(2-2.2)^2(0.15)+(2-2.2)^2(0.05)\\+(3-2.2)^2(0.3)+(3-2.2)^2(0.2)=0.76\]

Find the Standard Deviation

The standard deviation of \(X\) is the square root of its variance, so it can also be found from the joint distribution of \(X\) and \(Y\)

\[\sigma_x = \sqrt{\sigma_x^2}\]

Continuing our example, we find that the standard deviation of \(X\) is \(\sqrt{0.76} \approx 0.87\).
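The variance and standard deviation calculations for this example can be sketched as follows. The names `joint`, `mean_x`, `var_x`, and `std_x` are assumptions made for this illustration, and the sums are taken directly over the joint distribution.

```python
import math

# Joint distribution from the example above, keyed by (x, y).
joint = {
    (1, 3): 0.10, (2, 3): 0.15, (3, 3): 0.30,
    (1, 4): 0.20, (2, 4): 0.05, (3, 4): 0.20,
}

# Mean of X: sum of x * p(x, y) over all pairs.
mean_x = sum(x * prob for (x, _y), prob in joint.items())

# Variance of X: sum of (x - mean)^2 * p(x, y) over all pairs.
var_x = sum((x - mean_x) ** 2 * prob for (x, _y), prob in joint.items())

# Standard deviation of X: square root of the variance.
std_x = math.sqrt(var_x)

print(f"variance = {var_x:.2f}, standard deviation = {std_x:.2f}")
```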

Example: The joint probability distribution for \(X\) and \(Y\) is

\[f(x,y) = \frac{2}{9}xy \text{ where } x=0.5,1 \text{ and } y=1,2\]

  • Verify that this is a valid probability distribution
  • Find the marginal distribution of \(Y\)
  • Find the mean of \(Y\)
  • Find the standard deviation of \(Y\)

  • It is a valid probability distribution since all the probabilities are between \(0\) and \(1\) and the probabilities sum to \(1\). The easiest way to see this is to make a table
\[\begin{array}{c|cc}
 & x=0.5 & x=1 \\ \hline
y=1 & \frac{2}{9} \cdot 0.5 \cdot 1=\frac{1}{9} & \frac{2}{9}\cdot 1 \cdot 1=\frac{2}{9} \\
y=2 & \frac{2}{9} \cdot 0.5 \cdot 2=\frac{2}{9} & \frac{2}{9} \cdot 1 \cdot 2=\frac{4}{9}
\end{array}\]
The four entries sum to \(\frac{1}{9}+\frac{2}{9}+\frac{2}{9}+\frac{4}{9}=1\).
  • The marginal distribution of \(Y\) is found by summing the joint probability distribution over all values of \(X\)

\[p(1) = \sum_{x \in \{0.5,\,1\}} \frac{2}{9} \cdot x \cdot 1=\frac{2}{9} \cdot 0.5 \cdot 1 + \frac{2}{9} \cdot 1 \cdot 1 = \frac{3}{9}\]

and

\[p(2) = \sum_{x \in \{0.5,\,1\}} \frac{2}{9} \cdot x \cdot 2=\frac{2}{9} \cdot 0.5 \cdot 2 + \frac{2}{9} \cdot 1 \cdot 2 = \frac{6}{9}\]

Putting this all together we get the marginal distribution of \(Y\)
\[\begin{array}{c|cc}
y & 1 & 2 \\ \hline
p(y) & \frac{3}{9} & \frac{6}{9}
\end{array}\]
  • The mean of \(Y\) can be found from its marginal distribution

\[\mu_y=\sum_y yp(y) = 1 \cdot \frac{3}{9} + 2 \cdot \frac{6}{9} = \frac{15}{9}\]

or from the joint distribution

\[\mu_y = \sum_{y=1}^{2} \sum_{x \in \{0.5,\,1\}} y p(x,y) = 1 \cdot \sum_{x \in \{0.5,\,1\}} \frac{2}{9} x \cdot 1 + 2 \cdot \sum_{x \in \{0.5,\,1\}} \frac{2}{9} x \cdot 2 \\=1 \cdot \left(\frac{2}{9} \cdot 0.5 \cdot 1 + \frac{2}{9} \cdot 1 \cdot 1\right) + 2 \cdot \left(\frac{2}{9} \cdot 0.5 \cdot 2 + \frac{2}{9} \cdot 1 \cdot 2\right) = \frac{15}{9} \]

To find the standard deviation of \(Y\) we must first find the variance. The easiest way to find this is from the marginal distribution of \(Y\)

\[\sigma_y^2 = \sum_{y=1}^{2} \left(y-\frac{15}{9}\right)^2 p(y) = \left(1 - \frac{15}{9}\right)^2 \cdot \frac{3}{9} + \left(2 - \frac{15}{9}\right)^2 \cdot \frac{6}{9}=\frac{2}{9} \approx 0.2222\]

so the standard deviation of \(Y\) is \(\sqrt{0.2222} \approx 0.47\).
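The worked answers for \(Y\) can be double-checked with exact fractions. The sketch below assumes the joint distribution \(\frac{2}{9}xy\) from the example. The function name `f` mirrors the notation above, while `xs`, `ys`, `joint`, `p_y`, and the other variable names are illustrative choices.

```python
from fractions import Fraction
import math

# Possible values of X and Y in the example.
xs = [Fraction(1, 2), Fraction(1)]
ys = [1, 2]

def f(x, y):
    """Joint probability f(x, y) = (2/9) * x * y."""
    return Fraction(2, 9) * x * y

# Table of joint probabilities, keyed by (x, y).
joint = {(x, y): f(x, y) for x in xs for y in ys}

# Validity check: each probability lies in [0, 1] and they sum to 1.
is_valid = all(0 <= p <= 1 for p in joint.values()) and sum(joint.values()) == 1

# Marginal of Y: sum the joint probabilities over x for each y.
p_y = {y: sum(joint[(x, y)] for x in xs) for y in ys}

# Mean, variance, and standard deviation of Y from its marginal.
mean_y = sum(y * p for y, p in p_y.items())
var_y = sum((y - mean_y) ** 2 * p for y, p in p_y.items())
std_y = math.sqrt(var_y)

print("valid:", is_valid)
print("p(y):", p_y)
print("mean:", mean_y, "variance:", var_y, "std:", round(std_y, 2))
```

`Fraction` reduces results to lowest terms, so \(\frac{3}{9}\), \(\frac{6}{9}\), and \(\frac{15}{9}\) appear as \(\frac{1}{3}\), \(\frac{2}{3}\), and \(\frac{5}{3}\).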

Can you find the marginal distribution of \(X\) along with its mean and standard deviation?

Continuous Case

The continuous case is almost the same as the discrete case, with integrals replacing the sums. For example, if \(f(x,y)\) is the joint probability density function, then the marginal density function of \(Y\) is

\[f(y)=\int f(x,y)\, dx\]

However, this is beyond the scope of our course.