Applications of statistics are often concerned with the relationships between variables. For example, the percent returns on two different stocks may tend to be related: the returns for both may increase when the market is growing. Or consider a car dealer who has three cars for sale: a red two-door compact, a blue minivan, and a silver full-size sedan. The probability distribution for purchasing these cars would not be the same for women in their 20s, 30s, and 50s. It is therefore important that probability models reflect the joint effect of variables on probabilities.
The joint probability distribution (or joint probability mass function) of two discrete random variables \(X\) and \(Y\) is \[p(x,y) = P(X = x \cap Y = y)\]
As before, the probabilities must be non-negative and sum to 1.
The joint probability distribution can be expressed in the form of a table or by using a formula. For example, \[p(x,y) = \frac{xy}{18}\text{ for } x=1,2,3 \text{ and } y = 1,2\]
can equivalently be written as the table
| | \(x=1\) | \(x=2\) | \(x=3\) |
|---|---|---|---|
| \(y=1\) | \(\frac{1}{18}\) | \(\frac{2}{18}\) | \(\frac{3}{18}\) |
| \(y=2\) | \(\frac{2}{18}\) | \(\frac{4}{18}\) | \(\frac{6}{18}\) |
So, for example, \(P(X = 1 \cap Y = 2) = \frac{2}{18}\), and we can check the required properties:
\[ p(x,y) \ge 0\] and \[ \sum_{x=1}^{3} \sum_{y=1}^{2} \frac{xy}{18} = \frac{1}{18} + \frac{2}{18} + \frac{3}{18} + \frac{2}{18} + \frac{4}{18} + \frac{6}{18}=1\]
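These two checks are easy to carry out programmatically. Below is a minimal Python sketch (the function name `joint_pmf` is just an illustrative choice) that enumerates \(p(x,y) = \frac{xy}{18}\) over its support and confirms that the probabilities are non-negative and sum to 1.

```python
from fractions import Fraction

def joint_pmf(x, y):
    """Joint pmf p(x, y) = xy/18 for x in {1, 2, 3} and y in {1, 2}."""
    return Fraction(x * y, 18)

support = [(x, y) for x in (1, 2, 3) for y in (1, 2)]

# Check the two requirements of a joint probability distribution.
assert all(joint_pmf(x, y) >= 0 for (x, y) in support)   # non-negative
assert sum(joint_pmf(x, y) for (x, y) in support) == 1   # sums to 1

print(joint_pmf(1, 2))  # P(X = 1 and Y = 2) = 1/9, i.e. 2/18
```

Using exact fractions rather than floating-point values means the sum-to-one check holds exactly rather than only up to rounding error.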
Example 1:
A fair coin is tossed three times. Let \(X\) denote the number of heads on the first toss and \(Y\) denote the total number of heads. Find the joint probability distribution of \(X\) and \(Y\) and check that it is a valid probability distribution.
Start by listing every outcome along with its probability and the values of each random variable:
| outcome | \(x\) | \(y\) | probability |
|---|---|---|---|
| \(HHH\) | \(1\) | \(3\) | \(0.125\) |
| \(HHT\) | \(1\) | \(2\) | \(0.125\) |
| \(HTT\) | \(1\) | \(1\) | \(0.125\) |
| \(HTH\) | \(1\) | \(2\) | \(0.125\) |
| \(TTT\) | \(0\) | \(0\) | \(0.125\) |
| \(TTH\) | \(0\) | \(1\) | \(0.125\) |
| \(THT\) | \(0\) | \(1\) | \(0.125\) |
| \(THH\) | \(0\) | \(2\) | \(0.125\) |
We can see that \(X\) can take on the values \(0\) or \(1\) and \(Y\) can take on the values \(0\), \(1\), \(2\) or \(3\).
Furthermore, we can see, for example, that \(P(X=1 \cap Y=2) = P(HHT \cup HTH) = 0.125+0.125=0.25\), and \(P(X=1 \cap Y=0) = 0\) since we cannot have no heads \((Y=0)\) if there is a head on the first toss \((X=1)\).
The joint probability distribution of \(X\) and \(Y\) is given below:
| | \(y=0\) | \(y=1\) | \(y=2\) | \(y=3\) |
|---|---|---|---|---|
| \(x=0\) | \(0.125\) | \(0.25\) | \(0.125\) | \(0\) |
| \(x=1\) | \(0\) | \(0.125\) | \(0.25\) | \(0.125\) |
This is a valid probability distribution since the probabilities are all between \(0\) and \(1\) and the probabilities sum to \(1\).
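The same table can be reproduced by brute-force enumeration of the eight equally likely outcomes. The following Python sketch (names such as `outcomes` and `joint` are illustrative) builds the joint distribution of \(X\) and \(Y\) directly.

```python
from itertools import product
from collections import defaultdict

# All 2^3 = 8 equally likely outcomes of three fair coin tosses.
outcomes = list(product("HT", repeat=3))

joint = defaultdict(float)
for outcome in outcomes:
    x = 1 if outcome[0] == "H" else 0   # heads on the first toss?
    y = outcome.count("H")              # total number of heads
    joint[(x, y)] += 1 / len(outcomes)  # each outcome has probability 0.125

for x in (0, 1):
    print(f"x={x}:", [joint[(x, y)] for y in range(4)])
# x=0: [0.125, 0.25, 0.125, 0.0]
# x=1: [0.0, 0.125, 0.25, 0.125]

assert abs(sum(joint.values()) - 1) < 1e-12  # probabilities sum to 1
```

Each printed row matches the corresponding row of the table above.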
Most of the concepts and formulas from above carry over to the continuous case, with integrals replacing the sums. A principal difference is in the definition of the joint distribution, since for continuous random variables \(P(X = x \cap Y = y)\) is zero at every point. For this reason we define the joint cumulative distribution function of two continuous random variables \(X\) and \(Y\) to be \[F(x,y) = P(X \le x \cap Y \le y)\]
Most of this is beyond the scope of our course and is included here for completeness.
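As a brief illustration, here is a minimal numerical sketch of the continuous definition, assuming (as an illustrative choice, not anything from the example above) that \(X\) and \(Y\) are independent and uniform on \([0,1]\), so the joint density is \(f(x,y)=1\) on the unit square. It approximates \(F(x,y)\) with a double integral using SciPy.

```python
from scipy.integrate import dblquad

def joint_density(y, x):
    """Joint density f(x, y) = 1 on the unit square (independent uniforms)."""
    return 1.0

def joint_cdf(x, y):
    # F(x, y) integrates the density over (-inf, x] x (-inf, y]; here the
    # density is zero outside [0, 1] x [0, 1], so integrating over
    # [0, x] x [0, y] suffices for 0 <= x, y <= 1.
    value, _err = dblquad(joint_density, 0, x, 0, y)
    return value

print(joint_cdf(0.5, 0.5))  # 0.25, since F(x, y) = x * y for this density
```

For this simple density the integral can of course be done by hand; the numerical version is only meant to make the definition of \(F(x,y)\) concrete.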