Harold Nelson
4/4/2024
An event is a set of possible outcomes of an experiment. The probability of an event is what we expect its relative frequency to approach as we run the experiment a large number of times. These values can arise from several sources.
Probabilities are always numbers in the range \(0\) to \(1\).
If the elementary outcomes of an experiment are discrete and all equally likely, the theoretical probability of an event is defined as
\[\frac{\text{Number of elementary outcomes in the event}}{\text{Total number of elementary outcomes}}\]
As an example, consider the probability of getting a head when you flip a fair coin. There are two elementary outcomes of this experiment, a head, or a tail. Since the coin is fair, the outcomes are equally likely. There are two possible outcomes and only one is in our event. So, the probability is \(1/2\). In everyday language, you may hear “fifty-fifty.” To stick with the language of probability we say that the probability is \(.5\).
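The relative-frequency idea behind this value can be checked by simulation. The following Python sketch is not part of the original notes. It assumes a fair coin modeled by `rng.random() < 0.5`; the function name `relative_frequency_of_heads` and the seed are illustrative choices.

```python
import random

def relative_frequency_of_heads(n_flips, seed=2024):
    """Simulate n_flips of a fair coin and return the fraction that were heads."""
    rng = random.Random(seed)  # fixed seed so that reruns give the same flips
    heads = sum(1 for _ in range(n_flips) if rng.random() < 0.5)
    return heads / n_flips

# The relative frequency should settle near the theoretical value 0.5
# as the number of flips grows.
for n in (10, 1_000, 100_000):
    print(n, relative_frequency_of_heads(n))
```

With only 10 flips the fraction can be far from .5; with 100,000 flips it should be very close.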
Using equally likely elementary outcomes with discrete experiments is clearly advantageous. Consider the probability of getting a sum of seven dots when rolling a pair of dice. Try to figure this out.
If we think of the elementary outcomes of this experiment as the sums from two to twelve, the outcomes are not equally likely. There is only one way to get a sum of two and only one way to get a sum of twelve, but there are several ways to get a sum of seven. There are 36 equally likely elementary outcomes when we consider the ordered pairs describing what happened to die 1 and what happened to die 2. The pairs that give a sum of seven are \((2,5),(5,2),(3,4),(4,3),(6,1) \text{ and } (1,6)\). Since there are six of them, the probability of getting a seven is \(6/36\), or \(1/6\).
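The counting argument for the dice can be reproduced by listing the ordered pairs. This is a minimal Python sketch, assuming two fair six-sided dice; the variable names are illustrative.

```python
from fractions import Fraction
from itertools import product

# All 36 equally likely ordered pairs (die 1, die 2).
outcomes = list(product(range(1, 7), repeat=2))

# The event "sum of seven" is the subset of pairs whose faces add to 7.
sevens = [pair for pair in outcomes if sum(pair) == 7]

p_seven = Fraction(len(sevens), len(outcomes))
print(len(outcomes))  # 36
print(sevens)
print(p_seven)        # 1/6
```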
This is really just noting that you can describe some empirical facts using the language of probability. An example or two should help.
Based on the Current Population Survey, about 60% of the US population over the age of 16 meets the definition of participating in the labor force. How can we think of this as an experiment?
We can say that if a member of the population over the age of 16 were picked at random, the probability of that person being a labor force participant is about .6.
Based on data from the CDC, there were 2,515,458 deaths in the US in 2010. Of those the primary cause of death was heart disease in 596,577 cases. How can we think of this as an experiment?
We can say that if we were to pick a 2010 death certificate at random the probability of the death having a primary cause of heart disease is .237.
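The arithmetic behind this figure is a relative frequency. A short Python sketch, using only the two counts quoted above (the variable names are illustrative):

```python
# Counts as quoted in the text.
total_deaths = 2_515_458
heart_disease_deaths = 596_577

# Empirical probability = relative frequency in the observed data.
p_heart_disease = heart_disease_deaths / total_deaths
print(round(p_heart_disease, 3))  # 0.237
```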
This is really just using the language of probability to express a personal degree of belief. Here are some examples based on my own personal experiences.
I believe that the probability of having at least one class cancelled during a winter term because of snow is about .5.
I believe that the probability of my laptop having a hardware failure during the next 6 months is less than .05.
Recall that events are subsets of the sample space. We can use the theory of operations on sets to write rules for calculating probabilities.
\[P(A\text{ or }B)=P(A)+P(B)-P(A\text{ and }B)\]
## Example
Draw a single card from a 52 card deck. What is the probability of a heart or an ace?
\[P(\text{Heart or Ace})=P(\text{Heart})+P(\text{Ace})-P(\text{Heart and Ace})\]
\[P(\text{Heart or Ace})=\frac{13}{52}+\frac{4}{52}-\frac{1}{52}=\frac{16}{52}=\frac{4}{13}\]
The subtraction of the third term is necessary because the first two terms would have counted the ace of hearts twice.
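The addition rule can be checked against a direct count over the deck. The following Python sketch assumes a standard 52 card deck represented as (rank, suit) pairs; the helper `prob` and the predicates `is_heart` and `is_ace` are hypothetical names introduced for illustration.

```python
from fractions import Fraction
from itertools import product

ranks = ["A", "2", "3", "4", "5", "6", "7", "8", "9", "10", "J", "Q", "K"]
suits = ["hearts", "diamonds", "clubs", "spades"]
deck = list(product(ranks, suits))  # 52 equally likely (rank, suit) cards

def prob(event):
    """Fraction of cards in the deck for which event(card) is true."""
    return Fraction(sum(1 for card in deck if event(card)), len(deck))

def is_heart(card):
    return card[1] == "hearts"

def is_ace(card):
    return card[0] == "A"

# Left side: count the cards in "heart or ace" directly.
p_direct = prob(lambda card: is_heart(card) or is_ace(card))

# Right side: the addition rule, subtracting the overlap (the ace of hearts).
p_overlap = prob(lambda card: is_heart(card) and is_ace(card))
p_rule = prob(is_heart) + prob(is_ace) - p_overlap

print(p_direct, p_rule)  # 4/13 4/13
```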
If the two events are mutually exclusive (disjoint), the third term may be omitted since its value is zero.
\[P(A\text{ or }B)=P(A)+P(B)\]
## Example
Draw a single card from a 52 card deck. What is the probability of a heart or a spade?
\[P(\text{Heart or Spade})=P(\text{Heart})+P(\text{Spade})\]
\[P(\text{Heart or Spade})=\frac{13}{52}+\frac{13}{52}=\frac{26}{52}=\frac{1}{2}\]
The simpler version is applicable because no card can be both a heart and a spade. We say that these two events are disjoint or mutually exclusive.
If A is an event, it either occurs or does not occur each time the experiment is run. If we think of events as subsets of a sample space, then every outcome is in the subset A or not in the subset A. The set of all outcomes not in the event A is called the complement of A, sometimes denoted \(A^{c}\). No outcome is both in A and in \(A^{c}\) (mutually exclusive). Also, every outcome is in one of these (exhaustive). In set theoretic terms \(A \cup A^{c} = \text{Sample Space}\).
The practical consequence is that \[P(\text{A}) + P(\text{not A}) = 1.\]
What is the probability of not getting a heart when you draw a single card from a 52 card deck?
\[P(\text{not Heart}) = 1 - P(\text{Heart}) = 1 -\frac{13}{52} = \frac{3}{4}\]
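As a quick check, the complement rule should agree with counting the non-hearts directly. A small Python sketch, assuming 13 hearts in a 52 card deck (variable names are illustrative):

```python
from fractions import Fraction

deck_size = 52
hearts = 13

p_heart = Fraction(hearts, deck_size)

# Complement rule: P(not A) = 1 - P(A).
p_not_heart_by_rule = 1 - p_heart

# Direct count of the 39 cards that are not hearts.
p_not_heart_by_count = Fraction(deck_size - hearts, deck_size)

print(p_not_heart_by_rule, p_not_heart_by_count)  # 3/4 3/4
```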
\[P(\text{A and B}) = P(\text{A})*P(\text{B given that A has occurred}).\]
The phrase “given that A has occurred” is generally replaced with a single vertical bar, ‘|’, between the two event names. The law is then written as
\[P(\text{A and B}) = P(\text{A})*P(\text{B|A}).\]
What is the probability of drawing two hearts if you draw two cards from a 52 card deck without replacement? There are two events involved, getting a heart on the first draw (H1) and getting a heart on the second draw (H2).
\[P(\text{H1 and H2}) = P(\text{H1})*P(\text{H2|H1})\]
\[P(\text{H1 and H2}) = \frac{13}{52}*\frac{12}{51}=\frac{3}{51}=\frac{1}{17}\]
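The same answer can be obtained by listing every ordered two-card draw. This Python sketch is an illustration under the assumption that only the suit matters, so the deck is modeled as 13 "heart" cards and 39 "other" cards.

```python
from fractions import Fraction
from itertools import permutations

# Only the suit matters here: 13 hearts and 39 other cards.
deck = ["heart"] * 13 + ["other"] * 39

# permutations works by position, so this yields all 52 * 51 ordered draws
# of two different cards (drawing without replacement).
draws = list(permutations(deck, 2))

both_hearts = sum(1 for first, second in draws
                  if first == "heart" and second == "heart")
p_enumerated = Fraction(both_hearts, len(draws))

# Multiplication law: P(H1) * P(H2 | H1).
p_rule = Fraction(13, 52) * Fraction(12, 51)

print(p_enumerated, p_rule)  # 1/17 1/17
```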
\[P(\text{A and B}) = P(\text{A})*P(\text{B}).\] If the probability that B will occur does not depend on whether A has occurred or not, we say that the two events are independent. In this case, we use the simple probability rather than the conditional probability.
What is the probability of drawing two hearts if you draw two cards from a 52 card deck with replacement?
There are two events involved, getting a heart on the first draw (H1) and getting a heart on the second draw (H2). However, when we include the stipulation “with replacement,” the condition of the deck on the second draw is the same as it was on the first draw.
\[P(\text{H1 and H2}) = \frac{13}{52}*\frac{13}{52}=\frac{1}{4} *\frac{1}{4}=\frac{1}{16}\]
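Independence can also be seen by enumeration: with replacement, the same card may appear on both draws, and the joint probability factors into the product of the two simple probabilities. A Python sketch, again modeling the deck as 13 "heart" cards and 39 "other" cards (an assumption made for brevity):

```python
from fractions import Fraction
from itertools import product

deck = ["heart"] * 13 + ["other"] * 39

# With replacement, every ordered pair of positions is possible: 52 * 52 draws.
draws = list(product(deck, repeat=2))

p_h1 = Fraction(sum(1 for first, _ in draws if first == "heart"), len(draws))
p_h2 = Fraction(sum(1 for _, second in draws if second == "heart"), len(draws))
p_both = Fraction(sum(1 for first, second in draws
                      if first == "heart" and second == "heart"), len(draws))

print(p_both)                 # 1/16
print(p_both == p_h1 * p_h2)  # True
```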
In the description of the multiplication law above, we have assumed that A occurs first and may (or may not) influence the probability that a second event, B, occurs. We imagine that an observer, having observed A, obtains an improved estimate of the probability that B will occur. In some circumstances, it makes sense to reverse our thinking and become more like a detective. If B has been observed, what is the probability that A occurred?
Write two equivalent expressions for \(P(\text{A and B}).\) \[P(\text{A and B}) = P(\text{A})*P(\text{B|A}) = P(\text{B}) *P(\text{A|B}).\]
Now solve the equation relating the second and third expressions for \(P(\text{A|B})\).
\[P(\text{A})*P(\text{B|A}) = P(\text{B}) *P(\text{A|B})\]
\[P(\text{A|B}) = \frac{P(\text{A})*P(\text{B|A})}{P(\text{B})}\]
In many cases, we don’t know the probability of B directly. However, we can construct it. If B occurred, either A occurred and then B occurred, or A did not occur and then B occurred anyway. These two possibilities are mutually exclusive and exhaustive. Therefore we can write
\[P(\text{B}) = P(\text{A}) * P(\text{B|A})+P(\text{ not A}) *P(\text{B| not A})\]
## Example
Return to the problem of drawing two hearts from a 52 card deck without replacement. Suppose the second card was a heart. What is the probability that the first card was also a heart?
\[P(\text{H1|H2}) = \frac{P(\text{H1})*P(\text{H2|H1})}{P(\text{H2})}\]
Now rewrite the denominator
\[P(\text{H1|H2}) = \frac{P(\text{H1})*P(\text{H2|H1})}{P(\text{H1})*P(\text{H2|H1})+P(\text{not H1})*P(\text{H2 | not H1})}\]
Now let’s insert the numerical values.
\[P(\text{H1|H2}) = \frac{\frac{13}{52}*\frac{12}{51}}{\frac{13}{52}*\frac{12}{51}+\frac{39}{52}*\frac{13}{51}} = \frac{156}{663} = \frac{4}{17}\]
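The reversed conditional probability can be evaluated with exact fractions. This is a minimal Python sketch of the two formulas above, the construction of \(P(\text{H2})\) and the solved equation for \(P(\text{H1|H2})\); the variable names are illustrative.

```python
from fractions import Fraction

p_h1 = Fraction(13, 52)               # heart on the first draw
p_h2_given_h1 = Fraction(12, 51)      # heart on the second, given a heart first
p_h2_given_not_h1 = Fraction(13, 51)  # heart on the second, given no heart first

# Construct P(H2) from the two mutually exclusive, exhaustive cases.
p_h2 = p_h1 * p_h2_given_h1 + (1 - p_h1) * p_h2_given_not_h1

# Reverse the conditioning.
p_h1_given_h2 = p_h1 * p_h2_given_h1 / p_h2

print(p_h2)           # 1/4
print(p_h1_given_h2)  # 4/17
```

Note that \(P(\text{H1|H2})\) comes out equal to \(P(\text{H2|H1}) = 12/51\), which reflects the symmetry between the two draws.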
If a collection of items can be classified in two ways, a crosstabulation is the standard means of showing how the two categorizations are related.
One example is based on the following experiment. Contact a member of the US population and ask two questions.
The following table summarizes the results.
##
##
## Cell Contents
## |-------------------------|
## | N |
## | Chi-square contribution |
## | N / Row Total |
## | N / Col Total |
## | N / Table Total |
## |-------------------------|
##
##
## Total Observations in Table: 2849
##
##
## | gss_sm$bigregion
## gss_sm$religion | Northeast | Midwest | South | West | Row Total |
## ----------------|-----------|-----------|-----------|-----------|-----------|
## Protestant | 158 | 325 | 650 | 238 | 1371 |
## | 24.877 | 0.149 | 44.346 | 14.194 | |
## | 0.115 | 0.237 | 0.474 | 0.174 | 0.481 |
## | 0.324 | 0.471 | 0.624 | 0.377 | |
## | 0.055 | 0.114 | 0.228 | 0.084 | |
## ----------------|-----------|-----------|-----------|-----------|-----------|
## Catholic | 162 | 172 | 160 | 155 | 649 |
## | 23.502 | 1.397 | 25.093 | 0.882 | |
## | 0.250 | 0.265 | 0.247 | 0.239 | 0.228 |
## | 0.333 | 0.249 | 0.154 | 0.246 | |
## | 0.057 | 0.060 | 0.056 | 0.054 | |
## ----------------|-----------|-----------|-----------|-----------|-----------|
## Jewish | 27 | 3 | 11 | 10 | 51 |
## | 38.340 | 7.080 | 3.128 | 0.149 | |
## | 0.529 | 0.059 | 0.216 | 0.196 | 0.018 |
## | 0.055 | 0.004 | 0.011 | 0.016 | |
## | 0.009 | 0.001 | 0.004 | 0.004 | |
## ----------------|-----------|-----------|-----------|-----------|-----------|
## None | 112 | 157 | 170 | 180 | 619 |
## | 0.362 | 0.335 | 13.953 | 13.426 | |
## | 0.181 | 0.254 | 0.275 | 0.291 | 0.217 |
## | 0.230 | 0.228 | 0.163 | 0.285 | |
## | 0.039 | 0.055 | 0.060 | 0.063 | |
## ----------------|-----------|-----------|-----------|-----------|-----------|
## Other | 28 | 33 | 50 | 48 | 159 |
## | 0.025 | 0.788 | 1.129 | 4.641 | |
## | 0.176 | 0.208 | 0.314 | 0.302 | 0.056 |
## | 0.057 | 0.048 | 0.048 | 0.076 | |
## | 0.010 | 0.012 | 0.018 | 0.017 | |
## ----------------|-----------|-----------|-----------|-----------|-----------|
## Column Total | 487 | 690 | 1041 | 631 | 2849 |
## | 0.171 | 0.242 | 0.365 | 0.221 | |
## ----------------|-----------|-----------|-----------|-----------|-----------|
##
##
What is the probability that a randomly selected person is Catholic and lives in the South?
.056
If a randomly selected person lives in the South, what is the probability that the person is Catholic?
.154
If a randomly selected person is Catholic, what is the probability that the person lives in the South?
.247
Which of these is more common?
The person with no religion living in the West: .063 against .056.
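The three kinds of proportion in each cell of the table (N / Table Total, N / Col Total, N / Row Total) correspond to joint and conditional probabilities. The following Python sketch recomputes them from the cell counts copied from the table; the function names `joint`, `religion_given_region`, and `region_given_religion` are hypothetical and chosen only for illustration.

```python
# Cell counts copied from the table: rows are religion, columns are region.
regions = ["Northeast", "Midwest", "South", "West"]
counts = {
    "Protestant": [158, 325, 650, 238],
    "Catholic":   [162, 172, 160, 155],
    "Jewish":     [27, 3, 11, 10],
    "None":       [112, 157, 170, 180],
    "Other":      [28, 33, 50, 48],
}
total = sum(sum(row) for row in counts.values())  # 2849

def joint(religion, region):
    """P(religion and region): N / Table Total."""
    return counts[religion][regions.index(region)] / total

def religion_given_region(religion, region):
    """P(religion | region): N / Col Total."""
    col = regions.index(region)
    col_total = sum(row[col] for row in counts.values())
    return counts[religion][col] / col_total

def region_given_religion(region, religion):
    """P(region | religion): N / Row Total."""
    return counts[religion][regions.index(region)] / sum(counts[religion])

print(round(joint("Catholic", "South"), 3))                  # 0.056
print(round(religion_given_region("Catholic", "South"), 3))  # 0.154
print(round(region_given_religion("South", "Catholic"), 3))  # 0.247
print(round(joint("None", "West"), 3))                       # 0.063
```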