Probability Lecture

Harold Nelson

4/4/2024

Basic Concepts of Probability

An event is a set of possible outcomes of an experiment. The probability of an event is what we expect its relative frequency to approach as we run the experiment a large number of times. Probability values can arise from several sources: theory, empirical data, and subjective belief.

Probabilities are always numbers in the range \(0\) to \(1\).
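The relative-frequency idea can be illustrated with a small simulation. The following is a minimal sketch in Python (chosen here for illustration; it is not part of the lecture), assuming a fair coin modeled by a pseudo-random number generator. The function name `relative_frequency_of_heads` and the seed value are hypothetical choices.

```python
import random

def relative_frequency_of_heads(n_flips, seed=2024):
    """Simulate n_flips fair coin flips and return the fraction that are heads."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    heads = sum(1 for _ in range(n_flips) if rng.random() < 0.5)
    return heads / n_flips

# The relative frequency should settle near 0.5 as the number of flips grows.
for n in (10, 1_000, 100_000):
    print(n, relative_frequency_of_heads(n))
```

Every value printed is a relative frequency, so it necessarily lies between 0 and 1.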

Theory

If the elementary outcomes of an experiment are discrete and all equally likely, the theoretical probability of an event is defined as

\[\frac{\text{Number of elementary outcomes in the event}}{\text{Total number of elementary outcomes}}\]

As an example, consider the probability of getting a head when you flip a fair coin. There are two elementary outcomes of this experiment, a head or a tail. Since the coin is fair, the outcomes are equally likely. There are two possible outcomes and only one is in our event, so the probability is \(1/2\). In everyday language, you may hear “fifty-fifty.” To stick with the language of probability, we say that the probability is \(0.5\).

Dice

Using equally likely elementary outcomes with discrete experiments is clearly advantageous. Consider the probability of getting a sum of seven dots when rolling a pair of dice. Try to figure this out.

Answer

If we think of the elementary outcomes of this experiment as the sums from two to twelve, the outcomes are not equally likely. There is only one way to get a sum of two and only one way to get a sum of twelve, but there are many ways to get a sum of seven. There are 36 equally likely elementary outcomes when we consider the ordered pairs describing what happened to die 1 and what happened to die 2. The pairs that sum to seven are \((2,5),(5,2),(3,4),(4,3),(6,1) \text{ and } (1,6)\). Since there are six of them, the probability of getting a seven is \(6/36\), or \(1/6\).
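The counting argument can be checked by listing every ordered pair. This is a minimal Python sketch; the variable names are illustrative only.

```python
from fractions import Fraction
from itertools import product

# All ordered pairs (die 1, die 2); each pair is equally likely for fair dice.
outcomes = list(product(range(1, 7), repeat=2))

# The event "sum is seven" is the subset of pairs whose faces add to 7.
sevens = [pair for pair in outcomes if sum(pair) == 7]

p_seven = Fraction(len(sevens), len(outcomes))
print(len(outcomes), sevens, p_seven)
```

Using `Fraction` keeps the result exact, so it can be compared directly with \(1/6\).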

Empirical

This is really just noting that you can describe some empirical facts using the language of probability. An example or two should help.

Example

Based on the Current Population Survey, about 60% of the US population over the age of 16 meets the definition of participating in the labor force. How can we think of this as an experiment?

Answer

We can say that if a member of the population over the age of 16 were picked at random, the probability of that person being a labor force participant is about .6.

Example

Based on data from the CDC, there were 2,515,458 deaths in the US in 2010. Of those the primary cause of death was heart disease in 596,577 cases. How can we think of this as an experiment?

Answer

We can say that if we were to pick a 2010 death certificate at random the probability of the death having a primary cause of heart disease is .237.
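The value follows from dividing the two counts quoted in the example. A minimal Python sketch of that arithmetic, with illustrative variable names:

```python
total_deaths = 2_515_458        # all US deaths in 2010, as quoted in the example
heart_disease_deaths = 596_577  # deaths with heart disease as the primary cause

# Empirical probability = relative frequency in the observed data.
p_heart_disease = heart_disease_deaths / total_deaths
print(round(p_heart_disease, 3))
```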

Subjective Probability

This is really just using the language of probability to express a personal degree of belief. Here are some examples based on my own personal experiences.

Review of Basic Principles

Recall that events are subsets of the sample space. We can use the theory of operations on sets to write rules for calculating probabilities.

Addition Law - General Case

\[P(A\text{ or }B)=P(A)+P(B)-P(A\text{ and }B)\]

Example

Draw a single card from a 52 card deck. What is the probability of a heart or an ace?

\[P(\text{Heart or Ace})=P(\text{Heart})+P(\text{Ace})-P(\text{Heart and Ace})\]

\[P(\text{Heart or Ace})=\frac{13}{52}+\frac{4}{52}-\frac{1}{52}=\frac{16}{52}=\frac{4}{13}\]

The subtraction of the third term is necessary because the first two terms would have counted the ace of hearts twice.
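The double counting can be seen by comparing a direct count with the formula. The sketch below is a minimal Python illustration; the deck representation and the helper names `prob`, `is_heart`, and `is_ace` are assumptions made for this example.

```python
from fractions import Fraction
from itertools import product

ranks = ["A", "2", "3", "4", "5", "6", "7", "8", "9", "10", "J", "Q", "K"]
suits = ["hearts", "diamonds", "clubs", "spades"]
deck = list(product(ranks, suits))  # 52 (rank, suit) pairs

def prob(event):
    """Probability that one randomly drawn card satisfies the predicate `event`."""
    return Fraction(sum(1 for card in deck if event(card)), len(deck))

def is_heart(card):
    return card[1] == "hearts"

def is_ace(card):
    return card[0] == "A"

# Direct count of cards that are a heart or an ace (ace of hearts counted once).
p_direct = prob(lambda c: is_heart(c) or is_ace(c))

# General addition law: subtract the overlap that the first two terms count twice.
p_formula = prob(is_heart) + prob(is_ace) - prob(lambda c: is_heart(c) and is_ace(c))

print(p_direct, p_formula)
```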

Addition Law - Special Case

If the two events are mutually exclusive (disjoint), the third term may be omitted since its value is zero.

\[P(A\text{ or }B)=P(A)+P(B)\]

Example

Draw a single card from a 52 card deck. What is the probability of a heart or a spade?

\[P(\text{Heart or Spade})=P(\text{Heart})+P(\text{Spade})\]

\[P(\text{Heart or Spade})=\frac{13}{52}+\frac{13}{52}=\frac{26}{52}=\frac{1}{2}\]

The simpler version is applicable because no card can be both a heart and a spade. We say that these two events are disjoint or mutually exclusive.

The Complements Law

If A is an event, it either occurs or does not occur each time the experiment is run. If we think of events as subsets of a sample space, then every outcome is in the subset A or not in the subset A. The set of all outcomes not in the event A is called the complement of A, sometimes denoted \(A^{c}\). No outcome is both in A and in \(A^{c}\) (mutually exclusive). Also, every outcome is in one of these (exhaustive). In set theoretic terms \(A \cup A^{c} = \text{Sample Space}\).

The practical consequence is that \[P(\text{A}) + P(\text{not A}) = 1.\]

Example

What is the probability of not getting a heart when you draw a single card from a 52 card deck?

\[P(\text{not Heart}) = 1 - P(\text{Heart}) = 1 -\frac{13}{52} = \frac{3}{4}\]

Multiplication Law - General Case

\[P(\text{A and B}) = P(\text{A})*P(\text{B given that A has occurred}).\]

The phrase “given that” is generally replaced with a single vertical bar, ‘|’, between the two event names. The law is then written as

\[P(\text{A and B}) = P(\text{A})*P(\text{B|A}).\]

Example

What is the probability of drawing two hearts if you draw two cards from a 52 card deck without replacement? There are two events involved, getting a heart on the first draw (H1) and getting a heart on the second draw (H2).

\[P(\text{H1 and H2}) = P(\text{H1})*P(\text{H2|H1})\]

\[P(\text{H1 and H2}) = \frac{13}{52}*\frac{12}{51}=\frac{3}{51}=\frac{1}{17}\]
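As a check on the multiplication law, every ordered two-card draw without replacement can be enumerated and counted. This is a minimal Python sketch; the deck representation and variable names are assumptions made for illustration.

```python
from fractions import Fraction
from itertools import permutations, product

ranks = ["A", "2", "3", "4", "5", "6", "7", "8", "9", "10", "J", "Q", "K"]
suits = ["hearts", "diamonds", "clubs", "spades"]
deck = list(product(ranks, suits))

# Ordered pairs of distinct cards: drawing two cards without replacement.
draws = list(permutations(deck, 2))

both_hearts = sum(
    1 for first, second in draws
    if first[1] == "hearts" and second[1] == "hearts"
)
p_enumerated = Fraction(both_hearts, len(draws))

# The multiplication law: P(H1) * P(H2 | H1).
p_formula = Fraction(13, 52) * Fraction(12, 51)

print(p_enumerated, p_formula)
```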

Multiplication Law - Special Case

\[P(\text{A and B}) = P(\text{A})*P(\text{B}).\]

If the probability that B will occur does not depend on whether A has occurred or not, we say that the two events are independent. In this case, we use the simple probability rather than the conditional probability.

Example

What is the probability of drawing two hearts if you draw two cards from a 52 card deck with replacement?

Answer

There are two events involved, getting a heart on the first draw (H1) and getting a heart on the second draw (H2). However, when we include the stipulation “with replacement,” the condition of the deck on the second draw is the same as it was on the first draw.

\[P(\text{H1 and H2}) = \frac{13}{52}*\frac{13}{52}=\frac{1}{4} *\frac{1}{4}=\frac{1}{16}\]
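Independence under replacement can also be verified by enumeration. In this minimal Python sketch (deck representation and names are illustrative assumptions), the same card may appear in both positions, which models putting the first card back before the second draw.

```python
from fractions import Fraction
from itertools import product

ranks = ["A", "2", "3", "4", "5", "6", "7", "8", "9", "10", "J", "Q", "K"]
suits = ["hearts", "diamonds", "clubs", "spades"]
deck = list(product(ranks, suits))

# Ordered pairs where the second card is drawn from the full deck again.
draws = list(product(deck, repeat=2))

def prob(event):
    """Probability of a predicate on (first, second) draws."""
    return Fraction(sum(1 for d in draws if event(d)), len(draws))

p_h1 = prob(lambda d: d[0][1] == "hearts")
p_h2 = prob(lambda d: d[1][1] == "hearts")
p_both = prob(lambda d: d[0][1] == "hearts" and d[1][1] == "hearts")

# For independent events the joint probability equals the product.
print(p_both, p_h1 * p_h2)
```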

Bayes' Law

In the description of the multiplication law above, we have assumed that A occurs first and may (or may not) influence the probability that a second event, B occurs. We imagine that an observer, having observed A, obtains an improved estimate of the probability that B will occur. In some circumstances, it makes sense to reverse our thinking and become more like a detective. If B has been observed, what is the probability that A occurred?

Write two equivalent expressions for \(P(\text{A and B})\).

\[P(\text{A and B}) = P(\text{A})*P(\text{B|A}) = P(\text{B}) *P(\text{A|B}).\]

Now solve the equation relating the second and third expressions for \(P(\text{A|B})\).

\[P(\text{A})*P(\text{B|A}) = P(\text{B}) *P(\text{A|B})\]

\[P(\text{A|B}) = \frac{P(\text{A})*P(\text{B|A})}{P(\text{B})}\]

In many cases, we don’t know the probability of B directly. However, we can construct it. If B occurred, either A occurred and then B occurred, or A did not occur and B occurred anyway. These two possibilities are mutually exclusive and exhaustive. Therefore we can write

\[P(\text{B}) = P(\text{A}) * P(\text{B|A})+P(\text{ not A}) *P(\text{B| not A})\]

Example

Return to the problem of drawing two hearts from a 52 card deck without replacement. Suppose the second card was a heart. What is the probability that the first card was also a heart?

\[P(\text{H1|H2}) = \frac{P(\text{H1})*P(\text{H2|H1})}{P(\text{H2})}\]

Now rewrite the denominator

\[P(\text{H1|H2}) = \frac{P(\text{H1})*P(\text{H2|H1})}{P(\text{H1})*P(\text{H2|H1})+P(\text{not H1})*P(\text{H2 | not H1})}\]

Now let’s insert the numerical values.

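\[P(\text{H1|H2}) = \frac{\frac{13}{52}*\frac{12}{51}}{\frac{13}{52}*\frac{12}{51}+\frac{39}{52}*\frac{13}{51}}\]

Every term has the common denominator \(52*51\), which cancels, leaving

\[P(\text{H1|H2}) = \frac{13*12}{13*12+39*13}=\frac{156}{663}=\frac{4}{17}\]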
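The arithmetic can be checked with exact fractions. This is a minimal Python sketch of the calculation; the variable names are illustrative and simply mirror the symbols in the formulas.

```python
from fractions import Fraction

p_h1 = Fraction(13, 52)               # P(H1): heart on the first draw
p_not_h1 = 1 - p_h1                   # P(not H1)
p_h2_given_h1 = Fraction(12, 51)      # P(H2 | H1): one heart already removed
p_h2_given_not_h1 = Fraction(13, 51)  # P(H2 | not H1): all 13 hearts remain

# Denominator: P(H2) built from the two mutually exclusive, exhaustive cases.
p_h2 = p_h1 * p_h2_given_h1 + p_not_h1 * p_h2_given_not_h1

# Bayes' law: reverse the conditioning.
p_h1_given_h2 = p_h1 * p_h2_given_h1 / p_h2

print(p_h2, p_h1_given_h2)
```

The result equals \(12/51\), the same value as \(P(\text{H2|H1})\), which is what symmetry between the two draws would suggest.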

Crosstabulations and Probability

If a collection of items can be classified in two ways, a crosstabulation is the standard means of showing how the two categorizations are related.

One example is based on the following experiment. Contact a member of the US population and ask two questions.

  1. What is your religion?
  2. What part of the country do you live in?

The following table summarizes the results.

## 
##  
##    Cell Contents
## |-------------------------|
## |                       N |
## | Chi-square contribution |
## |           N / Row Total |
## |           N / Col Total |
## |         N / Table Total |
## |-------------------------|
## 
##  
## Total Observations in Table:  2849 
## 
##  
##                 | gss_sm$bigregion 
## gss_sm$religion | Northeast |   Midwest |     South |      West | Row Total | 
## ----------------|-----------|-----------|-----------|-----------|-----------|
##      Protestant |       158 |       325 |       650 |       238 |      1371 | 
##                 |    24.877 |     0.149 |    44.346 |    14.194 |           | 
##                 |     0.115 |     0.237 |     0.474 |     0.174 |     0.481 | 
##                 |     0.324 |     0.471 |     0.624 |     0.377 |           | 
##                 |     0.055 |     0.114 |     0.228 |     0.084 |           | 
## ----------------|-----------|-----------|-----------|-----------|-----------|
##        Catholic |       162 |       172 |       160 |       155 |       649 | 
##                 |    23.502 |     1.397 |    25.093 |     0.882 |           | 
##                 |     0.250 |     0.265 |     0.247 |     0.239 |     0.228 | 
##                 |     0.333 |     0.249 |     0.154 |     0.246 |           | 
##                 |     0.057 |     0.060 |     0.056 |     0.054 |           | 
## ----------------|-----------|-----------|-----------|-----------|-----------|
##          Jewish |        27 |         3 |        11 |        10 |        51 | 
##                 |    38.340 |     7.080 |     3.128 |     0.149 |           | 
##                 |     0.529 |     0.059 |     0.216 |     0.196 |     0.018 | 
##                 |     0.055 |     0.004 |     0.011 |     0.016 |           | 
##                 |     0.009 |     0.001 |     0.004 |     0.004 |           | 
## ----------------|-----------|-----------|-----------|-----------|-----------|
##            None |       112 |       157 |       170 |       180 |       619 | 
##                 |     0.362 |     0.335 |    13.953 |    13.426 |           | 
##                 |     0.181 |     0.254 |     0.275 |     0.291 |     0.217 | 
##                 |     0.230 |     0.228 |     0.163 |     0.285 |           | 
##                 |     0.039 |     0.055 |     0.060 |     0.063 |           | 
## ----------------|-----------|-----------|-----------|-----------|-----------|
##           Other |        28 |        33 |        50 |        48 |       159 | 
##                 |     0.025 |     0.788 |     1.129 |     4.641 |           | 
##                 |     0.176 |     0.208 |     0.314 |     0.302 |     0.056 | 
##                 |     0.057 |     0.048 |     0.048 |     0.076 |           | 
##                 |     0.010 |     0.012 |     0.018 |     0.017 |           | 
## ----------------|-----------|-----------|-----------|-----------|-----------|
##    Column Total |       487 |       690 |      1041 |       631 |      2849 | 
##                 |     0.171 |     0.242 |     0.365 |     0.221 |           | 
## ----------------|-----------|-----------|-----------|-----------|-----------|
## 
## 

Exercise

What is the probability that a randomly selected person is Catholic and lives in the South?

Answer

.056

Exercise

If a randomly selected person lives in the South, what is the probability that the person is Catholic?

Answer

.154

Exercise

If a randomly selected person is Catholic, what is the probability that the person lives in the South?

Answer

.247

Exercise

Which of these is more common?

  1. A Catholic living in the South.
  2. A person with no religion living in the West.

Answer

The person with no religion living in the West: .063 against .056.
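The exercise answers correspond to three different ways of dividing the same cell count: by the table total (joint probability), by the column total (conditional on region), and by the row total (conditional on religion). The following minimal Python sketch recomputes them from the counts in the table; the dictionary layout and the function names `joint`, `given_region`, and `given_religion` are assumptions made for this illustration.

```python
# Cell counts (N) copied from the crosstabulation above.
counts = {
    "Protestant": {"Northeast": 158, "Midwest": 325, "South": 650, "West": 238},
    "Catholic":   {"Northeast": 162, "Midwest": 172, "South": 160, "West": 155},
    "Jewish":     {"Northeast": 27,  "Midwest": 3,   "South": 11,  "West": 10},
    "None":       {"Northeast": 112, "Midwest": 157, "South": 170, "West": 180},
    "Other":      {"Northeast": 28,  "Midwest": 33,  "South": 50,  "West": 48},
}
total = sum(sum(row.values()) for row in counts.values())

def joint(religion, region):
    """P(religion and region): cell count over the table total."""
    return counts[religion][region] / total

def given_region(religion, region):
    """P(religion | region): cell count over the column total."""
    column_total = sum(row[region] for row in counts.values())
    return counts[religion][region] / column_total

def given_religion(region, religion):
    """P(region | religion): cell count over the row total."""
    return counts[religion][region] / sum(counts[religion].values())

print(round(joint("Catholic", "South"), 3))
print(round(given_region("Catholic", "South"), 3))
print(round(given_religion("South", "Catholic"), 3))
print(round(joint("None", "West"), 3))
```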