1 Intro to Probability

Question: What is the difference between counting and finding probability?

1.1 Classical Definition of Probability

When measuring discrete (countable) outcomes that are all equally likely, the classical definition of the probability of a desired outcome $A$ is:

\[P(A) = \frac{ \text{number of outcomes in which A occurs}}{\text{total number of possible outcomes}}\]
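As a minimal sketch of this definition, the R code below lists every equally likely outcome of rolling two dice and counts the ones in which an event occurs. The event "the sum is 7" and the variable names (`rolls`, `favorable`, `total`) are illustrative choices, not part of the original notes.

```r
# Enumerate all equally likely outcomes of rolling two dice
rolls <- expand.grid(die1 = 1:6, die2 = 1:6)

# Event A (illustrative): the two dice sum to 7
favorable <- sum(rolls$die1 + rolls$die2 == 7)
total <- nrow(rolls)

# Classical probability: favorable outcomes divided by total outcomes
p_sum7 <- favorable / total
p_sum7  # 6 of the 36 outcomes, so 1/6
```

The same counting approach works for any event that can be written as a condition on the rows of `rolls`.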

Quiz Question: If you roll one die, what is the probability of getting a 6?

Quiz Question: If you roll two dice, what is the probability of getting a sum of 6?

1.2 Complementary events

Sometimes probabilities are easier to calculate if we look at their complement.

The complement of an event $A$ is the event “$A$ doesn’t happen.” The notation $A^c$ is used for the complement of event $A$. We can compute the probability of the complement using: \[P(A^c) = 1 - P(A)\]

Note: The complement of $A^c$ is the original event $A$, so that \[P(A) = 1 - P(A^c)\]
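The sketch below illustrates why the complement can be easier to work with. It assumes three fair dice and an illustrative event, "at least one die shows a 1"; the complement "no die shows a 1" is simpler to describe, and both routes give the same probability.

```r
# All equally likely outcomes of rolling three dice (6^3 = 216 rows)
rolls <- expand.grid(d1 = 1:6, d2 = 1:6, d3 = 1:6)

# Complement event: no die shows a 1
no_ones <- rolls$d1 != 1 & rolls$d2 != 1 & rolls$d3 != 1
p_complement <- mean(no_ones)          # 125/216

# P(A) = 1 - P(A^c)
p_at_least_one <- 1 - p_complement

# Direct count of the original event, for comparison
p_direct <- mean(!no_ones)
all.equal(p_at_least_one, p_direct)
```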

Quiz Question: If you roll one die, what is the probability of not getting a 6?

Quiz Question: If you roll two dice, what is the probability of not getting a sum of 6?

If you pull a random card from a deck of playing cards, what is the probability it is not a heart?

2 Counting Definitions

2.1 Factorial

Example 1 - How many different ways are there to line up 5 students?

5*4*3*2*1

## [1] 120

Factorial:

\[n! = n(n-1)(n-2) ... 3 \cdot 2 \cdot 1\]

n = 5
factorial(n)

## [1] 120

Quiz Question:
How many different ways can you arrange red, blue, green, and yellow balls in a line?

2.2 Permutations

Example 2 - How many different ways can I pick 3 students from a class of 20 and put them in a row?

20*19*18

## [1] 6840

Permutations: The number of ways $r$ items may be selected from among $n$ choices (without replacement) when order matters is:

\[_n P_r = n(n-1)(n-2) ... (n-r+1)\] \[ = \frac{n(n-1)(n-2) ... (n-r+1)}{1} \cdot \frac{(n-r)(n-r-1) ... 3 \cdot 2 \cdot 1}{(n-r)(n-r-1) ... 3 \cdot 2 \cdot 1} \] \[ = \frac{n!}{(n-r)!}\]

NOTE: Many calculators have the function ‘nPr’ for the number of permutations. Thus, an easier way to calculate $\frac{n!}{(n-r)!}$ is simply: $nPr(n,r)$.

If you use R, here is some code that makes your own nPr function:

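# Fast permute function: computes n*(n-1)*...*(n-k+1)
nPr = function(n,k) {
  result = 1
  # seq_len(k) is empty when k = 0, so nPr(n,0) correctly returns 1
  for ( i in seq_len(k) ) {
    result = result*(n-i+1)
  }
  result
}
# Example computation
nPr(20,3)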

## [1] 6840

Quiz Question: How many ways can a four-person executive committee (president, vice-president, secretary, treasurer) be selected from a 16-member board of directors of a non-profit organization?

Eight sprinters have made it to the Olympic finals in the 100-meter race. In how many different ways can the gold, silver, and bronze medals be awarded?

2.3 Combinations

Example 3 - How many ways can a four-person committee be selected from 16 members where all committee members have equal positions?

There are $_{16} P_{4}=43680$ ways to choose the members where the order matters. If it doesn’t, we overcounted. By how much? For any 4 member selection, there are $4!$ ways to order those 4 members, thus we overcounted by a factor of $4!$

# Example 3
43680/factorial(4)

## [1] 1820

Combinations: The number of ways $r$ items may be selected from among $n$ choices (without replacement) when order does NOT matter is: \[_n C_r = \frac{_n P_r}{r!} = \frac{n!}{r!(n-r)!}\] This is also denoted $\binom{n}{r}$ and referred to as “n choose r.”

So we could be fancy and calculate it like this:

# Example 3
n = 16
r = 4
factorial(n)/( factorial(r) * factorial(n-r) )

## [1] 1820

Or even this!

# Example 3
choose(16,4)

## [1] 1820

NOTE: Many calculators have the function ‘nCr’ for the number of combinations. Thus, an easier way to calculate $\frac{n!}{(n-r)!r!}$ is simply: $nCr(n,r)$.
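For a small case, the overcounting argument can be checked by listing the selections directly. This sketch uses base R's `combn()` on five made-up names (hypothetical, chosen only for illustration) and compares the number of unordered pairs with `choose(5, 2)`.

```r
# Five hypothetical people to choose from
members <- c("Ana", "Ben", "Cy", "Dee", "Eli")

# combn() lists every unordered selection of 2; each column is one pair
pairs <- combn(members, 2)
n_pairs <- ncol(pairs)
n_pairs == choose(5, 2)

# Each unordered pair can be lined up in 2! ways, which recovers 5P2 = 5*4
n_ordered <- n_pairs * factorial(2)
n_ordered
```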

Quiz Question:
A group of four students is to be chosen from a 35-member class to represent the class on the student council. How many ways can this be done?

The United States Senate Appropriations Committee consists of 29 members; the Defense Subcommittee of the Appropriations Committee consists of 19 members. Disregarding party affiliation or any special seats on the Subcommittee, how many different 19-member subcommittees may be chosen from among the 29 Senators on the Appropriations Committee?

2.4 Birthdays

Consider the likelihood of ANY two people from a class of 60 students having the same birthday.

Would you bet $10 on this?

To find the chance of ANY two people in this class having the same birthday, let’s break this up into easier problems as follows.

Question: What is the probability that, given just two people selected, they have the same birthdays?

If there are $365$ different birthdays:

the number of ways people could have the same birthday is $365$
the number of possible ways two people could have birthdays is $365*365$

\[P(\text{two given people have the same birthday}) = \frac{365}{365^2} = \frac{1}{365}\]

1/365

## [1] 0.002739726

What is the probability that, given 3 people selected, all have different birthdays?

Let $A$ = “no 2 people out of 3 have the same birthday.” Then $A^c$ = “at least 2 people out of 3 share the same birthday,” so:

$P(\text{at least 2 people out of the 3 have the same birthday})$ \[= 1-P(\text{no 2 people out of 3 have the same birthday})\]

$P(\text{no 2 people share the same birthday})$ \[=\frac{\text{number of different ways 3 people could have all different birthdays}}{\text{total ways 3 people could have birthdays}}\]

The probability that no 2 out of 3 people share the same birthday:

365*364*363/365^3

## [1] 0.9917958

What is the probability that, given 3 people, at least one pair of people share the same birthday?
What is the probability that, given 5 people, at least one pair of people share the same birthday?

Hint: $P(\text{no 2 people out of 5 share same birthday})=\frac{_{365} P_{5} }{365^{5}}$

What is the probability that in a class of 60 students, at least one pair of people share the same birthday?

Hint: $P(\text{no 2 people out of 60 share same birthday}) = \frac{_{365} P_{60} }{365^{60}}$
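The hints above can be wrapped in a small helper. This is a sketch under the same assumption of 365 equally likely birthdays; the function name `p_shared` is made up for illustration. It multiplies the ratios $\frac{365}{365} \cdot \frac{364}{365} \cdots$ one at a time instead of forming $_{365} P_{n}$ and $365^{n}$ separately, which keeps the intermediate numbers small.

```r
# Probability that at least two of n people share a birthday,
# assuming 365 equally likely birthdays (hypothetical helper)
p_shared <- function(n) {
  if (n < 2) return(0)
  # P(all different) = (365/365) * (364/365) * ... * ((365-n+1)/365)
  p_all_different <- prod((365 - 0:(n - 1)) / 365)
  1 - p_all_different
}

p_shared(3)    # complement of the 3-person result computed above
p_shared(23)   # about 0.507
```

Calling `sapply(2:60, p_shared)` would give the probabilities for every class size up to 60.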

3 Probabilities for Multiple Events

3.1 Independent Events

Two events are independent if the outcome of one does not affect the probability of the other. If events A and B are independent, then the probability of both $A$ and $B$ occurring is \[P(A \text{ and } B) = P(A) \cdot P(B)\] where $P(A \text{ and } B)$ is the probability of events $A$ and $B$ both occurring, $P(A)$ is the probability of event A occurring, and $P(B)$ is the probability of event $B$ occurring.
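As a quick check of the multiplication rule, the sketch below enumerates two dice and uses two illustrative events that concern different dice (so one cannot affect the other): "the first die is even" and "the second die shows 5 or 6". The event choices and variable names are assumptions made for this example.

```r
# All 36 equally likely outcomes of rolling two dice
rolls <- expand.grid(die1 = 1:6, die2 = 1:6)

A <- rolls$die1 %% 2 == 0   # first die is even
B <- rolls$die2 >= 5        # second die shows 5 or 6

p_A <- mean(A)              # 1/2
p_B <- mean(B)              # 1/3
p_A_and_B <- mean(A & B)    # counted directly from the outcomes

# For independent events the direct count matches the product
all.equal(p_A_and_B, p_A * p_B)
```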

A brewery utilizes two bottling machines, but they do not operate simultaneously. The second machine acts as a backup system to the first machine and operates only when the first breaks down during operating hours.

The probability that the first machine breaks down during operating hours is 0.20.
If the first breaks down, then the second machine is turned on and has a probability of 0.30 of breaking down.

What is the probability that the brewery’s bottling system is not working during operating hours?
The reliability of the bottling process is the probability that the system is working during operating hours. Find the reliability of the bottling process at the brewery.

If A is the event that the first machine is broken and B is the event that the second machine is broken, the probability both are broken is: \[P(A \text{ and } B) = P(A) \cdot P(B)\] (If we assume A and B are independent events.)

# a.

The probability that at least one bottling machine is working (the reliability) is the complement of this answer.

# b.

3.2 Multiple Events Occurring Simultaneously

Suppose we flipped a coin and rolled a die, and wanted to know the probability of getting a head on the coin or a 6 on the die.

Here, there are still 12 possible outcomes: {H1,H2,H3,H4,H5,H6,T1,T2,T3,T4,T5,T6}

How many outcomes have heads?

How many outcomes have a 6?

How many outcomes have heads OR a 6?

How many outcomes have heads AND a 6?

The probability that at least one of two events occurs is \[P(A \text{ or } B)=P(A)+P(B)-P(A \text{ and } B).\] The last term is subtracted because outcomes in which both events occur are counted twice in $P(A)+P(B)$.
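The formula can be checked on a small sample space. This sketch assumes one number is drawn at random from 1 to 20, with illustrative events "the number is even" and "the number is a multiple of 5"; the numbers 10 and 20 belong to both events.

```r
# Sample space: one number drawn at random from 1 to 20 (illustrative)
outcomes <- 1:20

A <- outcomes %% 2 == 0   # even numbers (10 outcomes)
B <- outcomes %% 5 == 0   # multiples of 5 (4 outcomes)

# Count "A or B" directly
p_or_direct <- mean(A | B)

# Add the two probabilities, then remove the overlap counted twice
p_or_formula <- mean(A) + mean(B) - mean(A & B)

all.equal(p_or_direct, p_or_formula)
```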

Question: Suppose you draw one card from a standard card deck. What is the probability of getting a club or a face card?

Quiz Question: Suppose you draw one card from a standard card deck. What is the probability of getting a spade or an ace?

3.3 Mutually Exclusive Events

Two events are mutually exclusive if they cannot happen at the same time, so $P(A \text{ and } B) = 0$. If A and B are mutually exclusive, then \[P(A \text{ or } B) = P(A) + P(B)\]

Question: If you roll a 6-sided die, what is the probability of rolling an even number?

Suppose we draw one card from a standard card deck. What is the probability that we get a Queen or a King?
Suppose we draw one card from a standard card deck. What is the probability that we get a spade or a King?

3.4 Guessing on a Quiz (Optional)

A multiple-choice quiz contains 5 questions, each with four possible answers (A, B, C, D). Compute the probability of randomly guessing the answers and getting 100% on the quiz (all five questions correct).

Question 1:
What is the total number of possible ways you could respond to this test?

Question 2:
What is the probability of getting all correct?

Question 3:
What is the probability of guessing and getting 0 questions correct (all incorrect)?

Question 4.1:
What is the probability of guessing and getting 1 question correct?

Question 4.2:
What is the probability of guessing and getting 2 questions correct?

Question 4.3:
What is the probability of guessing and getting 3 questions correct?

Question 4.4:
What is the probability of guessing and getting 4 questions correct?

a = c()
a[1] = 3^5/4^5
a[2] = choose(5,1)*3^4/4^5
a[3] = choose(5,2)*3^3/4^5
a[4] = choose(5,3)*3^2/4^5
a[5] = choose(5,4)*3^1/4^5
a[6] = 1/4^5
names(a) = paste( "P(", c(0:5), " correct)", sep="")
a

## P(0 correct) P(1 correct) P(2 correct) P(3 correct) P(4 correct) P(5 correct) 
## 0.2373046875 0.3955078125 0.2636718750 0.0878906250 0.0146484375 0.0009765625

sum(a)

## [1] 1
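The same six probabilities can also be obtained from R's built-in binomial probability function `dbinom()`, which these notes do not otherwise use. The sketch below assumes each guess is correct with probability 1/4, independently of the others, and compares the result with the counting formula used above.

```r
# Binomial probabilities of 0 to 5 correct guesses out of 5,
# assuming each guess is right with probability 1/4
p_guess <- dbinom(0:5, size = 5, prob = 1/4)
names(p_guess) <- paste0("P(", 0:5, " correct)")
p_guess

# Same values from the counting argument: choose(5,k) * 3^(5-k) / 4^5
by_counting <- choose(5, 0:5) * 3^(5 - 0:5) / 4^5
all.equal(unname(p_guess), by_counting)
```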

Question:
On the quiz, what is the probability of getting a “D” or higher (at least 3 out of 5 answers correct)?

P(getting 3, 4, or 5 correct) = P(3 correct) + P(4 correct) + P(5 correct)

4 Expected Value

Expected value provides a way of evaluating the value of a decision with multiple outcomes.

Expected Value is defined as the average gain or loss of an event if the procedure is repeated many times. We can compute the expected value by multiplying each outcome by the probability of that outcome, and then adding up the products.

For two mutually exclusive events $A$ and $B$ that together cover every possible outcome (so that $P(A)+P(B)=1$), the expected value is:

$P(A)\cdot V(A) + P(B)\cdot V(B)$,

where $V(A)$ and $V(B)$ represent the value of $A$ and $B$ respectively, with a gain as a positive value and loss as a negative value.

For $n$ disjoint events $A_1, A_2, ... A_n$ for which $P(A_1) + P(A_2) + ... + P(A_n)=1$, the expected value is:

$P(A_1)\cdot V(A_1) + P(A_2) \cdot V(A_2) + ... + P(A_n)\cdot V(A_n)$.
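The general formula translates into a one-line computation. In this sketch the helper name `expected_value` and the example game (a fair die that pays its face value in dollars) are assumptions for illustration only.

```r
# Expected value of disjoint events with probabilities p and values v
# (hypothetical helper; checks that the probabilities sum to 1)
expected_value <- function(p, v) {
  stopifnot(length(p) == length(v), isTRUE(all.equal(sum(p), 1)))
  sum(p * v)
}

# Illustrative game: roll a fair die and win its face value in dollars
ev_die <- expected_value(p = rep(1/6, 6), v = 1:6)
ev_die  # 3.5
```

Losses are entered as negative values in `v`, matching the sign convention described above.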

Example: You purchase a raffle ticket to help out a charity. The raffle ticket costs $5. The charity is selling 2000 tickets. One of them will be drawn and the person holding the ticket will be given a prize worth $4000. Compute the expected value for this raffle.

If your ticket is drawn, you net $4000-$5 = $3995. The probability of this is 1/2000.
If your ticket is not drawn, you net -$5. The probability of this is 1999/2000.

So the expected value is:

# Example 8
3995*1/2000 + -5*1999/2000

## [1] -3

On average, each person is giving about $3.00 to charity.

An insurance company estimates the probability of an earthquake in the next year to be 0.0013. The average damage done by an earthquake is estimated to be $60,000. If the company offers earthquake insurance for $100, what is the expected value of the policy?
A friend offers to play a game, in which you roll 3 standard 6-sided dice. If all the dice roll different values, you give him $1. If any two dice match values, you get $2. What is the expected value of this game? Would you play?

P_lose = 6*5*4/6^3
(P_lose)*(-1) + (1-P_lose)*(2)

## [1] 0.3333333

5 Conditional probability

The probability that event B occurs, given that event A has happened is represented by $P(B|A)$, read “the probability of B given A.”

Conditional probabilities can be used to find the probability of joint events, even when they are not independent:

\[P(A \text{ and } B) = P(A|B) \cdot P(B)\]
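Rearranging this formula gives $P(A|B) = P(A \text{ and } B)/P(B)$, which is how conditional probabilities are read from a table of counts. The sketch below uses a made-up table of 100 days (rain versus carrying an umbrella); all of the numbers and labels are hypothetical.

```r
# Hypothetical counts for 100 days
counts <- matrix(c(30, 10,
                   20, 40),
                 nrow = 2, byrow = TRUE,
                 dimnames = list(c("Rain", "No rain"),
                                 c("Umbrella", "No umbrella")))
total <- sum(counts)

# Joint and marginal probabilities
p_rain_and_umbrella <- counts["Rain", "Umbrella"] / total   # 30/100
p_umbrella <- sum(counts[, "Umbrella"]) / total             # 50/100

# Conditional probability: P(rain | umbrella)
p_rain_given_umbrella <- p_rain_and_umbrella / p_umbrella
p_rain_given_umbrella  # 30 of the 50 umbrella days, so 0.6

# The product rule recovers the joint probability
all.equal(p_rain_and_umbrella, p_rain_given_umbrella * p_umbrella)
```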

Consider the following events:

A = having power in the morning
B = the event of having a class that day

##          Class No Class Sum
## Power       90        6  96
## No Power     0        4   4
## Sum         90       10 100

What is the probability of having class?

Question:
What is the probability of not having class?

Question:
What is the probability of having class given that there was no power when you wake up?

Question:
What is the probability of having class given that there was power when you wake up?

Quiz Question: Given that there was class, what is the probability that there was power?

Quiz Question:
Given that there was no class, what is the probability there was no power?

6 Bayes’ Rule

By the formula for calculating joint probabilities (which holds even for events that are not independent), both \[P(A \text{ and } B) = P(A|B) \cdot P(B)\] and \[P(B \text{ and } A) = P(B|A) \cdot P(A)\] Since $P(A \text{ and } B) = P(B \text{ and } A)$, setting these equal gives a way to “invert” conditional probabilities: \[P(A|B) \cdot P(B)=P(B|A) \cdot P(A)\] OR

\[P(A|B) = \frac{P(B|A) \cdot P(A)}{P(B)}\]

If we only know $P(A)$ and $P(B|A)$, we can find $P(B)$ because:

the probability that A occurs and B also occurs is $P(B|A) \cdot P(A)$.
the probability that A does not occur and B occurs is $P(B|A^c) \cdot P(A^c)$.

This accounts for all the ways $B$ can occur, so \[P(B) = P(B|A) \cdot P(A) + P(B|A^c) \cdot P(A^c).\]

Now, plugging in $P(B)$ gives us Bayes’ Rule!

Bayes’ Rule: Given two events $A$ and $B$,

\[ P(A|B) = \frac{P(B|A) \cdot P(A)} { P(B|A) \cdot P(A) + P(B|A^c) \cdot P(A^c)} \]
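Bayes’ Rule can be written as a short R function. This is a sketch: the function name `bayes`, its argument names, and the input values are hypothetical and chosen only to show the arithmetic.

```r
# Bayes' Rule for two events (hypothetical helper)
#   p_A            = P(A)
#   p_B_given_A    = P(B|A)
#   p_B_given_notA = P(B|A^c)
bayes <- function(p_A, p_B_given_A, p_B_given_notA) {
  numerator <- p_B_given_A * p_A
  # P(B) adds up the two ways B can occur: with A and without A
  p_B <- numerator + p_B_given_notA * (1 - p_A)
  numerator / p_B
}

# Illustrative inputs: P(A) = 0.3, P(B|A) = 0.8, P(B|A^c) = 0.1
p_A_given_B <- bayes(p_A = 0.3, p_B_given_A = 0.8, p_B_given_notA = 0.1)
p_A_given_B  # 0.24 / (0.24 + 0.07)
```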

6.1 Disease #1

Example 9 - A new test has been devised to detect a new disease:

The disease has an incidence rate of 0.1%: it afflicts 0.1% of the population;
The test produces no false negatives: everyone who has the disease will test positive; but,
The false positive rate is 5%: of those who do not have the disease, 5% will test positive.

Should doctors use this test?

Well, suppose you test negative for the disease. Great, that means you don’t have it! But if you test positive, what is the probability that you actually have the disease?

Let’s first label the following events:

A = having the disease
B = testing positive

So we want to know $P($having the disease|testing positive$) =P(A|B)$.

What information do we know?

-$P($having the disease$)=P(A)=$

-$P($not having the disease$)=P(A^c)=$

-$P($testing negative|have the disease$)=P(B^c|A)=$

-$P($testing positive|have the disease$)=P(B|A)=$

-$P($testing positive|not having the disease$)=P(B|A^c)=$

-$P($testing negative|not having the disease$)=P(B^c|A^c)=$

Using Bayes’ Rule:

\[ P(A|B) = \frac{P(B|A) \cdot P(A)} { P(B|A) \cdot P(A) + P(B|A^c) \cdot P(A^c)} \]

# Example 9
(1*0.001)/(1*0.001+0.05*0.999)

## [1] 0.01962709

This shows that only about 2% of the people who test positive for this disease using this test will actually have it!

6.2 Disease #2

A new test has been devised to detect a new disease:

The incidence rate is 2% (it afflicts 2% of the population);
The test produces false negatives at a rate of 0.5% (of those who have the disease, 0.5% will test negative); and
The false positive rate is 1% (of those who do not have the disease, 1% will test positive).

Again, label the following events:

A = having the disease
B = testing positive

If you test positive, what is the probability of actually having the disease?
If you test negative, what is the probability of not having the disease?

Bayes’ rule can be used for this if we replace $A$ with $A^c$, replace $B$ with $B^c$, and realize that the complement of $A^c$ is just $A$.

\[ P(A^c|B^c) = \frac{P(B^c|A^c) \cdot P(A^c)} { P(B^c|A^c) \cdot P(A^c) + P(B^c|A) \cdot P(A)} \]

7 Group Project Problems

In your subgroup, select a problem you want to work on. Work together toward finding a solution to the provided questions and/or any related questions you find interesting. In a 1-2 page Report, present:

the problem you worked on
a solution and the tools and reasoning you used to arrive at a solution
the significance of the result and how it can contribute toward better decision-making

Make sure you edit your Report to ensure it is readable with no grammar or spelling errors.

Each group will also make a short video presentation of their work, so keep in mind that your work will be made public for other students to view and study.

7.1 Poker Odds

Compute the probability of randomly drawing five cards from a deck and getting:

a. a pair
b. three-of-a-kind
c. four of a kind
d. a full house (three of a kind and a pair)
e. a flush (all the same suit)

Once your group is convinced of its answers, try checking them: https://en.wikipedia.org/wiki/Poker_probability

Suppose you have three of a kind. What is the probability that someone else has a higher hand? (You can use the probabilities given for a straight and straight flush on Wikipedia for answering this one.)

7.2 Playing the lottery

In a certain state’s lottery, $64$ balls numbered 1 through $64$ are placed in a machine and six of them are drawn at random. If the six numbers drawn match the numbers that a player had chosen, the player wins a jackpot of $1,000,000. If the numbers drawn match exactly 5 of the numbers that a player had chosen, the player wins $1,000. It costs $1 to buy a ticket. Find the expected value.

Over time the jackpot will increase if no one wins. How large would the jackpot have to be for the expected value of playing the lottery to be positive? (In this case, would you still buy a lottery ticket?)

7.3 Group assignment

Suppose you had problems working with your group on the last unit’s project. Some members were arguing and some were not contributing to the group’s work, so it was decided that students will be randomly placed into 6 groups for the next project. Should you worry about being put into the same group?

Use the following questions to develop an answer to the question: If we were randomly put into groups again, what is the probability that YOU are placed in the exact same group?

How many different ways are there to place $n=30$ students into 6 groups? (Hint: consider an easier problem.)
1. How many different ways are there to arrange $5$ given students in a row from left to right?
2. How many different ways are there to arrange $5$ of the $30$ students in a row from left to right?
3. How many different ways are there to choose $5$ of $30$ students (order does not matter)?
4. How many different ways are there to place $10$ of the $30$ students into two uniquely labeled groups (GROUP1, GROUP2) each with $5$ students?
5. How many different ways are there to place $30$ students into 6 uniquely labeled groups: GROUP1 (5 students), GROUP2 (5 students), GROUP3 (5 students), etc.?
If we were to randomly assign groups again, what is the probability EVERYONE is placed in the exact same groups again?
If you are put in the same group, how many ways are there to assign everybody else? Use this to find the probability that, if we are randomly assigned into new groups, YOU are placed in the exact same group.

7.4 Shared birthday

Suppose two people meet. What is the probability that they share a birthday?
Suppose 3 people are in a room. What is the probability that there is at least one shared birthday among these 3 people?
Suppose 10 people are together. What is the probability that there is at least one shared birthday among these 10 people?
Suppose $n$ people are in a room. What is the probability that there is at least one shared birthday among these $n$ people?
How many people would need to be in a room until you would bet $100 on the event that at least two people share a birthday? (Use an Excel spreadsheet to find the probabilities and the expected value of the wager for $n=1,2,3,...,50$. The permutation function in Excel is “=PERMUT(n,k)”.)

7.5 Racial Bias

In this problem, we will explore probabilities from a series of events.

If you flip 10 coins, how many would you expect to come up “heads” on average? Given your answer, would you expect every flip of 10 coins to come up with exactly that many heads?
If you were to flip 10 coins, what results would you consider “usual”? What results would you consider “unusual”?
When flipping 10 coins, what is the theoretic probability of flipping 10 heads?

Below is a simulation of flipping a coin 10 times, repeated 100,000 times. The table below shows how many of the 100,000 trials produced 0, 1, 2, …, 10 heads. Note that these counts sum to 100,000.

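library(knitr)  # provides kable()

outcomes = c("heads","tails")
n_trials = 100000
trial = numeric(n_trials)  # preallocate one slot per repetition
for ( i in 1:n_trials )
{
  # flip 10 coins and count the heads
  sam = sample( outcomes, size = 10, replace = TRUE )
  trial[i] = sum( sam=="heads" )
}
kable(table(trial), caption="Number of trials (out of 100,000) with each number of heads",
      col.names = c("Number of heads", "Number of trials"))

Number of trials (out of 100,000) with each number of heads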
Number of heads	Number of trials
0	89
1	1001
2	4414
3	11756
4	20508
5	24751
6	20431
7	11544
8	4432
9	992
10	82

Based on this simulation, what appears to be the probability of flipping 0 heads, 1 head, … up to 10 heads?
If you were to flip 10 coins, based on the simulated data, what range of values would you consider “usual” results? What would you consider “unusual” results?

The formula \[_n C_k p^k (1-p)^{n-k}\] will compute the probability of an event with probability $p$ occurring $k$ times out of $n$, such as flipping $k=5$ heads out of $n=10$ coins where the probability of heads is $p=0.5$.

Here \[_n C_k = \frac{_n P_k}{k!} = \frac{n!}{k!(n-k)!}\] You can use the Excel formula “=COMBIN(n,k)” to calculate $_n C_k$.
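In R, the formula can be sketched as a small function built on `choose()`. The name `binom_prob` is made up for this example, and the inputs (2 heads in 4 flips) are illustrative; the built-in `dbinom()` computes the same quantity and is used here only as a cross-check.

```r
# Probability that an event with probability p occurs exactly
# k times out of n independent tries (hypothetical helper)
binom_prob <- function(k, n, p) {
  choose(n, k) * p^k * (1 - p)^(n - k)
}

# Illustrative case: exactly 2 heads in 4 fair coin flips
p_two_heads <- binom_prob(k = 2, n = 4, p = 0.5)
p_two_heads  # 6/16 = 0.375

# Cross-check against R's built-in binomial probability function
all.equal(p_two_heads, dbinom(2, size = 4, prob = 0.5))
```

Because `choose()` and the arithmetic operators are vectorized, `sum(binom_prob(0:2, n, p))` gives the probability of 2 or fewer occurrences.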

Use this to compute the theoretic probability of flipping 5 “heads” out of 10 coins. Compare your answer to the probability you estimated from the simulation.
Use this to compute the theoretic probability of flipping fewer than 2 “heads” out of 10 coins. Compare your answer to the probability you estimated from the simulation.

Use this formula to consider a case from 1960. In the area, about 26% of the jury-eligible population was black. In the court case, there were 25 people on the juror panel, of which 2 were black.

If black people were selected for the panel with a probability of 26%, calculate the probability of there being 2 or fewer black people on the jury panel.
Does this provide evidence of racial bias in jury selection?

7.6 Intruder detection

An unmanned monitoring system uses high-tech video equipment and microprocessors to detect intruders. A prototype system has been developed and is in use outdoors at a weapons munitions plant. The system is designed to detect intruders with a probability of .90. However, the design engineers expect this probability to vary with the weather conditions. The system automatically records the weather conditions each time an intruder is detected. Based on a series of controlled tests, in which an intruder was released at the plant under various weather conditions, the following information is available: Given the intruder was, in fact, detected by the system, the weather was clear 75% of the time, cloudy 20% of the time, and raining 5% of the time. When the system failed to detect the intruder, 60% of the days were clear, 30% cloudy, and 10% rainy. Use this information to find the probability of detecting an intruder, given clear, cloudy, and rainy weather conditions (in the case that an intruder has been released at the plant). When is this system the most reliable? When is it the least reliable?

7.7 Correct Diagnosis?

Suppose a certain type of cancer has an incidence rate of 0.5% (that is, it afflicts 0.5% of the population). A new test has been devised to detect this cancer, which is very cheap and easy to administer in comparison to existing tests. The test produces false negatives at a rate of 1.4% (that is, 1.4% of those who have the disease will test negative), and the false positive rate is 1% (that is, about 1% of people who take the test will test positive, even though they do not have the disease). How accurate is this test?

Suppose a randomly selected person takes the test and tests positive. What is the probability that this person actually has the disease?
Suppose a randomly selected person takes the test and tests negative. What is the probability that this person does not have the disease?

Based on this, what recommendations would you make to doctors using this test?

Unit 3 Notes - Probability

Paul Regier