1 Preliminaries

1.1 Goals

Share with your group:

Your preferred name.
Where are you from?
What are your goals and expectations for this session?
What else are you doing this summer?

1.2 Software

Do the following (all free):

Download and install R https://cran.r-project.org/
Download and install Desktop RStudio https://rstudio.com/products/rstudio/download/ (choose the free version)

R is a free software environment for statistical computing and graphics. It compiles and runs on a wide variety of UNIX platforms, Windows, and MacOS. RStudio is a desktop environment we will be using to run statistical computations.

I will use this software to present problems and computations for this session, so it will be an excellent resource to use and develop skills with during this session.

If for some reason you are unable to successfully install this software, no worries! You can still follow along, take notes and work on problems on your paper/document, doing all the same calculations using Desmos Scientific Calculator: https://www.desmos.com/scientific

Below, example R code is shown in a grey box.

The output of the code is shown below in a white box.

## [1] 4

2 Some Data

##                  Full.Name Preferred                  Morning_Group
## 1          Alailima, Renae     Renae Group 1: Nutritional Chemistry
## 2             Bobba, Sohan     Sohan      Group 2: Exploring Nature
## 3      Chen, Joseph Howard    Joseph    Group 3: Psychology of Film
## 4               Kim, Ethan     Ethan    Group 3: Psychology of Film
## 5    Lee, Joshua Cheng Jun    Joshua      Group 2: Exploring Nature
## 6              Ma, Shelley   Shelley      Group 2: Exploring Nature
## 7           Marthi, Satvik    Satvik      Group 2: Exploring Nature
## 8         Purkanto, Alexia    Alexia      Group 2: Exploring Nature
## 9     Ramakrishnna, Devika    Devika      Group 2: Exploring Nature
## 10 Romero, Jally (Lizbeth)   Lizbeth      Group 2: Exploring Nature
## 11   Sanchez, Glen Alberto      Glen Group 1: Nutritional Chemistry
## 12        Shandilya, Arnav     Arnav Group 1: Nutritional Chemistry
## 13          Shintani, Ryan      Ryan    Group 3: Psychology of Film

2.1 Entrance into College

## 
## 2025 2026 2027 
##    6    4    3

2.2 State

## 
##  CA  KS  MO  NJ  PA  TX  VA Sum 
##   2   1   1   3   1   4   1  13

3 Intro to the birthday problem

What is the chance two people selected at random have the same birthday?

Group Question: Would you bet $20 on this?

What is the chance two people in this session have the same birthday?

Group Question: Would you bet $20 on this?

How many people would need to be in a room for you to bet $10 on two people having the same birthday?

4 Foundations of Probability

Question: What is the difference between counting and finding probability?

4.1 Definitions

When measuring discrete (whole number) outcomes, the probability of a desired outcome $A$ is written:

\[P(A) = \frac{ \text{number of outcomes in which A occurs}}{\text{total number of possible outcomes}}\]

Question: If I select one of you, what is the probability of selecting a person from Texas?

## 
##  CA  KS  MO  NJ  PA  TX  VA Sum 
##   2   1   1   3   1   4   1  13

Question: If you roll two dice, what is the probability of getting a sum of 6?
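The counting definition above can be checked by listing every outcome. The following is a minimal sketch in R, assuming two fair six-sided dice so that all 36 outcomes are equally likely; the variable names are illustrative only.

```r
# All 36 equally likely sums for two fair six-sided dice
sums <- outer(1:6, 1:6, "+")

# Number of outcomes in which the event occurs, and total number of outcomes
favorable <- sum(sums == 6)
total <- length(sums)

# P(sum of 6) = favorable outcomes / total outcomes
p_sum6 <- favorable / total
p_sum6
```

The same pattern (count the outcomes of interest, divide by the total) applies to the Texas question, using the counts in the table.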

4.2 Complementary events

Sometimes probabilities are easier to calculate if we look at their complement.

The complement of an event $A$ is the event “$A$ doesn’t happen.” The notation $A^c$ is used for the complement of event $A$. We can compute the probability of the complement using: \[P(A^c) = 1 - P(A)\]

Note: The complement of $A^c$ is the original event $A$, so that \[P(A) = 1 - P(A^c)\]

Question: If I select one of you, what is the probability of selecting a person not from Texas?

## 
##  CA  KS  MO  NJ  PA  TX  VA Sum 
##   2   1   1   3   1   4   1  13

Question: If you roll two dice, what is the probability of not getting a sum of 6?

Question: If you pull a random card from a deck of playing cards, what is the probability it is not a heart?

5 Ways of Counting

Question: How many different ways can you arrange the members of this session in a line?

5.1 Factorial

Factorial:

\[n! = n(n-1)(n-2) ... 3 \cdot 2 \cdot 1\]

## [1] 120
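The output above is consistent with $5! = 120$. The code that produced it is not shown here, so the following is only a sketch, assuming R's built-in `factorial` function was used.

```r
# 5! = 5 * 4 * 3 * 2 * 1
five_factorial <- factorial(5)
five_factorial

# The same product written out explicitly
prod(1:5)

# Number of ways to line up the 13 members of this session
factorial(13)
```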

5.2 Permutations

Example: How many different ways can I pick 4 students from this session and put them in a row?

Permutations: The number of ways $r$ items may be selected from among $n$ choices (without replacement) when order matters is:

\[_n P_r = n(n-1)(n-2) ... (n-r+1)\] \[ = \frac{n(n-1)(n-2) ... (n-r+1)}{1} \cdot \frac{(n-r)(n-r-1) ... 3 \cdot 2 \cdot 1}{(n-r)(n-r-1) ... 3 \cdot 2 \cdot 1} \] \[ = \frac{n!}{(n-r)!}\]

NOTE: Many calculators have the function ‘nPr’ for the number of permutations. Thus, an easier way to calculate $\frac{n!}{(n-r)!}$ is simply: $nPr(n,r)$.

Question: How many ways can a three-person executive committee (president, vice-president, secretary) be selected from a 16-member board of directors of a non-profit organization?

Question: Eight sprinters have made it to the Olympic finals in the 100-meter race. In how many different ways can the gold, silver, and bronze medals be awarded?
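Base R has no built-in permutation function, so one option is to define it from the formula $\frac{n!}{(n-r)!}$. The helper name `nPr` below is hypothetical, chosen to mirror the calculator key.

```r
# Hypothetical helper: number of ordered selections of r items from n
nPr <- function(n, r) factorial(n) / factorial(n - r)

# President, vice-president, and secretary from a 16-member board
committee_ways <- nPr(16, 3)
committee_ways

# Gold, silver, and bronze among 8 sprinters
medal_ways <- nPr(8, 3)
medal_ways
```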

5.3 Combinations

Example - How many ways can we choose 3 members from this session to be a group?

Combinations: The number of ways $r$ items may be selected from among $n$ choices (without replacement) when order does NOT matter is: \[_n C_r = \frac{_n P_r}{r!} = \frac{n!}{r!(n-r)!}\] This is also denoted $\binom{n}{r}$ and referred to as “n choose r.”

So we could be fancy and calculate 16 choose 4 like this:

## [1] 1820

Or even this!

## [1] 1820

NOTE: Many calculators have the function ‘nCr’ for the number of combinations. Thus, an easier way to calculate $\frac{n!}{(n-r)!r!}$ is simply: $nCr(n,r)$.
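The two outputs of 1820 above both equal 16 choose 4. The original code is not shown, so this is only a sketch of two ways to get that value in R: the factorial formula and the built-in `choose` function.

```r
# n! / (r! (n - r)!) with n = 16 and r = 4
by_formula <- factorial(16) / (factorial(4) * factorial(12))
by_formula

# Built-in "n choose r"
by_choose <- choose(16, 4)
by_choose
```

The same call works for the questions in this section, for example `choose(35, 4)` for the student council.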

Question:
A group of four students is to be chosen from a 35-member class to represent the class on the student council. How many ways can this be done?

Question: The United States Senate Appropriations Committee consists of 29 members; the Defense Subcommittee of the Appropriations Committee consists of 19 members. Disregarding party affiliation or any special seats on the Subcommittee, how many different 19-member subcommittees may be chosen from among the 29 Senators on the Appropriations Committee?

5.4 Solving the birthday problem

Suppose you are talking to a random person in this class. Consider the likelihood of you both having the same birthday.

Back to the birthday problem…

Let’s assume there are $365$ different birthdays:

What is the probability that, given two people selected, they have the same birthday?
What is the probability that, given three people selected, all share the same birthday?
Consider the likelihood of ANY two people from this session having the same birthday.

Let’s consider a simpler problem:

If event $A$ = “two out of 3 people have the same birthday”,
the complement $A^c$ = “no two people out of three have the same birthday”:

$P(\text{two people out of the 3 have the same birthday})$ \[= 1-P(\text{no two people out of 3 have the same birthday})\]

$P(\text{no two people share the same birthday})$ \[=\frac{\text{number of ways 3 people could have different birthdays}}{\text{total number of ways 3 people could have birthdays}}\] P(different birthdays)

What is the probability that, given 3 people, any two of them share the same birthday?

P(at least 2 having same birthdays)

What is the probability that, given 5 people, any two of them share the same birthday?

P(no two having same birthday)

P(any two having same birthday)

What is the probability that any two people in this session have the same birthday?
How many people would need to be in a room for you to bet $10 on two people having the same birthday?

##      1      2      3      4      5      6      7      8      9     10     11 
## 0.0000 0.0027 0.0082 0.0164 0.0271 0.0405 0.0562 0.0743 0.0946 0.1169 0.1411 
##     12     13     14     15     16     17     18     19     20     21     22 
## 0.1670 0.1944 0.2231 0.2529 0.2836 0.3150 0.3469 0.3791 0.4114 0.4437 0.4757 
##     23     24     25     26     27     28     29     30     31     32     33 
## 0.5073 0.5383 0.5687 0.5982 0.6269 0.6545 0.6810 0.7063 0.7305 0.7533 0.7750 
##     34     35     36     37     38     39     40     41     42     43     44 
## 0.7953 0.8144 0.8322 0.8487 0.8641 0.8782 0.8912 0.9032 0.9140 0.9239 0.9329 
##     45     46     47     48     49     50 
## 0.9410 0.9483 0.9548 0.9606 0.9658 0.9704
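A table like the one above can be produced with the complement rule: the probability that at least two of $n$ people share a birthday is one minus the probability that all $n$ birthdays differ. This is a minimal sketch, assuming 365 equally likely birthdays; the helper name `p_shared` is hypothetical.

```r
# Hypothetical helper: P(at least two of n people share a birthday),
# assuming 365 equally likely birthdays
p_shared <- function(n) {
  # P(all different) = (365/365) * (364/365) * ... * ((365 - n + 1)/365)
  1 - prod((365 - seq_len(n) + 1) / 365)
}

probs <- sapply(1:50, p_shared)
names(probs) <- 1:50
round(probs, 4)
```

The first group size for which the probability passes 0.5 can then be read off with `which(probs > 0.5)[1]`.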

6 Probabilities for Multiple Events

6.1 Independent Events

Two events are independent if the outcome of one does not affect the probability of the other. If events A and B are independent, then the probability of both $A$ and $B$ occurring is \[P(A \text{ and } B) = P(A) \cdot P(B)\] where $P(A \text{ and } B)$ is the probability of events $A$ and $B$ both occurring, $P(A)$ is the probability of event A occurring, and $P(B)$ is the probability of event $B$ occurring.

A brewery utilizes two bottling machines, but they do not operate simultaneously. The second machine acts as a backup system to the first machine and operates only when the first breaks down during operating hours.

The probability that the first machine breaks down during operating hours is 0.20.
If the first machine breaks down, then the second machine is turned on and has a probability of 0.30 of breaking down.

What is the probability that the brewery’s bottling system is not working during operating hours?
The reliability of the bottling process is the probability that the system is working during operating hours. Find the reliability of the bottling process at the brewery.

If A is the event that the first machine is broken and B is the event that the second machine is broken, the probability both are broken is: \[P(A \text{ and } B) = P(A) \cdot P(B)\] (If we assume A and B are independent events.)

The probability that the bottling system is working (the reliability) is the complement of this answer.
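The brewery calculation can be sketched in R as follows, using the multiplication rule and then the complement. The variable names are illustrative only.

```r
p_first_breaks  <- 0.20   # first machine breaks down during operating hours
p_second_breaks <- 0.30   # backup machine breaks down once it is turned on

# The system is down only if both machines are broken
p_system_down <- p_first_breaks * p_second_breaks
p_system_down

# Reliability is the complement: the system is working
reliability <- 1 - p_system_down
reliability
```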

6.2 Multiple Events Occurring Simultaneously

Suppose we flipped a coin and rolled a die, and wanted to know the probability of getting a head on the coin or a 6 on the die.

List all the possible outcomes:

How many outcomes have heads?

How many outcomes have a 6?

How many outcomes have heads OR a 6?

How many outcomes have heads AND a 6?

Let $A$ represent the outcome has a heads.
Let $B$ represent the outcome has a 6.

Questions: Find $P(A \text{ or } B)$.

The probability that at least one of two events occurs is \[P(A \text{ or } B)=P(A)+P(B)-P(A \text{ and } B)\]
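The coin-and-die example can be checked by listing all 12 outcomes. This is a sketch, assuming a fair coin and a fair die so that every outcome is equally likely.

```r
# Every combination of coin face and die value (12 outcomes)
outcomes <- expand.grid(coin = c("H", "T"), die = 1:6)

A <- outcomes$coin == "H"   # the outcome has a heads
B <- outcomes$die == 6      # the outcome has a 6

# Direct count of outcomes with heads OR a 6
p_direct <- mean(A | B)
p_direct

# Addition rule: P(A) + P(B) - P(A and B)
p_rule <- mean(A) + mean(B) - mean(A & B)
p_rule
```

Subtracting `mean(A & B)` removes the one outcome (heads and a 6) that would otherwise be counted twice.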

Question: Suppose you draw one card from a standard card deck. What is the probability of getting a club or a face card?

Quiz Question: Suppose you draw one card from a standard card deck. What is the probability of getting a king or an ace?

6.3 Mutually Exclusive Events

Two events are mutually exclusive if they cannot happen at the same time, so $P(A \text{ and } B) = 0$. If A and B are mutually exclusive, then \[P(A \text{ or } B) = P(A) + P(B)\]

Question: If you roll a 6-sided die, what is the probability of rolling an even number?

Question: Suppose we draw one card from a standard card deck. What is the probability that we get a face card or a spade?

6.4 Guessing on a Quiz

A multiple-choice quiz contains 5 questions, each with four possible answers (A, B, C, D). Compute the probability of randomly guessing the answers and getting a 100% on the quiz (all five questions correct).

Question 1:
What is the total number of possible ways you could respond to this test?

Question 2:
What is the probability of getting all correct?

Question 3:
What is the probability of guessing and getting 0 questions correct (all incorrect)?

Question 4.1:
What is the probability of guessing and getting 1 question correct?

Question 4.2:
What is the probability of guessing and getting 2 questions correct?

Question 4.3:
What is the probability of guessing and getting 3 questions correct?

Question 4.4:
What is the probability of guessing and getting 4 questions correct?

Question 5:
On the quiz, what is the probability of getting a “D” or higher (at least 3 out of 5 answers correct)?

P(getting 3, 4, or 5 correct) = P(3 correct) + P(4 correct) + P(5 correct)

## P(0 correct) P(1 correct) P(2 correct) P(3 correct) P(4 correct) P(5 correct) 
##       0.2373       0.3955       0.2637       0.0879       0.0146       0.0010

## [1] 1
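The probabilities above match a binomial model with 5 independent guesses and a 1/4 chance of guessing each one correctly. The original code is not shown, so the following is only a sketch of one way to reproduce them with `dbinom`.

```r
# P(k correct) for k = 0, ..., 5 when each guess is right with probability 1/4
p <- dbinom(0:5, size = 5, prob = 0.25)
names(p) <- paste0("P(", 0:5, " correct)")
round(p, 4)

# The six outcomes cover every possibility, so they sum to 1
total_prob <- sum(p)
total_prob

# P(at least 3 correct) = P(3 correct) + P(4 correct) + P(5 correct)
p_at_least_3 <- sum(p[4:6])
p_at_least_3
```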

7 Expected Value

Expected value provides a way of evaluating the value of a decision with multiple outcomes.

Expected Value is defined as the average gain or loss of an event if the procedure is repeated many times. We can compute the expected value by multiplying each outcome by the probability of that outcome, then adding up the products.

For mutually exclusive events, A and B, the expected value is:

$P(A)\cdot V(A) + P(B)\cdot V(B)$,

where $V(A)$ and $V(B)$ represent the value of $A$ and $B$ respectively, with a gain represented by a positive value and a loss as a negative value.

For $n$ disjoint events $A_1, A_2, ... A_n$ for which $P(A_1) + P(A_2) + ... + P(A_n)=1$, the expected value is:

$P(A_1)\cdot V(A_1) + P(A_2) \cdot V(A_2) + ... + P(A_n)\cdot V(A_n)$.

Example: You purchase a raffle ticket to help out a charity. The raffle ticket costs $5. The charity is selling 2000 tickets. One of them will be drawn and the person holding the ticket will be given a prize worth $4000. Compute the expected value for this raffle.

If your ticket is drawn, you net $4000-$5 = $3995. The probability of this is 1/2000.
If your ticket is not drawn, you net -$5. The probability of this is 1999/2000.

So the expected value is:

\[\frac{1}{2000}\cdot 3995 + \frac{1999}{2000}\cdot(-5) = \frac{3995 - 9995}{2000} = -3\]

On average, each person is giving about $3.00 to charity.

Example: A friend offers to play a game, in which you roll 3 standard 6-sided dice. If all the dice roll different values, you give him $1. If any two dice match values, you get $2. What is the expected value of this game? Would you play?

Earthquake Insurance:
An insurance company estimates the probability of an earthquake in the next year to be 0.0013. The average damage done by an earthquake is estimated to be $60,000. If the company offers earthquake insurance for $100, what is the expected value of the policy?

## [1] 22
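The value 22 above is consistent with computing the expected value from the insurance company's point of view: the premium collected minus the expected payout. That reading is an assumption, since the original code is not shown.

```r
p_quake <- 0.0013   # probability of an earthquake in the next year
damage  <- 60000    # average damage paid out if an earthquake occurs
premium <- 100      # price of the policy

# Expected value per policy for the company: premium minus expected payout
ev_company <- premium - p_quake * damage
ev_company
```

From the policyholder's side the sign flips, giving an expected value of about -22 dollars per policy.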

Is it worth it?:
Andy is always looking for ways to make money fast. Lately, he has been trying to make money by gambling. Here is the game he is considering playing: The game costs $2 to play. He draws a card from a deck. If he gets a number card (2-10), he wins nothing. For any face card (jack, queen, or king), he wins $3. For any ace, he wins $5, and he wins an extra $20 if he draws the ace of clubs.

a. Find Andy's expected profit per game.
b. Would you play Andy's game?

Baggage Fees:
An airline charges the following baggage fees: $20 for the first bag and $30 for the second. Suppose 55% of passengers have no checked luggage, 32% have only one piece of checked luggage and 13% have two pieces. We suppose a negligible portion of people checks more than two bags.

a. What is the average baggage-related revenue per passenger?
b. About how much revenue should the airline expect for a flight of 150 passengers?

8 Conditional probability

The probability that event B occurs, given that event A has happened is represented by $P(B|A)$, read “the probability of B given A.”

Conditional probabilities can be used to find the probability of joint events, even when they are not independent:

\[P(A \text{ and } B) = P(A|B) \cdot P(B)\]

Consider the following events:

A = having power in the morning
B = the event of having a class that day

##          Class No Class Sum
## Power       90        6  96
## No Power     0        4   4
## Sum         90       10 100

What is the probability of having a class?

Question:
What is the probability of not having class?

Question:
What is the probability of having class given that there was no power when you wake up?

Question:
What is the probability of having class given that there was power when you wake up?

Quiz Question: Given that there was class, what is the probability that there was power?

Quiz Question:
Given that there was no class, what is the probability there was no power?
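Conditional probabilities can be read directly from the table by restricting attention to the row or column for the given event. This is a sketch, with the counts typed in by hand from the table above and illustrative variable names.

```r
# Counts from the table: rows are power status, columns are class status
counts <- matrix(c(90, 6,
                    0, 4),
                 nrow = 2, byrow = TRUE,
                 dimnames = list(c("Power", "No Power"), c("Class", "No Class")))

# P(class): column total divided by the grand total
p_class <- sum(counts[, "Class"]) / sum(counts)
p_class

# P(class | power): restrict to the "Power" row
p_class_given_power <- counts["Power", "Class"] / sum(counts["Power", ])
p_class_given_power

# P(power | class): restrict to the "Class" column
p_power_given_class <- counts["Power", "Class"] / sum(counts[, "Class"])
p_power_given_class
```

Note that `p_class_given_power` and `p_power_given_class` differ, which is the reason an "inversion" rule is needed.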

9 Bayes’ Rule

By the formula for calculating joint probabilities (which holds even for events that are not independent) both \[P(A \text{ and } B) = P(A|B) \cdot P(B)\] and \[P(B \text{ and } A) = P(B|A) \cdot P(A)\] By setting these equal we get a way to “invert” conditional probabilities: \[P(A|B) \cdot P(B)=P(B|A) \cdot P(A)\] OR

\[P(A|B) = \frac{P(B|A) \cdot P(A)}{P(B)}\]

If we only know $P(A)$ and $P(B|A)$, we can find $P(B)$ because:

if A occurs, the probability of B also occurring is $P(B|A)$, so the probability of both is $P(B|A) \cdot P(A)$.
if A does not occur, the probability of B occurring is $P(B|A^c)$, so the probability of both is $P(B|A^c) \cdot P(A^c)$.

This accounts for all the ways $B$ can occur, so \[P(B) = P(B|A) \cdot P(A) + P(B|A^c) \cdot P(A^c)\]

Now, plugging in $P(B)$ gives us Bayes’ Rule!

Bayes’ Rule: Given two events $A$ and $B$,

\[ P(A|B) = \frac{P(B|A) \cdot P(A)} { P(B|A) \cdot P(A) + P(B|A^c) \cdot P(A^c)} \]

9.1 Disease #1

Example 9 - A new test has been devised to detect a new disease:

The disease has an incidence rate of 0.1%: it afflicts 0.1% of the population;
The test produces no false negatives: everyone who has the disease will test positive; but,
The false positive rate is 5%: of those who do not have the disease, 5% will test positive.

Should doctors use this test?

Well, suppose you test negative for the disease. Great, that means you don’t have it! But if you test positive, what is the probability that you actually have the disease?

Let’s first label the following events:

A = having the disease
B = testing positive

So we want to know $P($having the disease|testing positive$) =P(A|B)$.

What information do we know?

-$P($having the disease$)=P(A)=$

-$P($not having the disease$)=P(A^c)=$

-$P($testing negative|have the disease$)=P(B^c|A)=$

-$P($testing positive|have the disease$)=P(B|A)=$

-$P($testing positive|not having the disease$)=P(B|A^c)=$

-$P($testing negative|not having the disease$)=P(B^c|A^c)=$

Using Bayes’ Rule:

\[ P(A|B) = \frac{P(B|A) \cdot P(A)} { P(B|A) \cdot P(A) + P(B|A^c) \cdot P(A^c)} \]

## [1] 0.01962709

This shows that only about 2% of the people who test positive for this disease using this test will actually have it!
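The result above can be reproduced step by step in R. This is a sketch using the three rates stated in the example; the variable names are illustrative only.

```r
p_disease           <- 0.001   # P(A): incidence rate of 0.1%
p_pos_given_disease <- 1       # P(B|A): no false negatives
p_pos_given_healthy <- 0.05    # P(B|A^c): false positive rate of 5%

# P(B): total probability of testing positive
p_pos <- p_pos_given_disease * p_disease +
  p_pos_given_healthy * (1 - p_disease)

# Bayes' Rule: P(A|B)
p_disease_given_pos <- p_pos_given_disease * p_disease / p_pos
p_disease_given_pos
```

The denominator is dominated by the false positives from the 99.9% of people without the disease, which is why the result is so small.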

9.2 Disease #2

In class:

A new test has been devised to detect a new disease:

The incidence rate is 2% (it afflicts 2% of the population);
The test produces false negatives at a rate of 0.5% (of those who have the disease, 0.5% will test negative); and
The false positive rate is 1% (of those who do not have the disease, 1% will test positive).

Again, label the following events:

A = having the disease
B = testing positive

Question: If you test positive, what is the probability of actually having the disease?

Question: If you test negative, what is the probability of not having the disease?

Bayes’ rule can be used for this if we replace $A$ with $A^c$, replace $B$ with $B^c$, and realize that the complement of $A^c$ is just $A$.

\[ P(A^c|B^c) = \frac{P(B^c|A^c) \cdot P(A^c)} { P(B^c|A^c) \cdot P(A^c) + P(B^c|A) \cdot P(A)} \]
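Because both questions use the same formula with different inputs, it can help to wrap Bayes' Rule in a small function. The function `bayes` and its argument names below are hypothetical, introduced only for this sketch.

```r
# Hypothetical helper for Bayes' Rule with an event and its complement.
#   prior          = P(event)
#   lik_event      = P(evidence | event)
#   lik_complement = P(evidence | complement of event)
bayes <- function(prior, lik_event, lik_complement) {
  lik_event * prior / (lik_event * prior + lik_complement * (1 - prior))
}

# P(A|B): have the disease given a positive test
# P(A) = 0.02, P(B|A) = 1 - 0.005, P(B|A^c) = 0.01
p_disease_given_pos <- bayes(0.02, 0.995, 0.01)
p_disease_given_pos

# P(A^c|B^c): no disease given a negative test
# P(A^c) = 0.98, P(B^c|A^c) = 1 - 0.01, P(B^c|A) = 0.005
p_healthy_given_neg <- bayes(0.98, 0.99, 0.005)
p_healthy_given_neg
```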

10 Discrete Distributions

10.1 Bernoulli Distribution

A Bernoulli random variable is an individual trial that has two possible outcomes:

1 = success
0 = failure

Tossing a fair coin is a Bernoulli random variable. If heads = “success”, and tails = “failure”, we can plot the Bernoulli distribution of this event as follows:

We can also consider two-outcome events whereby the probability is not 50/50. Suppose an insurance company found that 30% of the people they insure will meet their deductible in a given year.

Considering a success to be the event where a person meets their deductible and a failure where a person does not meet their deductible, plot this distribution.

1. If the insurance company is selecting people in sequence for a survey, what is the probability of:

a. The first person meeting their deductible.
b. Not selecting a person who meets their deductible until the 2nd person.
c. Not selecting a person who meets their deductible until the 3rd person.
d. Not selecting a person who meets their deductible until the 4th person.

10.2 Geometric Distribution

The geometric distribution describes the probability that the first success occurs on the x-th Bernoulli trial.

Based on the pattern above, write a formula for the probability of not selecting someone who meets their deductible until the n-th person:

\[P(n) =\]

A machine produces defective parts at a rate of 2%. You are selecting parts off the assembly line. What is the probability of not finding a defective part until the 10th selection?

P(10) = the probability that the 10th part you select is the first defective part.
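For the machine example, the first defective part on the 10th selection means 9 good parts followed by 1 defective part. A minimal sketch in R, assuming selections are independent: note that R's `dgeom` counts the number of failures before the first success, so its first argument is 9 rather than 10.

```r
p_defect <- 0.02   # a "success" here means finding a defective part

# Nine non-defective parts in a row, then one defective part
by_hand <- (1 - p_defect)^9 * p_defect
by_hand

# Built-in geometric density: 9 failures before the first success
by_dgeom <- dgeom(9, prob = p_defect)
by_dgeom
```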

Guessing on a quiz - take 2: So, you are guessing again on your quizzes! This time there are four multiple-choice questions, each with four possible answers (A, B, C, or D). Find the probability that:

a. you get the first one right.
b. the first question you get right is the 2nd one.
c. the first question you get right is the 3rd one.
d. the first question you get right is the 4th one.
e. you get no questions right.

Add up your answers to (a)-(e) above. What do you get? Does this surprise you?

10.3 Binomial Distribution

Again consider the probability of someone meeting their deductible to be 0.3.

Suppose a company has 8 people.
1. What is the probability that exactly 1 meets their deductible?
2. What is the probability that exactly 2 meet their deductible?
3. What is the probability that exactly 3 meet their deductible?

The binomial distribution describes the probability of getting a certain number of successes in a fixed number of Bernoulli trials.

If $p$ is the probability of success, the probability of observing exactly $k$ successes in $n$ independent trials is given by:

\[\frac{n!}{k!(n-k)!}p^k(1-p)^{n-k}\]

This is a mess to calculate, so we can have R calculate it as follows:

“dbinom(x=k, size=n, prob = p)”

Suppose we sampled 8 people and wanted to know the probability that exactly 3 of them met their deductible.

If we call a “success” meeting one’s deductible, we have k=3 successes out of n=8 with p=0.3.

## [1] 0.2541218

This is the same as 5 people out of 8 NOT meeting their deductible, so if we call a “success” not meeting one’s deductible, we have k=5 successes out of n=8 with p=0.7.

## [1] 0.2541218
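Both outputs above can be reproduced with `dbinom`, and checked against the formula written out by hand. This is a sketch; the variable names are illustrative only.

```r
# Exactly 3 of 8 people meet their deductible (success probability 0.3)
p_meet <- dbinom(x = 3, size = 8, prob = 0.3)
p_meet

# Equivalently, exactly 5 of 8 do NOT meet it (success probability 0.7)
p_not_meet <- dbinom(x = 5, size = 8, prob = 0.7)
p_not_meet

# The binomial formula: n!/(k!(n-k)!) * p^k * (1-p)^(n-k)
p_formula <- choose(8, 3) * 0.3^3 * 0.7^5
p_formula
```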

Below is the distribution for how many people out of 8 meet their deductible:

For a variable to follow a binomial distribution, four conditions must be met.

The number of trials, n, is fixed.
Each trial outcome is a success or failure.
The probability of success, p, is the same for each trial.
Trials must be independent.

Suppose you have four friends, all of whom smoke. Are the conditions for modeling this as a binomial distribution satisfied?
If we assume your friends getting lung cancer satisfies the conditions for a binomial distribution:
1. What is the probability that exactly one friend will develop lung cancer?
2. What is the probability that no one will develop lung cancer?
3. What is the probability that one or more will develop lung cancer?
Suppose the probability of a part being defective is 0.02. Find the probability that out of 10 parts, one or more are defective.
Suppose 92% of people like ice cream.
1. If I invite 5 people over for dinner, what is the probability that exactly 2 don’t like ice cream?
2. If I invite 5 people over for dinner, what is the probability that 2 or more people don’t like ice cream?

10.4 Multinomial Distribution

Look up the multinomial distribution: https://en.wikipedia.org/wiki/Multinomial_distribution

In what situations would you use it?
Can you derive the formula for the multinomial distribution?

10.5 Hypergeometric Distribution

Look up the hypergeometric distribution: https://en.wikipedia.org/wiki/Hypergeometric_distribution

In what situations would you use it?
Can you derive the formula for the hypergeometric distribution?

11 Project Problems

In your group, select a problem you want to work on. Work together toward finding a solution to as many of the provided questions as possible, as well as any related questions you find interesting. Then create a 5-10 minute presentation over:

the problem you worked on
a solution and the tools and reasoning you used to arrive at a solution
the significance of the result and how it can contribute toward better decision making

Make sure you edit your slides to ensure they are readable, with no grammar or spelling errors.

11.1 Shared birthday

Suppose two people meet. What is the probability that they share a birthday?
Suppose 3 people are in a room. What is the probability that there is at least one shared birthday among these 3 people?
Suppose 10 people are together. What is the probability that there is at least one shared birthday among these 10 people?
Suppose $n$ people are in a room. What is the probability that there is at least one shared birthday among these $n$ people?
How many people would need to be in a room until you would bet $100 on the event that at least two people share a birthday? (Use an Excel spreadsheet to find the probabilities and the expected value of the wager for $n=1,2,3,...,50$. The permutation function in Excel is “=PERMUT(n,k)”.)
Now repeat steps 1-5 for finding the probability of sharing the same birth month.

11.2 Poker Odds

Compute the probability of randomly drawing five cards from a deck and getting:
1. a pair
2. three of a kind
3. straight (not a flush)
4. flush (not a straight)
5. four of a kind
6. a full house (three of a kind and a pair)
7. a straight flush (not a royal flush)
8. a royal flush

After your group is convinced of its answers, try checking them: https://en.wikipedia.org/wiki/Poker_probability

Suppose you have a pair.
1. What is the probability that someone has a higher hand than you?
2. What is the probability that someone also has a pair?
If you have not played Texas Hold’em poker before, review the rules: https://en.wikipedia.org/wiki/Texas_hold_%27em Then use conditional probabilities to answer the following.
1. Suppose you know 2 out of 7 cards are a pair. What is the probability that someone has a 3 of a kind?
2. Suppose you know 2 out of 7 cards are a pair. What is the probability that someone has a full house?
3. Suppose you know 3 out of 7 cards are the same. What is the probability that someone has a 4 of a kind?
4. Suppose you know 3 out of 7 cards are the same. What is the probability that someone has a full house?

11.3 Playing the lottery

In a certain state’s lottery, $64$ balls numbered 1 through $64$ are placed in a machine and six of them are drawn at random. If the six numbers drawn match the numbers that a player had chosen, the player wins a jackpot of $1,000,000. If the numbers drawn match any 5 of the numbers that a player had chosen, the player wins $1,000. It costs $1 to buy a ticket.

Find the expected value of playing this game.
Now suppose that over time the jackpot will increase if no one wins. How large would the jackpot have to be for the expected value of playing the lottery to be positive? (Would you buy a ticket in this case?)

P(jackpot) * V(jackpot) + P(win 1000) * V(win 1000) + P(lose) * V(lose)

Now suppose both the jackpot and the smaller 5-matching ball prize will increase proportionally over time if no one wins. How large would the jackpot have to be for the expected value of playing the lottery to be positive? (Would you buy a ticket in this case?)
Read the following story about a lottery system like this - http://archive.boston.com/news/local/massachusetts/articles/2011/07/31/a_lottery_game_with_a_windfall_for_a_knowing_few/?page=full. Summarize what happened.
Choose payouts from case 3 with a positive expected value. Suppose you buy 10 tickets with these potential payouts. Use a spreadsheet to calculate the following.
1. Calculate the probability of winning the jackpot at least once when buying 10 tickets
2. Calculate the probability of winning the smaller payout at least once when buying 10 tickets.
3. In this case, would you buy 10 tickets?
Use your work on the previous problem to calculate the probability of winning something when buying $n = 10, 100, 1000$ tickets. What is the smallest likely amount of money you would expect to gain in each of these cases?

11.4 Group assignment

If we were put into random groups again, what is the probability that YOU are placed in the exact same group?

How many different ways are there to place $n=12$ students into 3 groups? (Hint: consider an easier problem.)
1. How many different ways are there to arrange $3$ given students in a row from left to right?
2. How many different ways are there to arrange $3$ of the $12$ students in a row from left to right?
3. How many different ways are there to choose $3$ of $12$ students (order does not matter)?
4. How many different ways are there to place $6$ of the $12$ students into two uniquely labeled groups (GROUP1, GROUP2) each with $3$ students?
5. How many different ways are there to place $12$ students into 4 uniquely labeled groups: GROUP1 (3 students), GROUP2 (3 students), GROUP3 (3 students), etc.?
If we were to randomly assign groups again, what is the probability EVERYONE is placed in the exact same groups again?
If you are put in the same group, how many ways are there to assign everybody else? Use this to find the probability that, if we are randomly assigned into new groups, YOU are placed in the exact same group.
Suppose we randomly assigned groups again and you are placed in the group with exactly one person from your previous group. Do you think this is very likely? Calculate the probability of this happening.

11.5 Racial Bias

In this problem, we will explore probabilities from a series of events.

If you flip 10 coins, how many would you expect to come up “heads” on average? Given your answer, would you expect every flip of 10 coins to come up with exactly that many heads?
If you were to flip 10 coins, what results would you consider “usual”? What results would you consider “unusual”?
When flipping 10 coins, what is the theoretical probability of flipping 10 heads?

Below is a simulation of flipping a coin 10 times, repeated 100,000 times. The table below shows the number of trials in which 0, 1, 2, …, 10 heads appeared. Note the sum of all these values is 100,000.

Number of trials (out of 100,000) resulting in each number of heads
Number of heads	Number of trials
0	115
1	984
2	4352
3	11653
4	20459
5	24360
6	20774
7	11908
8	4310
9	978
10	107

Based on this simulation, what appears to be the probability of flipping 0 heads, 1 head, … up to 10 heads?
If you were to flip 10 coins, based on the simulated data, what range of values would you consider “usual” results? What would you consider “unusual” results?

The formula \[_n C_k p^k (1-p)^{n-k}\] will compute the probability of an event with probability $p$ occurring $k$ times out of $n$, such as flipping $k=5$ heads out of $n=10$ coins where the probability of heads is $p=0.5$.

\[_n C_k = \frac{_n P_k}{k!} = \frac{n!}{k!(n-k)!}\] You can use the Excel formula “=COMBIN(n,k)” to calculate $_nC_k$.
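For readers working in R rather than Excel, the formula can be sketched as follows for the example of $k=5$ heads out of $n=10$ fair coins. Here `choose` plays the role of Excel's COMBIN, and the last line uses the simulated count for 5 heads from the table.

```r
n <- 10    # number of coins
k <- 5     # number of heads
p <- 0.5   # probability of heads on each coin

# nCk * p^k * (1 - p)^(n - k)
p_formula <- choose(n, k) * p^k * (1 - p)^(n - k)
p_formula

# Built-in binomial density gives the same value
p_dbinom <- dbinom(k, size = n, prob = p)
p_dbinom

# Simulated proportion of trials with 5 heads, from the table
p_simulated <- 24360 / 100000
p_simulated
```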

Use this to compute the theoretical probability of flipping 5 “heads” out of 10 coins. Compare your answer to the probability found in 4.
Use this to compute the theoretical probability of flipping fewer than 2 “heads” out of 10 coins. Compare your answer to the probability found in 4.

Use this formula to consider a case from 1960. In the area, about 26% of the jury-eligible population was black. In the court case, there were 25 people on the juror panel, of which 2 were black.

If black people were selected for the panel with a probability of 26%, calculate the probability of there being 2 or fewer black people on the juror panel.
Does this provide evidence of racial bias in jury selection?

11.6 Bayes’ Rule

Practice using Bayes’ Rule to answer the following problems.

11.6.1 Intruder Detection

An unmanned monitoring system uses high-tech video equipment and microprocessors to detect intruders. A prototype system has been developed and is in use outdoors at a weapons munitions plant. The system is designed to detect intruders with a probability of .90. However, the design engineers expect this probability to vary with the weather condition. The system automatically records the weather condition each time an intruder is detected. Based on a series of controlled tests, in which an intruder was released at the plant under various weather conditions, the following information is available:

Given the intruder was, in fact, detected by the system,

the weather was clear 75% of the time,
cloudy 20% of the time, and
raining 5% of the time.

When the system failed to detect the intruder,

60% of the days were clear,
30% cloudy, and
10% rainy.

Use this information to find the probability of detecting an intruder, given clear, cloudy, and rainy weather conditions (in the case that an intruder has been released at the plant). When is this system the most reliable? When is it the least reliable?

11.6.2 Correct Diagnosis?

Suppose a certain type of cancer has an incidence rate of 0.5% (that is, it afflicts 0.5% of the population). A new test has been devised to detect this cancer, which is very cheap and easy to administer in comparison to existing tests. The test produces false negatives at a rate of 1.4% (that is, 1.4% of those who have the disease will test negative), and the false positive rate is 1% (that is, about 1% of people who do not have the disease will test positive). How accurate is this test?

Suppose a randomly selected person takes the test and tests positive. What is the probability that this person actually has the disease?
Suppose a randomly selected person takes the test and tests negative. What is the probability that this person does not have the disease?

Based on this, what recommendations would you make to doctors using this test?

BCSSI 2024 - Probability Session Notes

Paul Regier