Do the following (all free):
R is a free software environment for statistical computing and graphics. It compiles and runs on a wide variety of UNIX platforms, Windows, and macOS. RStudio is the desktop environment we will use to run statistical computations.
Below, example R code is shown in a grey box. The output of code is shown below in a white box.
2+2
## [1] 4
Clicker Question: Would you bet $10 on this?
No I wouldn’t.
Clicker Question: Would you bet $10 on this?
What is the difference between counting and finding probability? (Discuss)
Counting: How many possible ways can something occur?
When measuring discrete (whole number) outcomes, the probability of a desired outcome \(A\) is written:
\[P(A) = \frac{ \text{number of outcomes in which A occurs}}{\text{total number of possible outcomes}}\]
states = c("NY","CA","NJ","Beheira","NJ","GA","WA","NY","IN","CA","CA","AR","CA")
tab = table(states)
addmargins(tab)
## states
##      AR Beheira      CA      GA      IN      NJ      NY      WA     Sum
##       1       1       4       1       1       2       2       1      13
If I pick someone at random, what is the chance that they are from CA?
4 people from CA out of 13 people
4/13
## [1] 0.3076923
barplot(tab/13, las=2,
ylab = "Probability",
xlab = "State",
main = "Barplot of State of Residence")
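The division by 13 can also be left to R. The following is a minimal sketch, assuming the same `states` vector as above: `prop.table()` converts a table of counts into proportions, so the total does not need to be typed by hand.

```r
# Same data as in the notes above
states = c("NY","CA","NJ","Beheira","NJ","GA","WA","NY","IN","CA","CA","AR","CA")
tab = table(states)

# Proportion of people from each state (each count divided by the total)
probs = prop.table(tab)

# Probability that a randomly picked person is from CA
probs["CA"]
```

Because the proportions are computed from the counts, they always add up to 1, whatever the size of the group.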
Clicker Question: Make up your own problem for calculating probability and send it in the chat!
Back to birthday problem…
What is the probability that, given two people selected, they have the same birthdays?
If there are \(n=366\) different birthdays:
\[P(\text{two given people have the same birthday}) = \frac{n}{n^2} = \frac{1}{n}\]
1/366
## [1] 0.00273224
Clicker Question:
What is the probability that, given three people selected, all three have the same birthday?
ans = 366/366^3
round(ans, 10)
## [1] 7.4651e-06
Sometimes probabilities are easier to calculate if we look at their complement.
The complement of an event \(A\) is the event “\(A\) doesn’t happen.” The notation \(A^c\) is used for the complement of event \(A\). We can compute the probability of the complement using: \[P(A^c) = 1 - P(A)\]
Note: The complement of \(A^c\) is the original event \(A\), so that \[P(A) = 1 - P(A^c)\]
Clicker Question: Make up your own problem for calculating probability using complements and send it in the chat!
Probability of someone in this group selected not being from California
A = being from CA
P(A) = 4/13
P(A^c) = 1 - P(A)
1-4/13
## [1] 0.6923077
Back to birthday problem…
Let event A be “at least two out of 5 people have the same birthday.” The complement of A, \(A^c\), is “no two people have the same birthday.”
P(any two people out of the 5 have the same birthday) = 1-P(no two people have the same birthday).
\[P(\text{no two people have the same birthday})\] \[= \frac{\text{Ways 5 people could have different birthdays}}{\text{Ways 5 people could have birthdays}}\] Ways 5 people could have different birthdays = \(366*365*364*363*362\). Ways 5 people could have birthdays = \(366^5\)
366*365*364*363*362/366^5
## [1] 0.9729379
P(any 2 people having same bday) = 1-P(no two people having same bday)
1 - 366*365*364*363*362/366^5
## [1] 0.02706214
Clicker Question:
What is the probability that any two people in this session have the same birthday?
n = 13
n = 366
1-n*(n-1)*(n-2)*(n-3)*(n-4)*(n-5)*(n-6)*(n-7)*(n-8)*(n-9)*(n-10)*(n-11)*(n-12)*(n-13)/n^14
## [1] 0.2225597
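Typing out every factor gets tedious as the group grows. The sketch below wraps the same complement calculation in a small function; the name `p_shared_birthday` and its arguments are hypothetical and not part of the notes, and it assumes all `n` birthdays are equally likely.

```r
# Hypothetical helper: probability that at least two of k people
# share a birthday, assuming n equally likely birthdays.
p_shared_birthday = function(k, n = 366) {
  # P(no match) = (n/n) * ((n-1)/n) * ... * ((n-k+1)/n)
  p_no_match = prod((n - seq_len(k) + 1) / n)
  1 - p_no_match
}

p_shared_birthday(5)   # 5 people, as in the worked example
p_shared_birthday(14)  # 14 people, one for each factor in the long product above
```

These two calls should reproduce the values 0.02706214 and 0.2225597 computed by hand above.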
To come up with an anonymous code for each student in this session, suppose I give the following algorithm.
1. Write down the first two letters of your favorite color.
2. Write down how many siblings you have. (as two digits)
For example, if your favorite color is blue and you have 3 siblings, your code would be “BL03.”
Counting:
How many different codes are possible?
How many different codes are likely?
Probability:
What do you think is the chance that two people selected at random have the same code? Would you bet $10 on two codes being the same?
What do you think is the chance that two people in your group have the same code? Would you bet $10 on two codes being the same?
What do you think is the chance that two people in this class have the same code? Would you bet $10 on two codes being the same?
P(no 2 people having the same code)
Number of ways people could have different codes (permutation or combination)?
1-factorial(35)/factorial(35-14)/35^14
## [1] 0.951142
1-35*34/35^2
## [1] 0.02857143
1/35
## [1] 0.02857143
Example - How many different ways are there to line up 5 students?
5*4*3*2*1
## [1] 120
Factorial: \(n! = n(n-1)(n-2) ... 3 \cdot 2 \cdot 1\)
n = 5
factorial(n)
## [1] 120
Clicker Question:
How many different ways can you arrange red, blue, green, and yellow balls in a line from left to right?
Red, Blue, Green, Yellow Picking 3 balls out of 4
RBG RGB
BRG BGR
GRB GBR
4*3*2*1
## [1] 24
Clicker Question: How many different ways can I pick 5 students from a class of 30 and put them in a row?
30*29*28*27*26
## [1] 17100720
factorial(30)/factorial(25)
## [1] 17100720
Permutations: The number of ways \(r\) items may be selected from among \(n\) choices (without replacement) when order matters is:
\[_n P_r = n(n-1)(n-2) ... (n-r+2)(n-r+1)\] \[ = n(n-1)(n-2) ... (n-r+2)(n-r+1)\frac{(n-r)(n-r-1) ... 3 \cdot 2 \cdot 1}{(n-r)(n-r-1) ... 3 \cdot 2 \cdot 1} \] \[ = \frac{n!}{(n-r)!}\]
30*29*28*27*26
## [1] 17100720
#OR
n = 30
r = 5
factorial(n)/factorial(n-r)
## [1] 17100720
In Desmos: Use function nPr(n,r)
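Base R has no built-in counterpart to `nPr`, so here is a minimal sketch of one. The name `n_perm` is hypothetical. It multiplies the \(r\) factors \(n, n-1, \ldots, n-r+1\) directly instead of dividing two factorials, which avoids the overflow that `factorial(n)` runs into for \(n\) above 170.

```r
# Hypothetical helper: number of ordered selections of r items from n
n_perm = function(n, r) {
  # n - seq_len(r) + 1 gives the factors n, n-1, ..., n-r+1
  prod(n - seq_len(r) + 1)
}

n_perm(30, 5)  # 5 students from a class of 30, in a row
n_perm(5, 5)   # lining up all 5 students is the same as 5!
```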
Clicker Question: How many ways can a four-person executive committee (president, vice-president, secretary, treasurer) be selected from a 16-member board of directors of a non-profit organization?
16*15*14*13
## [1] 43680
factorial(16)/factorial(12)
## [1] 43680
Example - How many ways can a four-person committee be selected from 16 members where all committee members have equal positions?
There are \(_{16} P_4=43680\) ways to choose the members where the order matters. If it doesn’t, we overcounted. By how much? For any given 4-member selection, there are \(4!\) ways to order those 4 members, thus we overcounted by a factor of \(4!\)
43680/factorial(4)
## [1] 1820
Combinations: The number of ways \(r\) items may be selected from among \(n\) choices (without replacement) when order does NOT matter is: \[_n C_r = \frac{_n P_r}{r!} = \frac{n!}{r!(n-r)!}\] This is also denoted \(\binom{n}{r}\) and referred to as “n choose r.”
So we could be fancy and calculate it like this:
n = 16
r = 4
factorial(n)/( factorial(r) * factorial(n-r) )
## [1] 1820
Or even this:
choose(16,4)
## [1] 1820
In Desmos: Use function nCr(n,r)
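For small cases, the count can be checked by listing every possible group. The sketch below uses base R's `combn()`, which returns one combination per column; the four names are made up for illustration.

```r
# List every unordered pair from 4 people (one pair per column)
people = c("Ann", "Bo", "Cy", "Di")  # hypothetical names
groups = combn(people, 2)
groups

ncol(groups)   # number of pairs found by listing them
choose(4, 2)   # the same count from the formula

# The committee example: 16 members, 4 seats, order ignored
ncol(combn(16, 4))
```

Listing works only while the number of combinations stays manageable; for larger problems `choose()` gives the count without building the list.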
Clicker Question:
A group of four students is to be chosen from a 35-member class to represent the class on the student council (position/order doesn’t matter). How many ways can this be done?
factorial(35)/(factorial(31)*factorial(4))
## [1] 52360
choose(35,4)
## [1] 52360
To find how many ways you can select a 5-card hand out of a 52-card deck, is this a permutation or a combination?
You have 4 dinners you want to cook (pasta, stir fry, tacos, and hamburgers) on Mon, Tues, Wed, and Thurs this week. To find how many ways you could do this, is this a permutation or a combination?
Two events are independent if the outcome of one does not affect the probability of the other. If events A and B are independent, then the probability of both \(A\) and \(B\) occurring is \[P(A \text{ and } B) = P(A) \cdot P(B)\] where \(P(A \text{ and } B)\) is the probability of events \(A\) and \(B\) both occurring, \(P(A)\) is the probability of event A occurring, and \(P(B)\) is the probability of event \(B\) occurring.
If, in fact, the first breaks down, then the second machine is turned on and has a probability of 0.30 of breaking down.
a. What is the probability that the brewery’s bottling system is not working during operating hours?
1st machine not working is A: P(A) = 0.2
2nd machine not working is B: P(B) = 0.3
P(A and B) = 0.2*0.3 = 0.06
b. The reliability of the bottling process is the probability that the system is working during operating hours. Find the reliability of the bottling process at the brewery.
1 - P(A and B) = 1 - 0.06 = 0.94
If A is the event that the first machine is broken and B is the event that the second machine is broken, the probability both are broken is: \[P(A \text{ and } B) = P(A) \cdot P(B)\] (If we assume A and B are independent events.)
# a.
The probability that at least one bottling machine is working (the reliability) is the complement of this answer.
# b.
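One way to fill in the two calculations in R is sketched below. The variable names are chosen here for illustration, and independence of the two breakdowns is assumed, as in the formula above.

```r
p_A = 0.2   # first machine breaks down
p_B = 0.3   # second machine breaks down once it has been turned on

# a. System not working: both machines are down
p_down = p_A * p_B
p_down

# b. Reliability: complement of "both machines are down"
reliability = 1 - p_down
reliability
```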
Suppose we flipped a coin and rolled a die, and wanted to know the probability of getting a head on the coin or a 6 on the die.
P(H) = 1/2
P(6) = 1/6
P(H) + P(6) = 1/2 + 1/6 = 6/12 + 2/12 = 8/12, but P(H or 6) = 7/12, because adding the two probabilities counts the outcome H6 twice.
Here, there are still 12 possible outcomes: {H1,H2,H3,H4,H5,H6,T1,T2,T3,T4,T5,T6}
How many outcomes have heads? 6/12
How many outcomes have a 6? 2/12
How many outcomes have heads OR a 6? 7/12
How many outcomes have heads AND a 6? 1/12
The probability of either of two events occurring is \[P(A \text{ or } B)=P(A)+P(B)-P(A \text{ and } B).\]
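The rule can be checked against a direct count for the coin and die example. This is a sketch using `expand.grid()` to build the 12 equally likely outcomes; the variable names are chosen here for illustration.

```r
# All 12 equally likely (coin, die) outcomes
outcomes = expand.grid(coin = c("H", "T"), die = 1:6)

heads = outcomes$coin == "H"
six   = outcomes$die == 6

# Direct count: fraction of outcomes with heads or a 6
p_direct = mean(heads | six)

# Addition rule: P(A) + P(B) - P(A and B)
p_rule = mean(heads) + mean(six) - mean(heads & six)

c(p_direct, p_rule, 7/12)
```

All three numbers agree. Subtracting `mean(heads & six)` removes the outcome H6, which would otherwise be counted once under heads and once under 6.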
Clicker Question: Suppose you draw one card from a deck. What is the probability of getting a spade or an ace?
52 cards, 13 spades, 4 aces
P(spade or ace) = P(spade) + P(ace) - P(spade and ace) = 13/52 + 4/52 - 1/52 = 16/52 = 4/13
A multiple-choice quiz contains 5 questions with four possible answers each. Compute the probability of randomly guessing the answers and getting 100% on the quiz (all five questions correct).
4*4*4*4*4
## [1] 1024
4^5
## [1] 1024
What is the probability of getting:
a. 5 out of 5 questions correct?
1/1024
## [1] 0.0009765625
(1/4)^5
## [1] 0.0009765625
b. 3 out of 5 questions correct?
There is a 1/4 chance of getting each answer correct, so the probability that 3 particular questions are right and the other 2 are wrong is (1/4)^3 * (3/4)^2. However, we also need to pick which 3 questions out of the 5 are right. This is a combination, so the answer is choose(5,3) * (1/4)^3 * (3/4)^2.
choose(5,3) * (1/4)^3 * (3/4)^2
## [1] 0.08789062
c. 2 out of 5 questions correct?
We need to pick which 2 out of the 5 are correct.
choose(5,2) * (1/4)^2 * (3/4)^3
## [1] 0.2636719
135/512
## [1] 0.2636719
d. 1 out of 5 questions correct?
choose(5,1) * (1/4)^1 * (3/4)^4
## [1] 0.3955078
405/1024
## [1] 0.3955078
e. none correct?
(3/4)^5
## [1] 0.2373047
243/1024
## [1] 0.2373047
(1/4)^5 +
choose(5,4) * (1/4)^4 * (3/4)^1 +
choose(5,3) * (1/4)^3 * (3/4)^2 +
choose(5,2) * (1/4)^2 * (3/4)^3 +
choose(5,1) * (1/4)^1 * (3/4)^4 +
(3/4)^5
## [1] 1
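R's built-in `dbinom()` evaluates this same pattern, choose(n,k) * p^k * (1-p)^(n-k), so all six probabilities can be produced in one call. A short sketch, with the quiz's 5 questions and a 1/4 chance of guessing each one correctly:

```r
# P(exactly k correct) for k = 0, 1, ..., 5 when guessing on 5 questions
k = 0:5
p = dbinom(k, size = 5, prob = 1/4)
names(p) = k
round(p, 7)

# The six cases cover every possibility, so they add up to 1
sum(p)

# Same value as the hand formula, for example for 3 correct
choose(5, 3) * (1/4)^3 * (3/4)^2
```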
Two events are mutually exclusive if they cannot happen at the same time, so \(P(A \text{ and } B) = 0\). If A and B are mutually exclusive, then \[P(A \text{ or } B) = P(A) + P(B)\]
Example - On the quiz, what is the probability of getting a “B” or higher (at least 4 out of 5 answers correct)?
P(getting exactly 5 correct or getting exactly 4 correct) = P(getting 5 correct) + P(getting 4 correct)
(1/4)^5 + choose(5,4) * (1/4)^4 * (3/4)^1
## [1] 0.015625
Clicker Question: Which is more likely, passing the quiz with a D or higher (at least 3 out of 5 correct), or getting all 5 wrong?
P(all correct) = (1/4)^5
P(4 correct) = choose(5,4) * (1/4)^4 * (3/4)^1
P(3 correct) = choose(5,3) * (1/4)^3 * (3/4)^2
P(2 correct) = choose(5,2) * (1/4)^2 * (3/4)^3
P(1 correct) = choose(5,1) * (1/4)^1 * (3/4)^4
P(0 correct) = (3/4)^5
Get a D or higher = all correct, 4 correct or 3 correct
1/4^5 + choose(5,4) * (1/4)^4 * (3/4)^1 + choose(5,3) * (1/4)^3 * (3/4)^2
## [1] 0.1035156
(3/4)^5
## [1] 0.2373047
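The same comparison can be written more compactly with `dbinom()`. This is a sketch, not the method used in class: because the cases "exactly 3", "exactly 4", and "exactly 5" are mutually exclusive, their probabilities can simply be summed.

```r
# P(at least 3 correct): add the mutually exclusive cases 3, 4, and 5
p_D_or_higher = sum(dbinom(3:5, size = 5, prob = 1/4))

# P(all 5 wrong)
p_all_wrong = dbinom(0, size = 5, prob = 1/4)

c(p_D_or_higher, p_all_wrong)
p_all_wrong > p_D_or_higher
```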
Expected value provides a way of evaluating a decision that has multiple possible outcomes.
Expected value is defined as the average gain or loss of an event if the procedure is repeated many times. We can compute the expected value by multiplying each outcome by the probability of that outcome, then adding up the products.
Example - You purchase a raffle ticket to help out a charity. The raffle ticket costs $5. The charity is selling 2000 tickets. One of them will be drawn and the person holding the ticket will be given a prize worth $4000. Compute the expected value for this raffle.
So expected value is:
3995*1/2000 + -5*1999/2000
## [1] -3
On average, each person is giving about $3.00 to charity.
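The "multiply each outcome by its probability, then add" recipe maps directly onto vectors in R. A minimal sketch for the raffle, with variable names chosen here for illustration:

```r
# Net gain for each outcome: the winner gets $4000 but still paid $5
value = c(win = 4000 - 5, lose = -5)

# Probability of each outcome with 2000 tickets and one winner
prob = c(win = 1/2000, lose = 1999/2000)

# The probabilities should cover every outcome
sum(prob)

# Expected value: sum of outcome times probability
expected_value = sum(value * prob)
expected_value
```

Storing outcomes and probabilities as two vectors of the same length keeps the calculation unchanged when a problem has more than two outcomes.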
p_win = 0
v_win = 0
p_lose = 0
v_lose = 0
p_win*v_win + p_lose*v_lose
## [1] 0
Clicker Question: An insurance company estimates the probability of an earthquake in the next year to be 0.0013. It estimates the average damage done by an earthquake to be $60,000. If the company offers earthquake insurance for $100, what is the company's expected value of the policy?
The probability the event B occurs, given that event A has happened is represented by \(P(B|A)\), read “the probability of B given A.”
Conditional probabilities can be used to find the probability of joint events, even when they are not independent:
\[P(A \text{ and } B) = P(A|B) \cdot P(B)\]
Consider the following events:
##         Class No Class Sum
## No Snow    94        1  95
## Snow        2        3   5
## Sum        96        4 100
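The multiplication rule can be checked against these counts. The sketch below re-enters the table as a matrix (the object names are chosen here for illustration) and compares the joint probability of "Class and Snow" with \(P(\text{Class}|\text{Snow}) \cdot P(\text{Snow})\).

```r
# Counts from the table above (rows: weather, columns: class held or not)
counts = matrix(c(94, 1,
                   2, 3),
                nrow = 2, byrow = TRUE,
                dimnames = list(c("No Snow", "Snow"), c("Class", "No Class")))
total = sum(counts)

# Joint probability, read directly from the table
p_class_and_snow = counts["Snow", "Class"] / total

# Conditional probability: restrict attention to the Snow row
p_class_given_snow = counts["Snow", "Class"] / sum(counts["Snow", ])
p_snow = sum(counts["Snow", ]) / total

# Multiplication rule: both numbers should match
c(p_class_and_snow, p_class_given_snow * p_snow)
```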
What is the probability of having class?
Clicker Question:
What is the probability of not having class?
What is the probability of having class given that there was snow when you wake up?
Clicker Question:
What is the probability of having class given that there was no snow when you wake up?
Given that there was class, what is the probability that there was snow?
Clicker Question:
Given that there was no class, what is the probability there was no snow?
By the formula for calculating joint probabilities (which holds even when the events are not independent), both \[P(A \text{ and } B) = P(A|B) \cdot P(B)\] and \[P(B \text{ and } A) = P(B|A) \cdot P(A)\] By setting these equal we get a way to “invert” conditional probabilities: \[P(A|B) \cdot P(B)=P(B|A) \cdot P(A)\] OR
\[P(A|B) = \frac{P(B|A) \cdot P(A)}{P(B)}\]
If we know \(P(A)\), \(P(B|A)\), and \(P(B|A^c)\), we can find \(P(B)\), because \(B\) can occur either together with \(A\) or together with \(A^c\).
This accounts for all the ways \(B\) can occur, so \[P(B) = P(B|A) \cdot P(A) + P(B|A^c) \cdot P(A^c).\]
Now, plugging in \(P(B)\) gives us Bayes’ Rule!
Bayes’ Rule: Given two events \(A\) and \(B\),
\[ P(A|B) = \frac{P(B|A) \cdot P(A)} { P(B|A) \cdot P(A) + P(B|A^c) \cdot P(A^c)} \]
Example - A new test has been devised to detect a new disease:
Should doctors use this test?
Well, suppose you test negative for the disease. Great, that means you don’t have it! But if you test positive, what is the probability that you actually have the disease?
Let’s first label the following events:
So we want to know \(P(\text{having the disease} \mid \text{testing positive}) = P(A|B)\).
What information do we know?
Using Bayes’ Rule:
\[ P(A|B) = \frac{P(B|A) \cdot P(A)} { P(B|A) \cdot P(A) + P(B|A^c) \cdot P(A^c)} \]
# Example
(1*0.001)/(1*0.001+0.05*0.999)
## [1] 0.01962709
This shows that only about 2% of the people who test positive for this disease using this test will actually have it!
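Bayes' Rule is easy to wrap in a reusable function. The sketch below is an illustration, not part of the original notes: the function name and argument names are hypothetical, and the inputs are read off the calculation above, namely \(P(A) = 0.001\), \(P(B|A) = 1\), and \(P(B|A^c) = 0.05\).

```r
# Hypothetical helper: P(A | B) from Bayes' Rule
#   prior          = P(A)
#   p_B_given_A    = P(B | A)
#   p_B_given_notA = P(B | A^c)
bayes_posterior = function(prior, p_B_given_A, p_B_given_notA) {
  numerator = p_B_given_A * prior
  # Denominator is P(B): B with A, plus B with A^c
  numerator / (numerator + p_B_given_notA * (1 - prior))
}

# Values taken from the calculation above
bayes_posterior(prior = 0.001, p_B_given_A = 1, p_B_given_notA = 0.05)
```

Changing `prior` shows how strongly the answer depends on how common the disease is: the rarer the disease, the larger the share of positive results that are false positives.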
Example - A new test has been devised to detect a new disease:
Again, label the following events:
Clicker Question: If you test positive, what is the probability of actually having the disease?
Clicker Question: If you test negative, what is the probability of not having the disease?
Bayes’ rule can be used for this, if we replace \(A\) with \(A^c\), replace \(B\) with \(B^c\), and realize that the complement of \(A^c\) is just \(A\).
\[ P(A^c|B^c) = \frac{P(B^c|A^c) \cdot P(A^c)} { P(B^c|A^c) \cdot P(A^c) + P(B^c|A) \cdot P(A)} \]
In your subgroup, select a problem you want to work on. Work together toward finding a solution to the provided questions and/or any related questions you find interesting. Then write up and present:
On Friday, each group will share their work, 8-10 minutes to present, followed by 5 minutes for questions.
Note: For working on problems, have a student volunteer for the following roles:
Everyone should be involved in solving the problem!
Compute the probability of randomly drawing five cards from a deck and getting:
a. a pair
b. three of a kind
c. four of a kind
d. a full house (three of a kind and a pair)
e. a flush (all the same suit)
Once your group is convinced of its answers, try checking them: https://en.wikipedia.org/wiki/Poker_probability
Suppose you have three of a kind. What is the probability that someone else has a higher hand? (you can use the probabilities given for a straight and straight flush on wikipedia for answering this one.)
In a certain state’s lottery, \(64\) balls numbered 1 through \(64\) are placed in a machine and six of them are drawn at random. If the six numbers drawn match the numbers that a player had chosen, the player wins the $1,000,000 jackpot. If the numbers drawn match any 5 of the numbers that a player had chosen, the player wins $1,000. It costs $1 to buy a ticket. Find the expected value.
Over time the jackpot will increase if no one wins. How large would the jackpot have to be for the expected value of playing the lottery to be positive? (In this case, would you still buy a lottery ticket?)
If we were randomly put into groups again, what is the probability that YOU are placed in the exact same group?
How many different ways are there to place \(n=30\) students into 6 groups? (Hint: consider an easier problem.)
If we were to randomly assign groups again, what is the probability EVERYONE is placed in the exact same groups again?
If you are put in the same group, how many ways are there to assign everybody else? Use this to find the probability that, if we are randomly assigned into new groups, YOU are placed in the exact same group.
An unmanned monitoring system uses high-tech video equipment and microprocessors to detect intruders. A prototype system has been developed and is in use outdoors at a weapons munitions plant. The system is designed to detect intruders with a probability of 0.90. However, the design engineers expect this probability to vary with weather condition. The system automatically records the weather condition each time an intruder is detected. Based on a series of controlled tests, in which an intruder was released at the plant under various weather conditions, the following information is available.
Given the intruder was in fact detected by the system, the weather was:
When the system failed to detect the intruder:
Use this information to find the probability of detecting an intruder, given clear, cloudy, and rainy weather conditions (in the case that an intruder has been released at the plant). When is this system the most reliable? When is it the least reliable?
Suppose a certain type of cancer has an incidence rate of 0.5% (that is, it afflicts 0.5% of the population). A new test has been devised to detect this cancer, which is very cheap and easy to administer in comparison to existing tests. The test produces false negatives at a rate of 1.4% (that is, 1.4% of those that have the disease will test negative), and the false positive rate is 1% (that is, about 1% of people who take the test will test positive, even though they do not have the disease). How accurate is this test?
Based on this, what recommendations would you make to doctors using this test?
In this problem, we will explore probabilities from a series of events.
If you flip 10 coins, how many would you expect to come up “heads” on average? Given your answer, would you expect every flip of 10 coins to come up with exactly that many heads?
If you were to flip 10 coins, what results would you consider a “usual”? What results would you consider “unusual”?
When flipping 10 coins, what is the theoretic probability of flipping 10 heads?
Below is a simulation of flipping a coin 10 times, repeated 100,000 times. The table below displays the number of times 0, 1, 2, …, 10 heads came up. Note the sum of all these values is 10,000.
Based on this simulation, what appears to be the probability of flipping 0 heads, 1 heads, … up to 10 heads? (Hint: Use Spreadsheet)
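A simulation of this kind could be produced in R roughly as follows. This is a sketch and not necessarily how the original table was generated: the seed and the variable names are arbitrary choices, and `reps` is an assumed number of repetitions that can be changed.

```r
set.seed(1)      # arbitrary seed so the run can be reproduced
reps = 100000    # assumed number of repetitions; adjust as needed

# Each draw is the number of heads in 10 fair coin flips
heads = rbinom(reps, size = 10, prob = 0.5)

# Count how often each result 0, 1, ..., 10 occurred
counts = table(factor(heads, levels = 0:10))
counts

# The counts always add up to the number of repetitions
sum(counts)
```

Using `factor(heads, levels = 0:10)` keeps a column for every possible result, even one that never occurred in a particular run.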
If you were to flip 10 coins, based on the simulated data, what range of values would you consider “usual” results? What would you consider “unusual” results?
The formula \[_n C_k p^k (1-p)^{n-k}\] will compute the probability of an event with probability \(p\) occurring \(k\) times out of \(n\), such as flipping \(k=5\) heads out of \(n=10\) coins where the probability of heads is \(p=0.5\).
\[_n C_k = \frac{_n P_k}{k!} = \frac{n!}{k!(n-k)!}\] You can use the Excel formula “=COMBIN(n,k)” to calculate \(_nC_k\).
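For those working in R instead of a spreadsheet, the formula translates almost symbol for symbol. The function name `binom_prob` below is hypothetical, and a small case (4 coins) is used so the result is easy to check by hand; R's built-in `dbinom()` and `pbinom()` are shown alongside for comparison.

```r
# Hypothetical helper: probability of exactly k successes out of n,
# each with success probability p
binom_prob = function(k, n, p) {
  choose(n, k) * p^k * (1 - p)^(n - k)
}

# Exactly 1 head in 4 fair coin flips: 4 * (1/2)^4 = 0.25
binom_prob(1, 4, 0.5)
dbinom(1, size = 4, prob = 0.5)   # built-in equivalent

# "1 or fewer heads": add up the cases k = 0 and k = 1
sum(binom_prob(0:1, 4, 0.5))
pbinom(1, size = 4, prob = 0.5)   # built-in cumulative version
```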
Use this to compute the theoretic probability of flipping 5 “heads” out of 10 coins. Compare your answer to the probability found in 4.
Use this to compute the theoretic probability of flipping fewer than 2 “heads” out of 10 coins. Compare your answer to the probability found in 4.
Use this formula to consider a case from the 1960s. In the area, about 26% of the jury-eligible population was black. In the court case, there were 25 people on the juror panel, of whom 2 were black.
If black people were selected for the panel with probability 26%, calculate the probability of there being 2 or fewer black people on the juror panel.
Does this provide evidence of racial bias in jury selection?