library(ggplot2)
Example: How many different ways can the first three places be decided in a race with four runners?
Try the code
factorial(4)/factorial(4-3)
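The expression above is the number of permutations of n = 4 runners taken r = 3 at a time, n!/(n - r)!. As a minimal sketch, the same count can be written as a falling product, which avoids forming large factorials when n is big. The name `perm_count` is a hypothetical helper introduced here for illustration; it is not part of base R.

```r
# perm_count() is a hypothetical helper: n * (n - 1) * ... * (n - r + 1)
perm_count <- function(n, r) {
  prod(seq(n, by = -1, length.out = r))
}

perm_count(4, 3)                  # falling product 4 * 3 * 2
factorial(4) / factorial(4 - 3)   # factorial formula, same value
```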
YOUR TURN:
Exercise 1: How many different arrangements of the letters in MISSISSIPPI are there?
YOUR CODE HERE:
result <- factorial(11) / (factorial(1) * factorial(4) * factorial(4) * factorial(2))
result
[1] 34650
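The repeat counts (1 M, 4 I's, 4 S's, 2 P's) were typed by hand above. As a hedged sketch, they can also be derived from the word itself with `strsplit()` and `table()`, which guards against miscounting; the variable names are illustrative only.

```r
# Split the word into single letters and count how often each occurs
word   <- "MISSISSIPPI"
chars  <- strsplit(word, "")[[1]]
counts <- as.vector(table(chars))   # repeat counts for I, M, P, S

# n! divided by the product of the factorials of the repeat counts
arrangements <- factorial(length(chars)) / prod(factorial(counts))
arrangements
```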
Exercise 2: How many different license plates can be made from four digits? Digits can be repeated.
YOUR CODE HERE:
n <- 10 # Number of digits
r <- 4 # Number of positions
result <- n^r
result
[1] 10000
Example: If nine people are to be assigned into three committees of sizes two, three, and four, respectively, how many possible assignments exist?
choose(9, 2) * choose(7, 3) * choose(4, 4)
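Filling the committees one after another is equivalent to the multinomial coefficient 9!/(2! 3! 4!). The sketch below compares the two computations for this example; the variable names are illustrative.

```r
sizes <- c(2, 3, 4)   # committee sizes from the example

# Sequential choice: pick 2 of 9, then 3 of the remaining 7, then 4 of the remaining 4
sequential <- choose(9, 2) * choose(7, 3) * choose(4, 4)

# Multinomial coefficient: n! / (k1! * k2! * k3!)
multinomial <- factorial(sum(sizes)) / prod(factorial(sizes))

c(sequential = sequential, multinomial = multinomial)
```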
YOUR TURN:
Exercise 3: If 20 people are to be assigned into three committees of sizes 5, 7, and 8, respectively, how many possible assignments exist?
YOUR CODE HERE:
# Function to calculate combinations (equivalent to base R's choose())
combinations <- function(n, k) {
  factorial(n) / (factorial(k) * factorial(n - k))
}
# Number of people and committee sizes
total_people <- 20
committee_sizes <- c(5, 7, 8)
# Fill the committees one at a time, choosing each from the people not yet assigned
assignments <- 1
for (size in committee_sizes) {
  assignments <- assignments * combinations(total_people, size)
  total_people <- total_people - size
}
assignments
[1] 99768240
Example: Susie has 25 books she would like to arrange on her desk. Of the 25 books, 7 are statistics books, 6 are biology books, 5 are English books, 4 are history books, and 3 are psychology books. If Susie arranges her books by subject, how many ways can she arrange her books?
Solution:
factorial(5) * factorial(7) * factorial(6) * factorial(5) * factorial(4) * factorial(3)
Example: Complement of an Event: Birthday Problem
Suppose that a room contains m students. What is the probability that at least two of them have the same birthday? This is a famous problem with a counterintuitive answer. Assume that every day of the year is equally likely to be a birthday, and disregard leap years. That is, assume there are always n = 365 days in a year.
Solution: Use \(P(E) = 1 - P(E^c)\), where \(E^c\) is the event that no two students have the same birthday.
One has \(P(E^c)=\frac{365 \cdot 364 \cdots (365-m+1)}{365^m}\). Can you explain why?
m <- seq(from = 10, to = 50, by = 5) # vector of class sizes 10, 15, 20, ..., 50
# P.E() takes a single number of students m and returns m together with
# the probability that at least two students share a birthday
P.E <- function(m){
  c(Students = m, ProbAtL2SB = 1 - prod((365:(365 - m + 1))/365))
}
t(sapply(m, P.E)) # apply P.E to each class size, then transpose so each row is one class size
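As a cross-check, base R's stats package provides `pbirthday()`, whose defaults (`classes = 365`, `coincident = 2`) match the assumptions of this problem. The sketch below compares it with the product formula; the two columns should agree closely.

```r
m <- seq(from = 10, to = 50, by = 5)

# Product formula from the solution, applied to one class size at a time
manual <- sapply(m, function(k) 1 - prod((365:(365 - k + 1)) / 365))

# stats::pbirthday() with its defaults: 365 equally likely days, at least 2 coinciding
builtin <- sapply(m, pbirthday)

cbind(Students = m, manual = manual, builtin = builtin)
```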
Display how the probability changes as the number of students m grows.
m <- 1:60 # vector of numbers of students
p <- numeric(60) # initialize vector of 0's
for (i in m) { # loop over each number of students
  q <- prod((365:(365 - i + 1))/365) # P(no match) with i people in the room, using prod()
  p[i] <- 1 - q # P(at least one match)
}
# create data frame with two columns m and p
prob_df <- data.frame(m, p)
ggplot(prob_df, aes(m, p)) +
  geom_point(color = "blue") +
  geom_hline(yintercept = 0.5) + # horizontal line at p = 0.5 to see where the probability first exceeds it
  geom_vline(xintercept = 23) # vertical line at m = 23 students
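The for loop can also be written in vectorized form with `cumprod()`, since the probability of no match among i students is the running product of (365 - i + 1)/365. This is a minimal sketch of that alternative; it also confirms numerically where the curve first crosses 0.5.

```r
m <- 1:60

# p[i] = P(at least one shared birthday among i students)
p <- 1 - cumprod((365 - m + 1) / 365)

# Smallest number of students for which the probability exceeds 0.5
first_above_half <- min(m[p > 0.5])
first_above_half
```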
Example: Suppose two fair dice are rolled, where each of the 36 possible outcomes is equally likely to occur. Knowing that the first die shows a 4, what is the probability that the sum of the two dice equals 8?
Use the conditional probability formula \(P(H|G) = \frac{P(H \cap G)}{P(G)}\), where G is the event that the first die shows a 4 and H is the event that the sum equals 8. Use the code below for the computations.
library(MASS) # used for fractions function
# create the sample space using expand.grid()
Omega <- expand.grid(roll1 = 1:6, roll2 = 1:6)
H <- subset(Omega, roll1 + roll2 == 8)
H
G <- subset(Omega, roll1 == 4)
G
PG <- dim(G)[1]/dim(Omega)[1] # P(G) as the ratio of the number of outcomes in G to the number of all possible outcomes
fractions(PG) # fractions() finds rational approximations to the components of a real numeric object
HaG <- subset(Omega, roll1 == 4 & roll2 == 4) # event H and G
HaG
PHaG <- dim(HaG)[1]/dim(Omega)[1] # P(H and G)
fractions(PHaG)
PHgG <- PHaG/PG # P(H|G) conditional probability (H given G)
fractions(PHgG)
Alternatively, compute P(H|G) by enumerating the outcomes of H in the reduced sample space G.
library(MASS)
Omega <- expand.grid(roll1 = 1:6, roll2 = 1:6)
G <- subset(Omega, roll1 == 4) # event G
G
HgG <- subset(G, roll1 + roll2 == 8) # outcomes of H within G: the sum equals 8
HgG
HgG <- subset(G, roll2 == 4) # equivalent: within G the sum is 8 exactly when roll2 is 4
HgG
HgG <- subset(G, roll2 %in% 4) # equivalent, using %in%
HgG
PHgG <- dim(HgG)[1]/dim(G)[1] # P(H|G)
fractions(PHgG)
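The same conditional probability can be obtained without building subsets, by treating each event as a logical vector over the sample space. This is a sketch with illustrative names (`in_G`, `in_H`), not a replacement for the approach above.

```r
Omega <- expand.grid(roll1 = 1:6, roll2 = 1:6)

in_G <- Omega$roll1 == 4                  # first die shows a 4
in_H <- Omega$roll1 + Omega$roll2 == 8    # sum of the dice equals 8

# P(H | G) = (number of outcomes in both H and G) / (number of outcomes in G)
p_H_given_G <- sum(in_H & in_G) / sum(in_G)
p_H_given_G
```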
Example: Choose a Door. The television show Let’s Make a Deal, hosted by Monty Hall, gave contestants the opportunity to choose one of three doors. Contestants hoped to choose the one that concealed the grand prize. Behind the other two doors were much less valuable prizes. After the contestant chose one of the doors, say Door 1, Monty opened one of the other two doors, say Door 3, containing a much less valuable prize. The contestant was then asked whether he or she wished to stay with the original choice (Door 1) or switch to the other closed door (Door 2).
What should the contestant do? Is it better to stay with the original choice or to switch to the other closed door? Or does it really matter? The answer, of course, depends on whether contestants improve their chances of winning by switching doors. In particular, what is the probability of winning by switching doors when given the opportunity, and what is the probability of winning by staying with the initial door selection?
First, simulate the problem with R to provide approximate probabilities for the various strategies. Following the simulation below, show how Bayes’ Rule can be used to solve the problem exactly. Use the code below for the computations.
set.seed(2) # done for reproducibility
# winning door simulated
actual <- sample(x = 1:3, size = 10000, replace = TRUE) # draws sample of size 10,000 from 1,2,3 with replacement
# guess simulated
aguess <- sample(x = 1:3, size = 10000, replace = TRUE)
# see when there is match
equals <- (actual == aguess)
# compute the probability when no switch
PNoSwitch <- sum(equals)/10000
# when there is NO match
not.eq <- (actual != aguess)
# compute the probability when switch
PSwitch <- sum(not.eq)/10000
Probs <- c(PNoSwitch, PSwitch)
names(Probs) <- c("P(Win no Switch)", "P(Win Switch)")
Probs
Do the math on paper as well!
What do you see?
Is it better for the contestant to switch doors in general? Yes. Switching wins whenever the initial guess was wrong, which happens with probability 2/3, while staying wins only with probability 1/3. A contestant who wants to maximize the chance of winning should therefore always switch doors when given the opportunity.
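For the exact Bayes' Rule solution, here is a minimal sketch for the case described in the example (contestant picks Door 1, Monty opens Door 3). It assumes that Monty never reveals the grand prize and that he picks at random when both unchosen doors hide lesser prizes; these assumptions and the variable names are stated here for illustration.

```r
# Prior: the grand prize is equally likely to be behind each door
prior <- c(door1 = 1/3, door2 = 1/3, door3 = 1/3)

# P(Monty opens Door 3 | prize location), given the contestant picked Door 1:
#   prize behind Door 1 -> Monty picks Door 2 or Door 3 at random (assumption)
#   prize behind Door 2 -> Monty must open Door 3
#   prize behind Door 3 -> Monty never opens the prize door
likelihood <- c(door1 = 1/2, door2 = 1, door3 = 0)

# Bayes' Rule: posterior is proportional to prior * likelihood
posterior <- prior * likelihood / sum(prior * likelihood)
posterior   # door1 corresponds to staying, door2 to switching
```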
Exercise 4: A hat contains 50 consecutive numbers (1 to 50). If four numbers are drawn at random, how many ways are there for the largest number to be a 16 and the smallest number to be a 5?
Solution: Explain why! Use the Fundamental Principle of Counting!
choose(10, 2)
With 5 fixed as the smallest number and 16 as the largest, the other two numbers must come from the 10 values 6 through 15. The result therefore counts the ways to draw four numbers from 1 to 50 such that the largest is 16 and the smallest is 5.
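The count can be checked by brute force. This sketch enumerates every possible draw of four numbers from 1 to 50 with `combn()`, which returns each combination as a column with entries in increasing order, and counts the draws whose smallest entry is 5 and largest is 16.

```r
# All choose(50, 4) draws, one per column, entries sorted in increasing order
draws <- combn(50, 4)

# Row 1 holds the smallest number of each draw, row 4 the largest
n_ways <- sum(draws[1, ] == 5 & draws[4, ] == 16)
n_ways
```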
Exercise 5: A multiple-choice test consists of 10 questions. Each question has 5 answers (only one is correct). How many different ways can a student fill out the test?
The expression 5^10 represents the number of ways a student can choose one of the 5 possible answers for each of the 10 questions. This is because for each question, the student has 5 choices, and these choices are independent of each other.
YOUR CODE HERE:
5^10
[1] 9765625
Exercise 6: On a multiple-choice exam with three possible answers for each of the five questions, what is the probability that a student would get four or more correct answers just by guessing?
Hint: Use the fact that \(P(E \cap F) = P(E) \cdot P(F)\) for two independent events (this generalizes to more than two events). Getting one answer correct is independent of getting another correct.
Also, \[P(\text{at least 4}) = P(\text{exactly 4}) + P(\text{exactly 5})\] because the two events on the right are mutually exclusive, so \(P(A \cup B) = P(A) + P(B)\).
Review code and explain it. YOUR CODE HERE:
choose(5, 4)*(1/3)^4*(2/3)^1 + choose(5, 5)*(1/3)^5*(2/3)^0
This code adds the probabilities of the two mutually exclusive cases: exactly four correct answers and all five correct. In each term, choose(5, k) counts which k questions are answered correctly, (1/3)^k is the probability of guessing those k correctly, and (2/3)^(5 - k) is the probability of guessing the remaining ones incorrectly.
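The number of correct guesses follows a binomial distribution with 5 trials and success probability 1/3, so the hand computation can be checked against base R's `dbinom()` and `pbinom()`. A minimal sketch:

```r
# Hand computation: P(exactly 4) + P(exactly 5)
by_hand <- choose(5, 4) * (1/3)^4 * (2/3)^1 + choose(5, 5) * (1/3)^5 * (2/3)^0

# Same probability from the binomial probability mass function
with_dbinom <- sum(dbinom(4:5, size = 5, prob = 1/3))

# Same probability as an upper tail: P(X > 3)
with_pbinom <- pbinom(3, size = 5, prob = 1/3, lower.tail = FALSE)

c(by_hand = by_hand, dbinom = with_dbinom, pbinom = with_pbinom)
```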
Exercise 7:
A family has three cars, all with electric windows. Car A’s windows always work. Car B’s windows work 30% of the time, and Car C’s windows work 75% of the time. The family uses Car A 2/3 of the time; Car B, 2/9 of the time; and Car C, the remaining fraction.
(a) On a particularly hot day, when the family wants to roll the windows down, compute the probability the windows will work.
(b) If the electric windows work, find the probability the family is driving Car C.
Solution: Let A, B, and C be the events of using Cars A, B, and C, respectively, and let T be the event that the windows work properly. (a) \(P(T) = P(T|A)P(A) + P(T|B)P(B) + P(T|C)P(C) = ...\)
Did you get 0.8167?
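Both parts of Exercise 7 can be checked numerically. The sketch below applies the Law of Total Probability for part (a) and Bayes' Rule for part (b); the vector names `prior` and `works` are illustrative choices, not notation from the exercise.

```r
# P(A), P(B), P(C): Car C is used the remaining fraction of the time
prior <- c(A = 2/3, B = 2/9, C = 1 - 2/3 - 2/9)

# P(T | car): probability that the windows work in each car
works <- c(A = 1, B = 0.30, C = 0.75)

# (a) Law of Total Probability
p_T <- sum(works * prior)
round(p_T, 4)

# (b) Bayes' Rule: P(C | T) = P(T | C) P(C) / P(T)
p_C_given_T <- unname(works["C"] * prior["C"] / p_T)
round(p_C_given_T, 4)
```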