Steph Roark
10/10/2018
For a sociology class project you are asked to conduct a survey on 20 students at your school. You decide to stand outside of your dorm’s cafeteria and conduct the survey on a random sample of 20 students leaving the cafeteria after dinner one evening. Your dorm is comprised of 45% males and 55% females.
Which probability model is most appropriate for calculating the probability that the 4th person you survey is the 2nd female? Explain.
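The negative binomial model is the most appropriate. The binomial distribution describes the probability of having exactly \( k \) successes in \( n \) independent Bernoulli trials with probability of success \( p \), without regard to where the successes fall. The negative binomial distribution instead describes the probability of observing the \( k^{th} \) success on the \( n^{th} \) trial. Each student surveyed is an independent trial with probability \( p = 0.55 \) of being female, and the question fixes the trial on which the 2nd female appears, so we can use the negative binomial distribution to calculate the probability that the fourth person we talk to outside the cafeteria will be the second female surveyed.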
P(the kth success on the nth trial) = \( {n-1 \choose k-1} p^k (1-p)^{n-k} \)
Compute the probability from part A.
n=4     # trial on which the 2nd female is observed
k=2     # number of successes (females) required
p=0.55  # probability that a surveyed student is female
choose(n-1, k-1) * (p^k) * (1-p)^(n-k)
[1] 0.1837688
dnbinom(n-k, size = k, prob = p)  # n-k = 2 failures (males) before the k = 2nd success
[1] 0.1837688
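As a cross-check, the formula from part A can be wrapped in a small helper and compared with R's built-in `dnbinom`. This is only a sketch: `nb_pmf` is a hypothetical helper name, not a base R function. Note that `dnbinom` is parameterized by the number of failures before the `size`-th success, so the \( n^{th} \) trial corresponds to `n - k` failures.

```r
# Hypothetical helper (not part of base R): probability that the k-th
# success occurs on the n-th trial, with success probability p.
nb_pmf <- function(n, k, p) {
  choose(n - 1, k - 1) * p^k * (1 - p)^(n - k)
}

n <- 4
k <- 2
p <- 0.55

manual  <- nb_pmf(n, k, p)
# dnbinom counts failures before the k-th success, hence n - k.
builtin <- dnbinom(n - k, size = k, prob = p)

print(all.equal(manual, builtin))  # TRUE
```

Both routes give the same value because `n - k` males must precede the second female, which is exactly the quantity `dnbinom` expects as its first argument.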
The three possible scenarios that lead to the 4th person you survey being the 2nd female are {M, M, F, F}, {M, F, M, F}, and {F, M, M, F}.
In the negative binomial case, we find how many trials it takes to observe a fixed number of successes and require that the last observation be a success.
One common feature among these scenarios is that the last trial is always female. In the first three trials there are 2 males and 1 female.
Use the binomial coefficient to confirm that there are 3 ways of ordering 2 males and 1 female.
\( {n-1 \choose k-1} = \frac{(n-1)!}{(k-1)!\,(n-k)!} \)
factorial(n-1)/(factorial(k-1)*factorial(n-1-(k-1)))
[1] 3
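The three orderings can also be enumerated directly rather than only counted. The sketch below assumes the labels "M" and "F" for male and female and uses base R's `combn` to pick the position of the single female among the first three trials; the variable names are illustrative.

```r
# Each column of `positions` is one possible slot (1, 2, or 3) for the
# single female among the first three trials.
positions <- combn(3, 1)

# Build each four-person sequence: males in the first three trials except
# at the chosen position, then a female on the fourth trial.
scenarios <- apply(positions, 2, function(pos) {
  first_three <- rep("M", 3)
  first_three[pos] <- "F"
  paste(c(first_three, "F"), collapse = "")
})

print(scenarios)                           # "FMMF" "MFMF" "MMFF"
print(length(scenarios) == choose(3, 1))   # TRUE
```

The number of enumerated sequences matches the binomial coefficient computed above.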
Use the findings presented in part C to explain why the formula for the coefficient for the negative binomial is \( {n-1 \choose k-1} \) while the formula for the binomial coefficient is \( {n \choose k} \).
The binomial distribution describes the probability of having exactly \( k \) successes in \( n \) independent Bernoulli trials with probability of a success \( p \). Any of the \( n \) trials may be a success, so the binomial coefficient formula is \( {n \choose k} \).
In the negative binomial case, the 4th trial is specified as a success, so it cannot vary across scenarios. Since we know what the 4th outcome is, only the first \( n-1 = 3 \) trials can be rearranged, and they must contain the remaining \( k-1 = 1 \) success. Subtracting 1 from both \( n \) and \( k \) removes the final trial, and the success it carries, from the count, which gives \( {n-1 \choose k-1} \).
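The difference between the two coefficients can be illustrated by brute force. This sketch, with assumed labels "M" and "F" and illustrative variable names, lists every four-trial sequence, keeps those with exactly two females (the binomial count), and then keeps only those that also end in a female (the negative binomial count).

```r
n <- 4
k <- 2

# All 2^n sequences of M/F over n trials, one row per sequence.
grid <- expand.grid(rep(list(c("M", "F")), n), stringsAsFactors = FALSE)
seqs <- apply(grid, 1, paste, collapse = "")

# Sequences with exactly k females in any positions: the binomial count.
k_females <- seqs[rowSums(grid == "F") == k]

# Of those, the sequences whose final trial is female: the negative
# binomial count.
ends_in_female <- k_females[substr(k_females, n, n) == "F"]

print(length(k_females) == choose(n, k))               # TRUE
print(length(ends_in_female) == choose(n - 1, k - 1))  # TRUE
```

Of the 6 sequences containing two females, only 3 place the second female on the fourth trial, which is the reduction from \( {n \choose k} \) to \( {n-1 \choose k-1} \).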