Unit 3 Portfolio Problems

1 Poker Odds

Compute the probability of randomly drawing five cards from a deck and getting:

a. a pair
b. three of a kind
c. four of a kind
d. a full house (three of a kind and a pair)
e. a flush (all the same suit)

a. 42.2569%
b. 2.1128%
c. 0.02401%
d. 0.1441%
e. 0.1965%

Once your group is convinced of its answers, try checking them: https://en.wikipedia.org/wiki/Poker_probability

Suppose you have three of a kind. What is the probability that someone else has a higher hand? (You can use the probabilities given for a straight and a straight flush on Wikipedia to answer this one.)

#Pair
choose(13,1)*choose(4,2)*choose(12,3)*4^3

## [1] 1098240

choose(13,1)*choose(4,2)*choose(12,3)*4^3/choose(52,5)

## [1] 0.422569

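# Chance that one other randomly dealt hand beats three of a kind.
# Percentages, in order: straight, flush, full house, four of a kind,
# straight flush, royal flush
(0.3925+0.1965+0.1441+0.02401+0.00139+0.000154)/100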

## [1] 0.00758654

#Three of a kind
choose(13,1)*choose(4,3)*choose(12,2)*4^2

## [1] 54912

choose(13,1)*choose(4,3)*choose(12,2)*4^2/choose(52,5)

## [1] 0.02112845

#Four of a kind
choose(13,1)*choose(4,4)*choose(12,1)*choose(4,1)/choose(52,5)

## [1] 0.000240096
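The full house answer in part d is listed above without a calculation. A minimal sketch of one way to count it, following the same pattern as the other hands (the variable name `full_house_hands` is only illustrative):

```r
# Full house: choose the rank of the triple and 3 of its 4 suits,
# then a different rank for the pair and 2 of its 4 suits
full_house_hands = choose(13,1)*choose(4,3)*choose(12,1)*choose(4,2)
full_house_hands                  # 3744 hands
full_house_hands/choose(52,5)     # about 0.00144, i.e. 0.1441%
```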

#Flush
choose(13,5)*choose(4,1)/choose(52,5)

## [1] 0.001980792

#Flush minus straight flushes
(choose(13,5)*choose(4,1) - choose(10,1)*choose(4,1))/choose(52,5)

## [1] 0.001965402

2 Playing the lottery

In a certain state’s lottery, $64$ balls numbered 1 through $64$ are placed in a machine and six of them are drawn at random. If the six numbers drawn match the numbers that a player had chosen, the player wins the $1,000,000 jackpot. If the numbers drawn match any 5 of the numbers that a player had chosen, the player wins $1,000. It costs $1 to buy a ticket. Find the expected value.

Over time the jackpot will increase if no one wins. How large would the jackpot have to be for the expected value of playing the lottery to be positive? (In this case, would you still buy a lottery ticket?)

P(jackpot) * V(jackpot) + P(win 1000) * V(win 1000) + P(lose) * V(lose)

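# Net value of each outcome is the prize minus the $1 ticket
P_jackpot = 1/choose(64,6)
V_jackpot = 1000000 - 1
# Match 5 of your 6 numbers, plus 1 of the 58 numbers you did not pick
P_1000 = choose(6,5)*(64-6)/choose(64,6)
V_1000 = 1000 - 1
P_lose = 1 - P_jackpot - P_1000
# Same probability written as a count of losing tickets
P_lose = (choose(64,6)-1-choose(6,5)*(64-6))/choose(64,6)
V_lose = -1
# Expected value of one ticket
P_jackpot*V_jackpot + P_1000*V_1000 + P_lose*V_lose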

## [1] -0.9820205

1/7624512

## [1] 1.311559e-07

(1000000-1)/74974368+(1000-1)*348/74974368-1*(1-(1+348)/74974368)

## [1] -0.9820205

choose(64,6)

## [1] 74974368

V_jackpot = 75000000
P_jackpot*V_jackpot + P_1000*V_1000 + P_lose*V_lose

## [1] 0.004983476

~ $75,000,000
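The break-even jackpot can also be solved for directly rather than found by trial. The sketch below assumes the same setup as the problem (a $1 ticket and a $1,000 prize for matching 5 numbers) and sets expected winnings equal to the ticket price; `J_breakeven` is an illustrative name.

```r
total  = choose(64,6)            # number of equally likely tickets
n_1000 = choose(6,5)*(64-6)      # tickets that match exactly 5 numbers

# Expected winnings equal the $1 ticket price when
#   J/total + 1000*n_1000/total = 1
# so the break-even jackpot is
J_breakeven = total - 1000*n_1000
J_breakeven                      # 74626368

# Check: expected value of a ticket at this jackpot is zero
J_breakeven/total + 1000*n_1000/total - 1
```

Any jackpot above this amount gives a positive expected value, which is consistent with the rough figure of $75,000,000.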

1/choose(64,5)

## [1] 1.311559e-07

3 Group assignment

Suppose you had problems working with your group on the last unit's project. Some members were arguing and some were not contributing to the group's work, so it was decided that students will be randomly placed into 6 groups for the next project. Should you worry about being put into the same group?

Use the following questions to develop an answer: if we were randomly put into groups again, what is the probability that YOU are placed in the exact same group?

Use the following steps to build the parts needed to find how many different ways are there to place $n=30$ students into 6 groups.
a. How many different ways are there to arrange $5$ given students in a row from left to right?

factorial(5)

## [1] 120

b. How many different ways are there to arrange $5$ of the $30$ students in a row from left to right?

choose(30,5)*factorial(5)

## [1] 17100720

c. How many different ways are there to **choose** $5$ of $30$ students (order does not matter)?

choose(30,5)

## [1] 142506

d. How many different ways are there to place $10$ of the $30$ students into two uniquely labeled groups (GROUP1, GROUP2) each with $5$ students?

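# Taking the 10 students as already chosen: pick 5 for GROUP1,
# and the remaining 5 form GROUP2 in exactly 1 way
choose(10,5)*1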

## [1] 252

e. How many different ways are there to place $30$ students into 6 uniquely labeled groups: GROUP1 (5 students), GROUP2 (5 students), GROUP3 (5 students), etc.?

choose(30,5)*choose(25,5)*choose(20,5)*choose(15,5)*choose(10,5)

## [1] 8.883265e+19
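As a cross-check, the product of `choose()` terms should agree with the multinomial coefficient $30!/(5!)^6$, since each `choose()` factor cancels against the next. A short sketch (the names `by_factorials` and `by_choose` are illustrative):

```r
# Multinomial coefficient: 30! / (5! 5! 5! 5! 5! 5!)
by_factorials = factorial(30) / factorial(5)^6

# Sequential choices: fill GROUP1, then GROUP2, and so on
# (the last 5 students fill GROUP6 in exactly one way)
by_choose = choose(30,5)*choose(25,5)*choose(20,5)*choose(15,5)*choose(10,5)

# Compare up to floating-point tolerance
all.equal(by_factorials, by_choose)
```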

If we were to randomly assign groups again, what is the probability EVERYONE is placed in the exact same groups again?

1/(choose(30,5)*choose(25,5)*choose(20,5)*choose(15,5)*choose(10,5))

## [1] 1.125712e-20

If you are put in the same group, how many ways are there to assign everybody else? Use this to find the probability that, if we are randomly assigned into new groups, YOU are placed in the exact same group.

choose(25,5)*choose(20,5)*choose(15,5)*choose(10,5)

## [1] 6.233607e+14

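# P(same group) = (ways to assign everybody else) / (ways to assign all 30 students)
# The shared choose() factors cancel, leaving 1/choose(30,5)
1/choose(30,5)

## [1] 7.017248e-06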
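The answer depends on what "the exact same group" means. The sketch below compares two readings, both stated here as assumptions: either you must land in the same labeled group with the same four teammates, or only the four teammates have to match and the label is ignored.

```r
# Reading 1: same labeled group and same members.
# Favorable assignments / all assignments = 1/choose(30,5)
p_same_labeled = 1/choose(30,5)

# Reading 2: labels ignored. Your four teammates are a random
# 4 of the other 29 students, so exactly one choice matches.
p_same_teammates = 1/choose(29,4)

p_same_labeled
p_same_teammates

# The two readings differ by the 6 possible labels for your group
p_same_teammates/p_same_labeled
```

Under either reading the probability is far below 1 in 10,000.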

Based on these results, if an instructor is randomly putting students in groups, should students worry about being put in the exact same group?

4 Shared birthday

Suppose two people meet. What is the probability that they share a birthday?

1/365

## [1] 0.002739726

Suppose 3 people are in a room. What is the probability that there is at least one shared birthday among these 3 people?

1-365*364*363/365^3

## [1] 0.008204166

Suppose 10 people are together. What is the probability that there is at least one shared birthday among these 10 people?
Suppose $n$ people are in a room. What is the probability that there is at least one shared birthday among these $n$ people?
How many people would need to be in a room until you would bet $100 on the event that at least two people share a birthday? (Use an Excel spreadsheet to find the probabilities and the expected value of the wager for $n=1,2,3,...,50$. The permutation function in Excel is “=PERMUT(n,k)”.)

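# Number of ordered arrangements of k items chosen from n: n*(n-1)*...*(n-k+1)
permute = function(n,k) {
  result = n
  # seq_len(k-1) is empty when k = 1, so the loop is skipped in that case
  for ( i in seq_len(k-1) ) {
    result = result*(n-i)
  }
  result
}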

# a is a vector for storing values
a = c()
# With one person, zero chance of having a matching birthday
a[1] = 0

n = 365
# Calculate probability for k=2 up to 70
for ( k in 2:70 ) {
  a[k] = 1 - permute(n,k)/n^k
}
# label a
names(a) = 1:70
# Print solution
round(a,4)

##      1      2      3      4      5      6      7      8      9     10     11 
## 0.0000 0.0027 0.0082 0.0164 0.0271 0.0405 0.0562 0.0743 0.0946 0.1169 0.1411 
##     12     13     14     15     16     17     18     19     20     21     22 
## 0.1670 0.1944 0.2231 0.2529 0.2836 0.3150 0.3469 0.3791 0.4114 0.4437 0.4757 
##     23     24     25     26     27     28     29     30     31     32     33 
## 0.5073 0.5383 0.5687 0.5982 0.6269 0.6545 0.6810 0.7063 0.7305 0.7533 0.7750 
##     34     35     36     37     38     39     40     41     42     43     44 
## 0.7953 0.8144 0.8322 0.8487 0.8641 0.8782 0.8912 0.9032 0.9140 0.9239 0.9329 
##     45     46     47     48     49     50     51     52     53     54     55 
## 0.9410 0.9483 0.9548 0.9606 0.9658 0.9704 0.9744 0.9780 0.9811 0.9839 0.9863 
##     56     57     58     59     60     61     62     63     64     65     66 
## 0.9883 0.9901 0.9917 0.9930 0.9941 0.9951 0.9959 0.9966 0.9972 0.9977 0.9981 
##     67     68     69     70 
## 0.9984 0.9987 0.9990 0.9992
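The question also asks for the expected value of the wager. The sketch below assumes an even-money bet (win $100 if at least two people share a birthday, lose $100 otherwise), which the problem does not state explicitly. The names `n_people`, `p_shared` and `ev` are illustrative.

```r
n_people = 1:50

# P(no shared birthday) for n people is (365/365)*(364/365)*...*((365-n+1)/365)
p_shared = 1 - cumprod((365 - n_people + 1)/365)

# Expected value of an even-money $100 bet on a shared birthday
ev = 100*p_shared - 100*(1 - p_shared)

# Smallest group size for which the bet has positive expected value
min(n_people[ev > 0])
```

With these assumptions the bet first becomes favorable at 23 people, where the probability of a shared birthday passes 50%.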

Approximation: treat each of the choose(k,2) pairs of people as an independent 364/365 chance of not sharing a birthday

# b is a vector for storing values
b = c()
# With one person, zero chance of having a matching birthday
b[1] = 0

n = 365
# Calculate probability for k=2 up to 70
for ( k in 2:70 ) {
  b[k] = 1 - (364/365)^(choose(k,2))
}
# label b 
names(b) = 1:70
# Print solution
round(a,4)[1:30]

##      1      2      3      4      5      6      7      8      9     10     11 
## 0.0000 0.0027 0.0082 0.0164 0.0271 0.0405 0.0562 0.0743 0.0946 0.1169 0.1411 
##     12     13     14     15     16     17     18     19     20     21     22 
## 0.1670 0.1944 0.2231 0.2529 0.2836 0.3150 0.3469 0.3791 0.4114 0.4437 0.4757 
##     23     24     25     26     27     28     29     30 
## 0.5073 0.5383 0.5687 0.5982 0.6269 0.6545 0.6810 0.7063

round(b,4)[1:30]

##      1      2      3      4      5      6      7      8      9     10     11 
## 0.0000 0.0027 0.0082 0.0163 0.0271 0.0403 0.0560 0.0739 0.0940 0.1161 0.1401 
##     12     13     14     15     16     17     18     19     20     21     22 
## 0.1656 0.1926 0.2209 0.2503 0.2805 0.3114 0.3428 0.3745 0.4062 0.4379 0.4694 
##     23     24     25     26     27     28     29     30 
## 0.5005 0.5310 0.5609 0.5900 0.6182 0.6455 0.6717 0.6968

5 Racial Bias

In this problem, we will explore probabilities from a series of events.

If you flip 10 coins, how many would you expect to come up “heads” on average? Given your answer, would you expect every flip of 10 coins to come up with exactly that many heads?
If you were to flip 10 coins, what results would you consider “usual”? What results would you consider “unusual”?
When flipping 10 coins, what is the theoretical probability of flipping 10 heads?

Below is a simulation of flipping a coin 10 times, repeated 100,000 times. The table below shows the number of trials in which 0, 1, 2, …, 10 heads came up. Note that these counts sum to 100,000.

library(knitr)  # provides kable()

outcomes = c("heads","tails")
trial = c()
for ( i in 1:100000 )
{
  # Flip 10 coins and count the heads
  sam = sample( outcomes, size = 10, replace = TRUE )
  trial[i] = sum( sam=="heads" )
}
kable(table(trial), caption="Number of trials, out of 100,000, by number of heads in 10 flips",
      col.names = c("Number of heads", "Number of trials"))

Number of trials, out of 100,000, by number of heads in 10 flips
Number of heads	Number of trials
0	91
1	986
2	4395
3	11657
4	20367
5	24659
6	20579
7	11710
8	4424
9	1028
10	104

Based on this simulation, what appears to be the probability of flipping 0 heads, 1 head, … up to 10 heads?
If you were to flip 10 coins, based on the simulated data, what range of values would you consider “usual” results? What would you consider “unusual” results?
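One way to read probabilities off the simulation is to divide each count by the number of trials. The sketch below re-enters the counts from the table above and places the binomial probabilities from `dbinom()` next to them for comparison; the variable names are illustrative.

```r
# Counts from the simulation table, for 0 through 10 heads
counts = c(91, 986, 4395, 11657, 20367, 24659, 20579, 11710, 4424, 1028, 104)
names(counts) = 0:10

# Simulated probability = share of the 100,000 trials
sim_p = counts/sum(counts)

# Theoretical binomial probabilities for 10 fair coins
theo_p = dbinom(0:10, size=10, prob=0.5)

round(rbind(simulated = sim_p, theoretical = theo_p), 4)
```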

The formula \[_n C_k p^k (1-p)^{n-k}\] will compute the probability of an event with probability $p$ occurring $k$ times out of $n$, such as flipping $k=5$ heads out of $n=10$ coins where the probability of heads is $p=0.5$.

\[_n C_k = \frac{_n P_k}{k!} = \frac{n!}{k!(n-k)!}\] You can use the Excel formula “=COMBIN(n,k)” to calculate $_nC_k$.

Use this to compute the theoretical probability of flipping 5 “heads” out of 10 coins. Compare your answer to the probability estimated from the simulation.

0.5

## [1] 0.5

choose(10,5)*0.5^5*0.5^5

## [1] 0.2460938

dbinom(5,size=10,prob=0.5)

## [1] 0.2460938

Use this to compute the theoretical probability of flipping fewer than 2 “heads” out of 10 coins. Compare your answer to the probability estimated from the simulation.

# Fewer than 2 heads means 0 or 1 heads
k = 0:1
sum(dbinom(k,size=10,prob=0.5))

## [1] 0.01074219

Use this formula to consider a court case from the 1960s. In the area, about 26% of the jury-eligible population was black. In the court case, there were 25 people on the juror panel, of which 2 were black.

If black people were selected for the panel with probability 26%, calculate the probability of there being 2 or fewer black people on the juror panel.
Does this provide evidence of racial bias in jury selection?

# P(0, 1, or 2 black people on a 25-person panel) when each is selected with probability p
p=0.26
k = 0:2
sum(dbinom(k,size=25, prob=p))

## [1] 0.02518877
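The same tail probability can be cross-checked with the cumulative distribution function `pbinom()`, and compared against the number of black panel members one would expect under unbiased selection. A brief sketch (variable names are illustrative):

```r
p = 0.26   # share of the jury-eligible population that was black
n = 25     # size of the juror panel

# P(X <= 2) from the cumulative distribution function
p_two_or_fewer = pbinom(2, size=n, prob=p)
p_two_or_fewer

# Expected number of black panel members under unbiased selection
n*p
```

A panel with 2 black members sits well below the expected 6.5, and an outcome that low or lower has roughly a 2.5% chance under random selection.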

p=0.26
k = 0:10
# Probability (in percent) of k black people on a 25-person panel
d = (dbinom(k,size=25, prob=p))*100
names(d) = k
barplot(d, main = "Chance of r black people on a 25-person jury panel", col = rainbow(length(d)), ylim = c(0,20), xlab = "r", ylab = "probability (%)")

6 Intruder detection

An unmanned monitoring system uses high-tech video equipment and microprocessors to detect intruders. A prototype system has been developed and is in use outdoors at a weapons munitions plant. The system is designed to detect intruders with a probability of 0.90. However, the design engineers expect this probability to vary with weather conditions. The system automatically records the weather conditions each time an intruder is detected.

Based on a series of controlled tests, in which an intruder was released at the plant under various weather conditions, the following information is available: given that the intruder was, in fact, detected by the system, the weather was clear 75% of the time, cloudy 20% of the time, and raining 5% of the time. When the system failed to detect the intruder, 60% of the days were clear, 30% cloudy, and 10% rainy.

Use this information to find the probability of detecting an intruder, given clear, cloudy, and rainy weather conditions (in the case that an intruder has been released at the plant). When is this system the most reliable? When is it the least reliable?

# Intruder detection

# d = detect intruder, f = fails to detect intruder
p_d = 0.90
p_f = 1 - p_d

p_clear_d  = 0.75
p_cloudy_d = 0.20
p_rainy_d  = 0.05

p_clear_f  = 0.60
p_cloudy_f = 0.30
p_rainy_f  = 0.10

p_clear  = p_clear_d  * p_d + p_clear_f  * p_f
p_cloudy = p_cloudy_d * p_d + p_cloudy_f * p_f
p_rainy  = p_rainy_d  * p_d + p_rainy_f  * p_f

# Check - should add up to 100%
p_clear + p_cloudy + p_rainy

## [1] 1

p_d_clear  = p_clear_d  * p_d / p_clear
p_d_cloudy = p_cloudy_d * p_d / p_cloudy
p_d_rainy  = p_rainy_d  * p_d / p_rainy

p_d_clear

## [1] 0.9183673

p_d_cloudy

## [1] 0.8571429

p_d_rainy

## [1] 0.8181818
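To answer the reliability questions directly, the three Bayes' rule calculations can be done at once on named vectors and then ranked. This is a compact restatement of the steps above, not a different method; the names `lik_d`, `lik_f` and `post` are illustrative.

```r
p_d = 0.90   # P(detect)

# P(weather | detected) and P(weather | not detected)
lik_d = c(clear=0.75, cloudy=0.20, rainy=0.05)
lik_f = c(clear=0.60, cloudy=0.30, rainy=0.10)

# Bayes' rule, applied element by element: P(detect | weather)
post = lik_d*p_d / (lik_d*p_d + lik_f*(1 - p_d))
round(post, 4)

names(which.max(post))   # weather in which detection is most likely
names(which.min(post))   # weather in which detection is least likely
```

The system is most reliable in clear weather and least reliable in rain.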

7 Correct Diagnosis?

Suppose a certain type of cancer has an incidence rate of 0.5% (that is, it afflicts 0.5% of the population). A new test has been devised to detect this cancer, which is very cheap and easy to administer in comparison to existing tests. The test produces false negatives at a rate of 1.4% (that is, 1.4% of those who have the disease will test negative), and the false positive rate is 1% (that is, about 1% of people who do not have the disease will test positive). How accurate is this test?

Suppose a randomly selected person takes the test and tests positive. What is the probability that this person actually has the disease?
Suppose a randomly selected person takes the test and tests negative. What is the probability that this person does not have the disease?

Based on this, what recommendations would you make to doctors using this test?

10000*0.005

## [1] 50

50*0.014

## [1] 0.7

(10000-50)*0.01

## [1] 99.5

matrix( c(50-0.7,99.5,49.3+99.5,
          0.7,10000-50-99.5,0.7+9850.5,
          50,10000-50,10000),
        nc=3, nr=3, byrow=T,
              dimnames = list( c("+","-","total"),c("Disease","No Disease", "total"))
              )

##       Disease No Disease   total
## +        49.3       99.5   148.8
## -         0.7     9850.5  9851.2
## total    50.0     9950.0 10000.0

#P(disease|pos)
49.3/148.8

## [1] 0.3313172

#P(no disease|neg)
9850.5/9851.2

## [1] 0.9999289

# A = have disease
# B = test positive
P_A = 0.005
P_Ac = 1-P_A

P_Bc_A = 0.014 # False negative
P_B_A = 1 - P_Bc_A
P_B_Ac = 0.01 # False positive
P_Bc_Ac = 1-P_B_Ac

#P(A|B) = P(B|A)*P(A)/P(B)
P_B_A*P_A/(P_B_A*P_A + P_B_Ac*P_Ac)

## [1] 0.3313172

#P(A^c|B^c) = P(B^c|A^c)*P(A^c)/P(B^c)
P_Bc_Ac*P_Ac / (P_Bc_A*P_A + P_Bc_Ac*P_Ac)

## [1] 0.9999289
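For the recommendation question, it can help to see how the probability of disease after a positive result depends on the prior. The sketch below wraps the Bayes' rule calculation in a hypothetical helper, `ppv()`, and then applies it a second time to a repeat positive test. Treating the two test results as independent given disease status is an assumption made here, not something stated in the problem.

```r
# P(disease | positive) for a given prior probability of disease
ppv = function(prior, sens = 1 - 0.014, fpr = 0.01) {
  sens*prior / (sens*prior + fpr*(1 - prior))
}

# One positive test, starting from the 0.5% incidence rate
after_one = ppv(0.005)
after_one

# A second positive test, using the first result as the new prior
# (assumes the two tests are independent given disease status)
after_two = ppv(after_one)
after_two
```

Under that assumption a single positive leaves the chance of disease at about one in three, while a second positive raises it to roughly 98%, which suggests following up a positive result with a retest or a confirmatory test.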

89595/90500

## [1] 0.99