Homework 2
Problem 1. Dice Rolls. If you roll a pair of fair dice, what is the probability of...
dice_sum <- c(2,3,4,5,6,7,8,9,10,11,12)
dice_roll_prob <- c(1,2,3,4,5,6,5,4,3,2,1)/36
df <- data.frame(dice_sum, dice_roll_prob)
names(df) <- c("Sums", "Probability")
df
## Sums Probability
## 1 2 0.02777778
## 2 3 0.05555556
## 3 4 0.08333333
## 4 5 0.11111111
## 5 6 0.13888889
## 6 7 0.16666667
## 7 8 0.13888889
## 8 9 0.11111111
## 9 10 0.08333333
## 10 11 0.05555556
## 11 12 0.02777778
# (a) getting a sum of 1?
is.element(1, dice_sum)
## [1] FALSE
print("Probability is zero for a sum of 1 on two dice. The minimum sum is 2.")
## [1] "Probability is zero for a sum of 1 on two dice. The minimum sum is 2."
# (b) getting a sum of 5?
y <- df[4,]
y
## Sums Probability
## 4 5 0.1111111
# (c) getting a sum of 12?
z <- df[11,]
z
## Sums Probability
## 11 12 0.02777778
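The probabilities in dice_roll_prob can be cross-checked by listing all 36 equally likely outcomes. A minimal sketch; the names `rolls` and `sum_prob` are introduced here for illustration only:

```r
# All 36 equally likely (die 1, die 2) outcomes, stored as their sums
rolls <- outer(1:6, 1:6, "+")

# Relative frequency of each sum; the names are the sums "2" through "12"
sum_prob <- table(rolls) / length(rolls)

sum_prob["5"]               # (b) sum of 5
sum_prob["12"]              # (c) sum of 12
"1" %in% names(sum_prob)    # (a) FALSE, a sum of 1 never occurs
```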
Problem 2. School absences Data collected at elementary schools in DeKalb County, GA suggest that each year roughly 25% of students miss exactly one day of school, 15% miss 2 days, and 28% miss 3 or more days due to sickness.
Stu_No_Mised <- 1 - (.25+.15+.28)
Stu_No_Mised
## [1] 0.32
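# probability a student didn't miss any school due to sickness
# probability a student missed no more than one day
Stu_Miss1 <- 1 - (.15+.28)
Stu_Miss1
## [1] 0.57
# probability a student missed at least one day
Stu_w_Miss <- .25+.15+.28
Stu_w_Miss
## [1] 0.68
is the probability that at least one day was missed.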
Neither_Misses <- .32 * .32
Neither_Misses
## [1] 0.1024
is the probability that neither kid misses school, assuming the two kids' absences are independent.
Both_Miss <- .68 * .68
Both_Miss
## [1] 0.4624
is the probability that both kids will miss some school, under the same independence assumption.
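Yes. The assumption that the two kids' absences are independent is reasonable as a first approximation, though it may not hold well for siblings in the same household, who can pass an illness to each other.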
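The steps in this problem can be collected into one sketch. The vector `p_days` and its labels are names introduced here for illustration, and the two-kid calculations assume the kids' absences are independent:

```r
# Distribution of sick days from the problem statement; "none" is the complement
p_days <- c(one = 0.25, two = 0.15, three_plus = 0.28)
p_days <- c(none = 1 - sum(p_days), p_days)

p_none <- p_days[["none"]]   # no days missed
p_some <- 1 - p_none         # at least one day missed

# Two kids, assuming independence
neither  <- p_none^2
both     <- p_some^2
only_one <- 2 * p_none * p_some

# The three cases cover every possibility, so they should sum to 1
neither + both + only_one
```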
Problem 3. Health coverage, relative frequencies The Behavioral Risk Factor Surveillance System (BRFSS) is an annual telephone survey designed to identify risk factors in the adult population and report emerging health trends. The following table displays the distribution of health status of respondents to this survey (excellent, very good, good, fair, poor) and whether or not they have health insurance.
mat=matrix(c(.023, 0.0364, 0.0427, 0.0192, 0.0050,0.2099, 0.3123 ,0.2410 ,0.0817,0.0289), byrow=TRUE, nrow=2)
colnames(mat)=c("Excellent", "Very Good","Good", "Fair","Poor")
rownames(mat)=c("No Coverage","Coverage")
mat
## Excellent Very Good Good Fair Poor
## No Coverage 0.0230 0.0364 0.0427 0.0192 0.0050
## Coverage 0.2099 0.3123 0.2410 0.0817 0.0289
Not_Mutually_Ex <- mat[2,1]
Not_Mutually_Ex
## [1] 0.2099
of respondents have excellent health AND health coverage, so the two events are not mutually exclusive.
Ans_b <- mat[1,1] + mat[2,1]
Ans_b
## [1] 0.2329
is the probability of a random individual having excellent health.
Sum_w_Insurance<- mat[2,1]+mat[2,2]+ mat[2,3]+ mat[2,4]+ mat[2,5]
Excel_w_Ins <- mat[2,1]
Ans_c <- Excel_w_Ins/Sum_w_Insurance
Ans_c
## [1] 0.2402152
is the probability that a randomly chosen person has excellent health given that they have health coverage.
Sum_wo_Insurance<- mat[1,1]+mat[1,2]+ mat[1,3]+ mat[1,4]+ mat[1,5]
#Sum_wo_Insurance
Excel_wo_Ins <- mat[1,1]
#Excel_wo_Ins
Ans_d <- Excel_wo_Ins/Sum_wo_Insurance
Ans_d
## [1] 0.1821061
is the probability that a randomly chosen person has excellent health given that they do not have health coverage.
total_w_Insurance<- mat[2,1]+mat[2,2]+ mat[2,3]+ mat[2,4]+ mat[2,5]
#total_w_Insurance
total_w_Exc_Health = mat[1,1] + mat[2,1]
#total_w_Exc_Health
ExcelHealthMultIns <- total_w_Insurance * total_w_Exc_Health
ExcelHealthMultIns
## [1] 0.203508
Excel_w_Ins <- mat[2,1]
print('are not equal')
## [1] "are not equal"
Excel_w_Ins
## [1] 0.2099
print('so the two events are NOT independent')
## [1] "so the two events are NOT independent"
Not independent: if the events were independent, P(excellent health and coverage) = 0.2099 would equal P(excellent health) x P(coverage) = 0.203508.
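The marginal, conditional, and independence calculations above can also be done for the whole table at once. A sketch using base R; the names `coverage_marg`, `health_marg`, `cond_on_coverage`, and `indep_joint` are introduced here for illustration:

```r
mat <- matrix(c(0.0230, 0.0364, 0.0427, 0.0192, 0.0050,
                0.2099, 0.3123, 0.2410, 0.0817, 0.0289),
              byrow = TRUE, nrow = 2,
              dimnames = list(c("No Coverage", "Coverage"),
                              c("Excellent", "Very Good", "Good", "Fair", "Poor")))

coverage_marg <- rowSums(mat)   # P(coverage status)
health_marg   <- colSums(mat)   # P(health status)

# Conditional distribution of health status within each coverage group
cond_on_coverage <- prop.table(mat, margin = 1)

# Joint probabilities that independence would imply
indep_joint <- outer(coverage_marg, health_marg)

# Nonzero differences indicate departure from independence
round(mat - indep_joint, 4)
```

The "Excellent" column of `cond_on_coverage` reproduces the two conditional probabilities computed above.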
Problem 4. Exit Poll. Edison Research gathered exit poll results from several sources for the Wisconsin recall election of Scott Walker. They found that 53% of the respondents voted in favor of Scott Walker. Additionally, they estimated that of those who did vote in favor for Scott Walker, 37% had a college degree, while 44% of those who voted against Scott Walker had a college degree. Suppose we randomly sampled a person who participated in the exit poll and found that he had a college degree. What is the probability that he voted in favor of Scott Walker?
Answer 4) Assumption: anyone who didn’t vote for Scott Walker voted against him.
Votedfor <- .53
Votedagainst <-.47
VotedforwCollege <-.37
VotedagainstwCollege <-.44
ProbforSWwCollege <- (Votedfor * VotedforwCollege) / ((Votedfor * VotedforwCollege) + (Votedagainst*VotedagainstwCollege))
ProbforSWwCollege
## [1] 0.4867213
is the probability that the respondent voted for Scott Walker, given that he has a college degree.
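The calculation above is Bayes' theorem with two complementary hypotheses. A small sketch that wraps it in a function; `posterior` and its argument names are hypothetical, introduced here only for illustration:

```r
# Bayes' theorem for a hypothesis H and its complement (hypothetical helper)
#   prior:     P(H)
#   lik_h:     P(E | H)
#   lik_not_h: P(E | not H)
posterior <- function(prior, lik_h, lik_not_h) {
  numerator <- prior * lik_h
  numerator / (numerator + (1 - prior) * lik_not_h)
}

# H = voted for Scott Walker, E = has a college degree
posterior(prior = 0.53, lik_h = 0.37, lik_not_h = 0.44)
```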
Problem 5. Books on a bookshelf The table below shows the distribution of books on a bookcase based on whether they are nonfiction or fiction and hardcover or paperback.
mymat2=matrix(c(13,59,15,8),nrow=2,byrow=TRUE)
colnames(mymat2)=c("hard","paper")
rownames(mymat2)=c("fiction","nonfiction")
mymat2
## hard paper
## fiction 13 59
## nonfiction 15 8
(a) Find the probability of drawing a hardcover book first then a paperback fiction book second when drawing without replacement.
The probability of drawing a hardcover book first = 28/95 = .2947 = 29.47%
The probability of then drawing a paperback fiction book = 59/94 = .6277 = 62.77%
FH <-13
NFH <-15
FP <-59
NFP <-8
TotalHBooks<- FH + NFH
#TotalHBooks
TotalPBooks<- FP + NFP
#TotalPBooks
TotAllBooks = sum(c(13,15,59,8))
#TotAllBooks
ProbDrawH <- TotalHBooks/TotAllBooks
#ProbDrawH
#FP
#TotAllBooks
NewTotAllBooks = TotAllBooks -1
#NewTotAllBooks
ProbDrawPF <- FP/NewTotAllBooks
#ProbDrawPF
ProbSecondwoReplace<-ProbDrawH * ProbDrawPF
ProbSecondwoReplace
## [1] 0.1849944
is the probability of drawing a hardcover book first and a paperback fiction book second, without replacement.
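The product in (a) can be cross-checked by listing every ordered pair of distinct books and counting the pairs that match. A sketch; the data frame `books` and the names `pairs` and `hit` are introduced here for illustration:

```r
# One row per book, built from the counts in the table
counts <- c(13, 59, 15, 8)
books <- data.frame(
  genre = rep(c("fiction", "fiction", "nonfiction", "nonfiction"), times = counts),
  cover = rep(c("hard", "paper", "hard", "paper"), times = counts)
)
n <- nrow(books)

# Every ordered (first, second) draw without replacement
pairs <- expand.grid(first = seq_len(n), second = seq_len(n))
pairs <- pairs[pairs$first != pairs$second, ]

# First book is a hardcover, second is a paperback fiction book
hit <- books$cover[pairs$first] == "hard" &
  books$genre[pairs$second] == "fiction" &
  books$cover[pairs$second] == "paper"

mean(hit)   # share of equally likely ordered pairs that match
```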
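# (b) fiction book first, then a hardcover book, without replacement
ProbAllFic <- (FH +FP) / TotAllBooks
#ProbAllFic
# The number of hardcovers left depends on whether the first (fiction) book was a hardcover, so split into two cases.
ProbFicHardThenHard <- (FH / TotAllBooks) * ((TotalHBooks - 1) / NewTotAllBooks)
ProbFicPaperThenHard <- (FP / TotAllBooks) * (TotalHBooks / NewTotAllBooks)
ProbFicthenHNoReplace <- ProbFicHardThenHard + ProbFicPaperThenHard
ProbFicthenHNoReplace
## [1] 0.2243001
is the probability of fiction first then hardcover with no replacement, found by adding the probabilities of the two cases for the first book's cover.
(c) Calculate the probability of the scenario in part (b), except this time complete the calculations under the scenario where the first book is placed back on the bookcase before randomly drawing the second book.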
#TotalHBooks
#TotAllBooks
ProbHardwReplace <- (TotalHBooks/ TotAllBooks)
ProbHardwReplace
## [1] 0.2947368
ProbFicthenHYesReplace = ProbAllFic*ProbHardwReplace
ProbFicthenHYesReplace
## [1] 0.2233795
The probability is 0.2233795 when replacing the first book.
Problem 6. Is it worth it? Andy is always looking for ways to make money fast. Lately, he has been trying to make money by gambling. Here is the game he is considering playing: The game costs 2 dollars to play. He draws a card from a deck. If he gets a number card (2-10), he wins nothing. For any face card (jack, queen or king), he wins 3 dollars. For any ace, he wins 5 dollars and he wins an extra $20 if he draws the ace of clubs. (a) Create a probability model and find Andy’s expected profit per game.
# Profit = winnings minus the 2 dollar cost. Outcomes must not overlap, so the ace of clubs (5 + 20 - 2 = 23) is counted separately from the other three aces.
Prob_Neg2_Profit_2to10 <- 36/52
Prob_Neg2_Profit_2to10
## [1] 0.6923077
Prob_1_Profit_JQK <- 12/52
Prob_1_Profit_JQK
## [1] 0.2307692
Prob_3_Profit_OtherAces <- 3/52
Prob_3_Profit_OtherAces
## [1] 0.05769231
Prob_23_Profit_AceofClubs <- 1/52
Prob_23_Profit_AceofClubs
## [1] 0.01923077
P <- (-2)*(Prob_Neg2_Profit_2to10)+(1)*(Prob_1_Profit_JQK)+(3)*(Prob_3_Profit_OtherAces)+(23)*(Prob_23_Profit_AceofClubs)
P
## [1] -0.5384615
The expected profit is negative, a loss of about 54 cents per game. I would not recommend this game to Andy as a good way to make money.
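A cross-check that builds the probability model card by card instead of by hand. This sketch assumes the reading that the ace of clubs pays 5 + 20 = 25 dollars in total, and the names `deck`, `winnings`, `profit`, and `expected_profit` are introduced here for illustration:

```r
# Build a 52-card deck: 13 ranks in each of 4 suits
ranks <- c(2:10, "J", "Q", "K", "A")
suits <- c("clubs", "diamonds", "hearts", "spades")
deck <- expand.grid(rank = ranks, suit = suits, stringsAsFactors = FALSE)

# Winnings per card (assumed reading: the ace of clubs pays 5 + 20 = 25)
winnings <- rep(0, nrow(deck))
winnings[deck$rank %in% c("J", "Q", "K")] <- 3
winnings[deck$rank == "A"] <- 5
winnings[deck$rank == "A" & deck$suit == "clubs"] <- 25

# Profit subtracts the 2 dollar cost; every card is equally likely
profit <- winnings - 2
expected_profit <- mean(profit)

table(profit) / nrow(deck)   # the probability model
expected_profit
```

Because each card appears exactly once, no outcome can be counted twice.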
Problem 7. Scooping ice cream. Ice cream usually comes in 1.5 quart boxes (48 fluid ounces), and ice cream scoops hold about 2 ounces. However, there is some variability in the amount of ice cream in a box as well as the amount of ice cream scooped out. We represent the amount of ice cream in the box as X and the amount scooped out as Y. Suppose these random variables have the following means, standard deviations, and variances:
mymat3=matrix(c(48,1,1, 2,.25,.0625), nrow=2, byrow=TRUE)
colnames(mymat3)=c("mean", "SD", "Var")
rownames(mymat3)=c("X, In Box","Y, Scooped")
mymat3
## mean SD Var
## X, In Box 48 1.00 1.0000
## Y, Scooped 2 0.25 0.0625
Box_served_oz <- 48
Scoop_oz <- 3*2
Total_served <- Box_served_oz + Scoop_oz
Total_served
## [1] 54
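is the total served.
Var_of_box <- mymat3[1,3]
Var_of_scoops <- mymat3[2,3]*3
#Var_of_box
#Var_of_scoops
# Standard deviations do not add. For independent amounts the variances add, and the SD is the square root of the total variance.
SD_Total <- sqrt(Var_of_box + Var_of_scoops)
SD_Total
## [1] 1.089725
is the standard deviation of the amount served, assuming the box and the three scoops are independent.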
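Scoop_rm <- 2
# Expected amount left = mean in box - mean of one scoop (the variance is not subtracted)
print( mymat3[1,1] - Scoop_rm)
## [1] 46
ounces are expected to be left in the box after one scoop.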
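# Column 3 of mymat3 holds variances; they add even though the scoop is subtracted
scoop_var <- mymat3[2,3]
#scoop_var
Box_var <- mymat3[1,3]
#Box_var
SD_left_Box <- sqrt(Box_var+scoop_var)
SD_left_Box
## [1] 1.030776
is the standard deviation of the amount left in the box.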
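The rule behind this last step is that variances add for both sums and differences of independent random variables. A sketch with a hypothetical helper `sd_combined`, followed by a simulation check that assumes normal distributions (an assumption made only for the check; the problem gives just means and SDs):

```r
# SD of a sum or difference of independent random variables (hypothetical helper).
# Variances add in both cases, so only the SDs of the terms are needed.
sd_combined <- function(sds) sqrt(sum(sds^2))

sd_box   <- 1      # SD of X, the amount in the box
sd_scoop <- 0.25   # SD of Y, the amount in one scoop

# X - Y: amount left after one scoop
sd_combined(c(sd_box, sd_scoop))

# Simulation check under the normality assumption
set.seed(1)
left <- rnorm(1e5, mean = 48, sd = sd_box) - rnorm(1e5, mean = 2, sd = sd_scoop)
sd(left)
```

The simulated SD should land close to the exact value from `sd_combined`.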