Homework 2

Problem 1. Dice Rolls. If you roll a pair of fair dice, what is the probability of...

dice_sum <- c(2,3,4,5,6,7,8,9,10,11,12)
dice_roll_prob <- c(1,2,3,4,5,6,5,4,3,2,1)/36
df <- data.frame(dice_sum, dice_roll_prob)
names(df) <- c("Sums", "Probability")
df
##    Sums Probability
## 1     2  0.02777778
## 2     3  0.05555556
## 3     4  0.08333333
## 4     5  0.11111111
## 5     6  0.13888889
## 6     7  0.16666667
## 7     8  0.13888889
## 8     9  0.11111111
## 9    10  0.08333333
## 10   11  0.05555556
## 11   12  0.02777778
  (a) getting a sum of 1?
# Is 1 among the possible sums of two dice?
is.element(1, dice_sum)
## [1] FALSE
print("Probability is zero for a sum of 1 with two dice. The minimum sum is 2.")
## [1] "Probability is zero for a sum of 1 with two dice. The minimum sum is 2."
  (b) getting a sum of 5?
y<- df[4,]
y
##   Sums Probability
## 4    5   0.1111111
  (c) getting a sum of 12?
z<- df[11,]
z
##    Sums Probability
## 11   12  0.02777778
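
The probability table above can be cross-checked by listing all 36 equally likely outcomes of two dice. The following is a minimal sketch, not part of the original solution; the names `rolls` and `sum_probs` are introduced here only for illustration.

```r
# Enumerate all 36 equally likely outcomes of two fair dice.
rolls <- outer(1:6, 1:6, "+")   # 6 x 6 matrix holding the sum for each outcome

# Probability of each possible sum = share of outcomes that produce it.
sum_probs <- sapply(2:12, function(s) mean(rolls == s))
names(sum_probs) <- 2:12

mean(rolls == 1)     # (a) P(sum = 1): no outcome qualifies
sum_probs[["5"]]     # (b) P(sum = 5)
sum_probs[["12"]]    # (c) P(sum = 12)
```

Counting outcomes this way avoids typing the probabilities by hand, so it also guards against a transcription slip in `dice_roll_prob`.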

Problem 2. School absences Data collected at elementary schools in DeKalb County, GA suggest that each year roughly 25% of students miss exactly one day of school, 15% miss 2 days, and 28% miss 3 or more days due to sickness.

  (a) What is the probability that a student chosen at random doesn’t miss any days of school due to sickness this year?
Stu_No_Mised <- 1 - (.25+.15+.28)
Stu_No_Mised
## [1] 0.32
# probability didn't miss school due to sickness
  (b) What is the probability that a student chosen at random misses no more than one day?
Stu_Miss1<- 1- (.15+.28)
Stu_Miss1
## [1] 0.57
  (c) What is the probability that a student chosen at random misses at least one day?
Stu_w_Miss <- .25+.15+.28
Stu_w_Miss
## [1] 0.68

0.68 is the probability that a student misses at least one day.

  (d) If a parent has two kids at a DeKalb County elementary school, what is the probability that neither kid will miss any school? Note any assumption you must make to answer this question.

Assumption: the two kids miss school independently of each other, so the multiplication rule P(A and B) = P(A) x P(B) applies. The probability of one child not missing school, from part (a), is 0.32, and the same probability is used for the second child (B).
Neither_Misses <- .32 * .32
Neither_Misses
## [1] 0.1024

0.1024 is the probability that neither kid misses school.

  (e) If a parent has two kids at a DeKalb County elementary school, what is the probability that both kids will miss some school, i.e. at least one day? Note any assumption you make.

Assumption: the two kids miss school independently, so we use the probability from part (c), 0.68, for each kid and the multiplication rule P(A and B) = P(A) x P(B).
Both_Miss <- .68 * .68
Both_Miss
## [1] 0.4624

0.4624 is the probability that both kids will miss some school.

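  (f) If you made an assumption in part (d) or (e), do you think it was reasonable? If you didn’t make any assumptions, double check your earlier answers.

Only partly. The multiplication rule requires the two kids' absences to be independent, but siblings share a household, so when one is sick the other is more likely to be sick too. The answers to (d) and (e) are best read as approximations.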
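
The calculations in parts (a) through (e) can be collected into one distribution. The sketch below is an illustration only; the vector `p_days` and the other names are hypothetical, and the last two lines rely on the independence assumption stated in parts (d) and (e).

```r
# Distribution of sick days missed; the "0" entry follows from the complement rule.
p_days <- c("1" = 0.25, "2" = 0.15, "3+" = 0.28)
p_days <- c("0" = 1 - sum(p_days), p_days)

p_none         <- p_days[["0"]]                   # (a) misses no days
p_at_most_one  <- p_days[["0"]] + p_days[["1"]]   # (b) misses no more than one day
p_at_least_one <- 1 - p_days[["0"]]               # (c) misses at least one day

# (d), (e): multiplication rule, valid only if the two kids are independent.
p_neither <- p_none^2
p_both    <- p_at_least_one^2
```

Keeping the whole distribution in one named vector makes it easy to confirm that the probabilities sum to 1 before any of them are used.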

Problem 3. Health coverage, relative frequencies The Behavioral Risk Factor Surveillance System (BRFSS) is an annual telephone survey designed to identify risk factors in the adult population and report emerging health trends. The following table displays the distribution of health status of respondents to this survey (excellent, very good, good, fair, poor) and whether or not they have health insurance.

mat=matrix(c(.023, 0.0364, 0.0427, 0.0192, 0.0050,0.2099, 0.3123 ,0.2410 ,0.0817,0.0289), byrow=TRUE, nrow=2)
colnames(mat)=c("Excellent", "Very Good","Good", "Fair","Poor")
rownames(mat)=c("No Coverage","Coverage")
mat
##             Excellent Very Good   Good   Fair   Poor
## No Coverage    0.0230    0.0364 0.0427 0.0192 0.0050
## Coverage       0.2099    0.3123 0.2410 0.0817 0.0289

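  (a) Are being in excellent health and having health coverage mutually exclusive?

No. The two events can occur together:
Not_Mutually_Ex <- mat[2,1]
Not_Mutually_Ex
## [1] 0.2099

0.2099 is the probability of having excellent health AND health coverage. Because it is greater than zero, the events are not mutually exclusive.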

  (b) What is the probability that a randomly chosen individual has excellent health?
Ans_b <- mat[1,1] + mat[2,1]
Ans_b
## [1] 0.2329

0.2329 is the probability of a random individual having excellent health.

  (c) What is the probability that a randomly chosen individual has excellent health given that he has health coverage?
Sum_w_Insurance<- mat[2,1]+mat[2,2]+ mat[2,3]+ mat[2,4]+ mat[2,5]
Excel_w_Ins <- mat[2,1]
Ans_c <- Excel_w_Ins/Sum_w_Insurance
Ans_c 
## [1] 0.2402152

0.2402152 is the probability that a randomly chosen person has excellent health given that he has health coverage.

  (d) What is the probability that a randomly chosen individual has excellent health given that he doesn’t have health coverage?
Sum_wo_Insurance<- mat[1,1]+mat[1,2]+ mat[1,3]+ mat[1,4]+ mat[1,5]
#Sum_wo_Insurance
Excel_wo_Ins <- mat[1,1]
#Excel_wo_Ins
Ans_d <- Excel_wo_Ins/Sum_wo_Insurance
Ans_d 
## [1] 0.1821061

0.1821061 is the probability that a randomly chosen person has excellent health given that he doesn’t have health coverage.

  (e) Do having excellent health and having health coverage appear to be independent?
total_w_Insurance<- mat[2,1]+mat[2,2]+ mat[2,3]+ mat[2,4]+ mat[2,5]
#total_w_Insurance
total_w_Exc_Health = mat[1,1] + mat[2,1]
#total_w_Exc_Health
ExcelHealthMultIns <- total_w_Insurance * total_w_Exc_Health
ExcelHealthMultIns
## [1] 0.203508
Excel_w_Ins <- mat[2,1]
print('are not equal')
## [1] "are not equal"
Excel_w_Ins
## [1] 0.2099
print('so the two events are NOT independent')
## [1] "so the two events are NOT independent"

Not independent: if the two events were independent, P(excellent health and coverage) = 0.2099 would equal P(excellent health) x P(coverage) = 0.203508.
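
The marginal and conditional probabilities used in parts (b) through (e) can also be obtained for every cell at once. This is a sketch under the assumption that the table holds joint relative frequencies; `joint`, `cond_health` and `expected_if_indep` are names introduced here for illustration.

```r
# Joint distribution from the problem statement (rows: coverage, columns: health status).
joint <- matrix(c(0.0230, 0.0364, 0.0427, 0.0192, 0.0050,
                  0.2099, 0.3123, 0.2410, 0.0817, 0.0289),
                nrow = 2, byrow = TRUE,
                dimnames = list(c("No Coverage", "Coverage"),
                                c("Excellent", "Very Good", "Good", "Fair", "Poor")))

p_coverage <- rowSums(joint)   # marginal P(coverage status)
p_health   <- colSums(joint)   # marginal P(health status)

# Conditional distribution of health status within each coverage group: parts (c) and (d).
cond_health <- prop.table(joint, margin = 1)
cond_health[, "Excellent"]

# Under independence every joint cell would equal the product of its marginals: part (e).
expected_if_indep <- outer(p_coverage, p_health)
round(joint - expected_if_indep, 4)
```

Nonzero differences in the last table are what the comparison of 0.2099 with 0.203508 detects for the single cell of excellent health with coverage.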

Problem 4. Exit Poll. Edison Research gathered exit poll results from several sources for the Wisconsin recall election of Scott Walker. They found that 53% of the respondents voted in favor of Scott Walker. Additionally, they estimated that of those who did vote in favor of Scott Walker, 37% had a college degree, while 44% of those who voted against Scott Walker had a college degree. Suppose we randomly sampled a person who participated in the exit poll and found that he had a college degree. What is the probability that he voted in favor of Scott Walker?

Assumption: anyone who did not vote for Scott Walker voted against him.

Votedfor <- .53
Votedagainst <-.47
VotedforwCollege <-.37 
VotedagainstwCollege <-.44

ProbforSWwCollege <- (Votedfor * VotedforwCollege) / ((Votedfor * VotedforwCollege) + (Votedagainst*VotedagainstwCollege))
ProbforSWwCollege
## [1] 0.4867213

0.4867213 is the probability that a respondent voted for Scott Walker given that he has a college degree.
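
The same Bayes' theorem calculation can be read as counts in a hypothetical group of respondents, which some readers find easier to follow. The group size of 10,000 and the variable names below are assumptions made purely for illustration; the result does not depend on the group size.

```r
# Hypothetical group of exit poll respondents (size chosen for easy arithmetic).
n <- 10000

for_college     <- n * 0.53 * 0.37   # voted for Walker and has a college degree
against_college <- n * 0.47 * 0.44   # voted against Walker and has a college degree

# Among all respondents with a degree, the share who voted for Walker.
p_for_given_college <- for_college / (for_college + against_college)
p_for_given_college
```

Restricting attention to the degree holders and asking what share of them voted for Walker is exactly the conditioning step in Bayes' theorem.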

Problem 5. Books on a bookshelf The table below shows the distribution of books on a bookcase based on whether they are nonfiction or fiction and hardcover or paperback.

mymat2=matrix(c(13,59,15,8),nrow=2,byrow=TRUE)
colnames(mymat2)=c("hard","paper")
rownames(mymat2)=c("fiction","nonfiction")
mymat2
##            hard paper
## fiction      13    59
## nonfiction   15     8

  (a) Find the probability of drawing a hardcover book first then a paperback fiction book second when drawing without replacement.

The probability of drawing a hardcover book first = 28/95 = .2947 = 29.47%. The probability of then drawing a paperback fiction book = 59/94 = .6277 = 62.77%.

FH <-13
NFH <-15
FP <-59
NFP <-8
TotalHBooks<- FH + NFH
#TotalHBooks
TotalPBooks<- FP + NFP
#TotalPBooks
TotAllBooks = sum(c(13,15,59,8))
#TotAllBooks
ProbDrawH <- TotalHBooks/TotAllBooks
#ProbDrawH
#FP
#TotAllBooks
NewTotAllBooks = TotAllBooks -1
#NewTotAllBooks
ProbDrawPF <- FP/NewTotAllBooks
#ProbDrawPF
ProbSecondwoReplace<-ProbDrawH * ProbDrawPF
ProbSecondwoReplace
## [1] 0.1849944

0.1849944 is the probability of drawing a hardcover book first and a paperback fiction book second, without replacement.
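
A brute-force check of part (a) is to list every ordered pair of distinct books and count the pairs that match. This is only a sketch; the data frame `books` and the other names are introduced here for illustration and do not appear in the original solution.

```r
# One row per book on the shelf, built from the counts in the table.
counts <- c(13, 59, 15, 8)
books <- data.frame(
  type  = rep(c("fiction", "fiction", "nonfiction", "nonfiction"), times = counts),
  cover = rep(c("hard", "paper", "hard", "paper"), times = counts)
)
n <- nrow(books)

# All ordered (first, second) draws; dropping first == second means no replacement.
pairs <- expand.grid(first = seq_len(n), second = seq_len(n))
pairs <- pairs[pairs$first != pairs$second, ]

hard_first <- books$cover[pairs$first] == "hard"
paper_fiction_second <- books$cover[pairs$second] == "paper" &
  books$type[pairs$second] == "fiction"

# Share of equally likely ordered pairs that satisfy both conditions.
p_hard_then_paper_fiction <- mean(hard_first & paper_fiction_second)
p_hard_then_paper_fiction
```

Because the enumeration tracks which book was removed, the same approach also handles cases where the first draw changes the count available for the second.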

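  (b) Determine the probability of drawing a fiction book first and then a hardcover book second, when drawing without replacement.
ProbAllFic <- (FH + FP) / TotAllBooks
#ProbAllFic
# The chance of a hardcover second depends on the first book's cover:
# a fiction hardcover first leaves 27 hardcovers, a fiction paperback leaves 28.
ProbFicHardThenH <- (FH / TotAllBooks) * ((TotalHBooks - 1) / NewTotAllBooks)
ProbFicPaperThenH <- (FP / TotAllBooks) * (TotalHBooks / NewTotAllBooks)
ProbFicthenHNoReplace <- ProbFicHardThenH + ProbFicPaperThenH
ProbFicthenHNoReplace
## [1] 0.2243001

0.2243001 is the probability of a fiction book first and then a hardcover book with no replacement. The two cases (fiction hardcover first, fiction paperback first) are computed separately because removing a fiction hardcover changes the number of hardcovers left for the second draw.

  (c) Calculate the probability of the scenario in part (b), except this time complete the calculations under the scenario where the first book is placed back on the bookcase before randomly drawing the second book.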

ProbHardwReplace <- TotalHBooks / TotAllBooks
ProbHardwReplace
## [1] 0.2947368
ProbFicthenHYesReplace <- ProbAllFic * ProbHardwReplace
ProbFicthenHYesReplace
## [1] 0.2233795

With replacement the two draws are independent, so the probability of a fiction book first and then a hardcover book is (72/95) x (28/95) = 0.2233795. The value 0.2947368 is only the probability that the second book is a hardcover.

Problem 6. Is it worth it? Andy is always looking for ways to make money fast. Lately, he has been trying to make money by gambling. Here is the game he is considering playing: The game costs 2 dollars to play. He draws a card from a deck. If he gets a number card (2-10), he wins nothing. For any face card (jack, queen or king), he wins 3 dollars. For any ace, he wins 5 dollars and he wins an extra $20 if he draws the ace of clubs.

  (a) Create a probability model and find Andy’s expected profit per game.

# Profit = winnings minus the 2 dollar cost of playing.
Prob_Neg2_Profit_2to10 <- 36/52      # number card: wins 0, profit -2
Prob_Neg2_Profit_2to10
## [1] 0.6923077
Prob_1_Profit_JQK<- 12/52            # face card: wins 3, profit 1
Prob_1_Profit_JQK
## [1] 0.2307692
Prob_3_Profit_Aces <- 3/52           # ace other than clubs: wins 5, profit 3
Prob_3_Profit_Aces
## [1] 0.05769231
Prob_23_Profit_AceofClubs <-1/52     # ace of clubs: wins 5 + 20, profit 23
Prob_23_Profit_AceofClubs
## [1] 0.01923077
  (b) Would you recommend this game to Andy as a good way to make money? Explain.
P <- (-2)*(Prob_Neg2_Profit_2to10)+(1)*(Prob_1_Profit_JQK)+(3)*(Prob_3_Profit_Aces)+(23)*(Prob_23_Profit_AceofClubs)
P
## [1] -0.5384615

The expected profit is negative: Andy loses about 54 cents per game on average. I would not recommend this game to Andy as a good way to make money.
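
One way to check an expected-value calculation like this is to build the full 52-card deck and average the profit over all cards. The sketch below reads the rules as paying 5 + 20 = 25 dollars for the ace of clubs, which is an interpretation of the problem statement; the names `deck`, `winnings` and `profit` are introduced here for illustration.

```r
# Build a standard 52-card deck.
ranks <- c(2:10, "J", "Q", "K", "A")
suits <- c("clubs", "diamonds", "hearts", "spades")
deck <- expand.grid(rank = ranks, suit = suits, stringsAsFactors = FALSE)

# Winnings by card: face cards pay 3, aces pay 5, number cards pay nothing.
winnings <- ifelse(deck$rank %in% c("J", "Q", "K"), 3,
            ifelse(deck$rank == "A", 5, 0))
# The ace of clubs pays the ace prize plus the 20 dollar bonus.
winnings[deck$rank == "A" & deck$suit == "clubs"] <- 5 + 20

profit <- winnings - 2          # every game costs 2 dollars to play

table(profit) / nrow(deck)      # probability model for the profit
mean(profit)                    # expected profit per game
```

Since every card is equally likely, the plain average of `profit` over the deck is the expected profit, and the table shows that the probabilities of the four profit levels sum to 1.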

Problem 7. Scooping ice cream. Ice cream usually comes in 1.5 quart boxes (48 fluid ounces), and ice cream scoops hold about 2 ounces. However, there is some variability in the amount of ice cream in a box as well as the amount of ice cream scooped out. We represent the amount of ice cream in the box as X and the amount scooped out as Y . Suppose these random variables have the following means, standard deviations, and variances:

mymat3=matrix(c(48,1,1, 2,.25,.0625), nrow=2, byrow=TRUE)
colnames(mymat3)=c("mean", "SD", "Var")
rownames(mymat3)=c("X, In Box","Y, Scooped")
mymat3
##            mean   SD    Var
## X, In Box    48 1.00 1.0000
## Y, Scooped    2 0.25 0.0625
  (a) An entire box of ice cream, plus 3 scoops from a second box is served at a party. How much ice cream do you expect to have been served at this party? What is the standard deviation of the amount of ice cream served?
Box_served_oz <- 48
Scoop_oz <- 3*2
Total_served <- Box_served_oz + Scoop_oz
Total_served
## [1] 54

54 ounces is the expected total served.

# Standard deviations do not add; variances of independent quantities do.
Var_of_served <- mymat3[1,3]
Var_of_scoops <- mymat3[2,3]*3   # three independent scoops
#Var_of_served
#Var_of_scoops
SD_Total <- sqrt(Var_of_served + Var_of_scoops)
SD_Total
## [1] 1.089725

1.089725 ounces is the standard deviation of the amount served, assuming the box and the three scoops vary independently.

  (b) How much ice cream would you expect to be left in the box after scooping out one scoop of ice cream? That is, find the expected value of X - Y. What is the standard deviation of the amount left in the box?
Scoop_rm <- 2
# Expected amount left = mean of X minus mean of Y (the variance is not subtracted).
print(mymat3[1,1] - Scoop_rm)
## [1] 46

46 ounces are expected to be left.

scoop_var <- mymat3[2,3]
#scoop_var
Box_var <- mymat3[1,3]
#Box_var
SD_left_Box <- sqrt(Box_var+scoop_var)
SD_left_Box
## [1] 1.030776

1.030776 ounces is the standard deviation of the amount left in the box.
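  (c) Using the context of this exercise, explain why we add variances when we subtract one random variable from another.

The amount in the box and the amount scooped out each vary on their own. Taking a scoop out of the box does not cancel any of that uncertainty: the amount left is uncertain both because the box's starting amount varies and because the scoop's size varies. The two sources of variability therefore combine, and the variances add whether the variables are added or subtracted.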
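
A small simulation can illustrate why variances add under subtraction. The sketch below assumes that X and Y are independent and normally distributed; the problem gives only means and standard deviations, so the normal shape, the seed and the sample size are assumptions made for illustration.

```r
set.seed(1)        # arbitrary seed so the run is reproducible
n_sim <- 100000    # number of simulated boxes and scoops

X <- rnorm(n_sim, mean = 48, sd = 1)      # ounces in the box (assumed normal)
Y <- rnorm(n_sim, mean = 2,  sd = 0.25)   # ounces in one scoop (assumed normal)
left <- X - Y                             # amount left after one scoop

mean(left)               # simulated mean, for comparison with E(X) - E(Y)
sd(left)                 # simulated standard deviation of X - Y
sqrt(1^2 + 0.25^2)       # theoretical value: sqrt(Var(X) + Var(Y))
abs(1 - 0.25)            # what subtracting the standard deviations would give
```

The simulated standard deviation should land close to the square root of the summed variances and well above the difference of the standard deviations, which is the point of part (c).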