Problem 1. Dice Rolls. If you roll a pair of fair dice, what is the probability of...

d1 <- c(1,2,3,4,5,6)
d2 <- c(1,2,3,4,5,6)
p <- 6*6 # p is the total number of equally likely outcomes for two dice

# (a) getting a sum of 1?
# There are 0 ways we can get a sum of 1 from two dice
p1 <- 0/p 
paste("Probability of getting a sum of 1 is", p1)
## [1] "Probability of getting a sum of 1 is 0"
# (b) getting a sum of 5?
# There are 4 possible ways of getting a sum of 5: (1,4), (4,1), (2,3), (3,2)
p5 <- round(4/p, 2)
paste("Probability of getting a sum of 5 is", p5)
## [1] "Probability of getting a sum of 5 is 0.11"
# (c) getting a sum of 12?
# There is 1 possible way of getting a sum of 12: (6,6)
p12 <- round(1/p, 2)
paste("Probability of getting a sum of 12 is", p12)
## [1] "Probability of getting a sum of 12 is 0.03"
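
The counts used above (0, 4 and 1 favorable outcomes out of 36) can be double-checked by listing every outcome. The sketch below is one way to do that; the names `rolls` and `prob_sum` are illustrative and not part of the original solution.

```r
# Enumerate all 36 equally likely outcomes of two fair dice
rolls <- expand.grid(die1 = 1:6, die2 = 1:6)
rolls$total <- rolls$die1 + rolls$die2

# Probability of a given sum = share of outcomes whose total equals it
prob_sum <- function(s) mean(rolls$total == s)

# Unrounded probabilities for sums of 1, 5 and 12
sapply(c(1, 5, 12), prob_sum)
```

Because `prob_sum` counts outcomes directly, it works for any target sum without listing the favorable pairs by hand.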

Problem 2. School absence Data collected at elementary schools in DeKalb County, GA suggest that each year roughly 25% of students miss exactly one day of school, 15% miss 2 days, and 28% miss 3 or more days due to sickness.

d1 <- .25 # d1 represents probability of students missing exactly 1 day
d2 <- .15 # d2 represents probability of students missing exactly 2 days
d3 <- .28 # d3 represents probability of students missing 3 days or more

#(a) What is the probability that a student chosen at random doesn't miss any days of school due to sickness this year?
# d1, d2 and d3 cover every way a student can miss at least one day, so the remaining outcome (zero days) has probability 1 - (d1+d2+d3)
d0 <- 1-(d1+d2+d3) # d0 represents the probability of missing zero days
paste("Probability that a student chosen at random doesn't miss any days is", d0)
## [1] "Probability that a student chosen at random doesn't miss any days is 0.32"
# (b) What is the probability that a student chosen at random misses no more than one day?
# no more than one day meaning 0 or 1 day in this case
paste("Probability that a student chosen at random misses no more than one day is", d0+d1)
## [1] "Probability that a student chosen at random misses no more than one day is 0.57"
# (c) What is the probability that a student chosen at random misses at least one day?
# at least one day meaning 1, 2, 3 or more days in this case
paste("Probability that a student chosen at random misses at least one day is", d1+d2+d3)
## [1] "Probability that a student chosen at random misses at least one day is 0.68"
# (d) If a parent has two kids, what is the probability that neither kid will miss any school?
# each kid misses no school with probability d0; assuming the two kids' absences are independent, the probability that neither misses school is the product d0*d0
paste("Probability that neither kid will miss any school is", d0*d0)
## [1] "Probability that neither kid will miss any school is 0.1024"
# (e) If a parent has two kids, what is the probability that both kids will miss some school, i.e. at least one day?
miss <- 1-d0 # miss represents the probability of missing at least one day
# assuming independence again, the probability that both kids miss at least one day is miss*miss
paste("Probability that both kids will miss some school is", miss*miss)
## [1] "Probability that both kids will miss some school is 0.4624"
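# (f) If you made an assumption in part (d) or (e), do you think it was reasonable?
# Parts (d) and (e) multiply probabilities, which assumes the two kids' absences are independent. That is probably not reasonable: siblings live in the same household, so an illness that keeps one kid home is likely to spread to the other.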
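
As a consistency check on Problem 2, the whole distribution can be stored in one named vector. This is only a sketch: the names `days` and `p_exactly_one_kid` are made up for illustration, and the two-kid lines rely on the same independence assumption used in parts (d) and (e).

```r
# Distribution of sick days missed per student (0.32 is 1 minus the three given values)
days <- c(none = 0.32, one = 0.25, two = 0.15, three_plus = 0.28)
stopifnot(isTRUE(all.equal(sum(days), 1)))  # a valid distribution sums to 1

p_at_most_one  <- sum(days[c("none", "one")])  # part (b)
p_at_least_one <- 1 - days[["none"]]           # part (c), via the complement rule

# Two kids, ASSUMING their absences are independent
p_neither         <- days[["none"]]^2
p_both            <- p_at_least_one^2
p_exactly_one_kid <- 2 * days[["none"]] * p_at_least_one

# The three two-kid cases are exhaustive, so they should add up to 1
p_neither + p_both + p_exactly_one_kid
```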

Problem 3. Health coverage, relative frequencies The Behavioral Risk Factor Surveillance System (BRFSS) is an annual telephone survey designed to identify risk factors in the adult population and report emerging health trends. The following table displays the distribution of health status of respondents to this survey (excellent, very good, good, fair, poor) and whether or not they have health insurance.

mat=matrix(c(.023, 0.0364, 0.0427, 0.0192, 0.0050,0.2099, 0.3123 ,0.2410 ,0.0817,0.0289), byrow=TRUE, nrow=2)
colnames(mat)=c("Excellent", "Very Good","Good", "Fair","Poor")
rownames(mat)=c("No Coverage","Coverage")
mat
##             Excellent Very Good   Good   Fair   Poor
## No Coverage    0.0230    0.0364 0.0427 0.0192 0.0050
## Coverage       0.2099    0.3123 0.2410 0.0817 0.0289
noCov <- sum(mat[1,]) # probability of no coverage
cov <- sum(mat[2,]) # probability of coverage
excellent <- sum(mat[,1]) # probability of excellent
veryGood <- sum(mat[,2]) # probability of very good
good <- sum(mat[,3]) # probability of good
fair <- sum(mat[,4]) # probability of fair
poor <- sum(mat[,5]) # probability of poor
# (a) Are being in excellent health and having health coverage mutually exclusive?
# They are mutually exclusive only if P(excellent and coverage) = 0; the table gives 0.2099, so they are not
eGivenCov <- round(0.2099/cov, 4) # P(excellent | coverage): probability of having coverage and excellent health, divided by probability of coverage

paste("Are being in excellent health and having health coverage mutually exclusive?", mat["Coverage","Excellent"] == 0)
## [1] "Are being in excellent health and having health coverage mutually exclusive? FALSE"
# (b) What is the probability that a randomly chosen individual has excellent health?
paste("Probability that a randomly chosen individual has excellent health is", excellent)
## [1] "Probability that a randomly chosen individual has excellent health is 0.2329"
# (c) What is the probability that a randomly chosen individual has excellent health given that he has health coverage?
paste("Probability that a randomly chosen individual has excellent health given he has coverage is", eGivenCov)
## [1] "Probability that a randomly chosen individual has excellent health given he has coverage is 0.2402"
# (d) What is the probability that a randomly chosen individual has excellent health given that he doesn't have health coverage?
eGivenNC <- round(0.0230/noCov, 4) # probability of having no coverage and excellent health, divided by probability of no coverage

paste("Probability that a randomly chosen individual has excellent health given he has no coverage is", eGivenNC)
## [1] "Probability that a randomly chosen individual has excellent health given he has no coverage is 0.1821"
# (e) Do having excellent health and having health coverage appear to be independent?
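# They are independent if prob(A)*prob(B)=prob(AB), or equivalently if prob(excellent | coverage)=prob(excellent)
# Here prob(A)*prob(B) is about 0.2035 while prob(AB) is 0.2099, and 0.2402 (given coverage) differs from 0.1821 (given no coverage)
AtimesB <- excellent*cov
AB <- .2099
# compare with a tolerance rather than identical(), which tests exact floating-point equality
paste("Do having excellent health and having health coverage appear to be independent?", isTRUE(all.equal(AtimesB, AB, tolerance=1e-3)))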
## [1] "Do having excellent health and having health coverage appear to be independent? FALSE"
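
The conditional probabilities in parts (c) and (d) can also be read off a whole table at once. The sketch below rebuilds the joint table under the illustrative name `joint` and uses base R's `prop.table()`; the comparison against the product of the marginals is an informal look at independence, not a formal test.

```r
# Joint relative frequencies from the BRFSS table
joint <- matrix(c(0.0230, 0.0364, 0.0427, 0.0192, 0.0050,
                  0.2099, 0.3123, 0.2410, 0.0817, 0.0289),
                nrow = 2, byrow = TRUE,
                dimnames = list(c("No Coverage", "Coverage"),
                                c("Excellent", "Very Good", "Good", "Fair", "Poor")))

# Divide each row by its row total: P(health status | coverage status)
cond_on_coverage <- prop.table(joint, margin = 1)
round(cond_on_coverage, 4)

# Under independence every joint cell would equal the product of its marginals
expected_if_indep <- outer(rowSums(joint), colSums(joint))
round(joint - expected_if_indep, 4)
```

If health status and coverage were independent, the two rows of `cond_on_coverage` would be nearly identical and the differences in the last table would all be close to zero.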

Problem 4. Exit Poll. Edison Research gathered exit poll results from several sources for the Wisconsin recall election of Scott Walker. They found that 53% of the respondents voted in favor of Scott Walker. Additionally, they estimated that of those who did vote in favor of Scott Walker, 37% had a college degree, while 44% of those who voted against Scott Walker had a college degree. Suppose we randomly sampled a person who participated in the exit poll and found that he had a college degree. What is the probability that he voted in favor of Scott Walker?

vote <- .53  # prob of voting for Walker
cGivenVote <- .37 # prob of having a college degree given that they voted for Walker
voteAndC <- vote*cGivenVote # prob of both having a college degree and voted for Walker

against <- 1-vote # prob of voting against Walker
cGivenAgainst <- .44 # prob of having a college degree given that they voted against Walker
againstAndC <- against*cGivenAgainst # prob of both having a college degree and voted against Walker

paste("Probability that this person who has a college degree voted in favor of Scott Walker is", round(voteAndC/(voteAndC+againstAndC), 4))
## [1] "Probability that this person who has a college degree voted in favor of Scott Walker is 0.4867"
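
The calculation above is an application of Bayes' theorem, and it can be wrapped in a small helper for reuse. The function name `posterior` and its argument names are hypothetical, chosen only for this sketch.

```r
# Bayes' theorem for a two-way split (event A vs. not A):
#   P(A | B) = P(A) P(B | A) / [ P(A) P(B | A) + P(not A) P(B | not A) ]
posterior <- function(prior, lik_given_a, lik_given_not_a) {
  numerator <- prior * lik_given_a
  numerator / (numerator + (1 - prior) * lik_given_not_a)
}

# A = voted for Walker, B = has a college degree
p_walker_given_degree <- posterior(prior = 0.53,
                                   lik_given_a = 0.37,
                                   lik_given_not_a = 0.44)
round(p_walker_given_degree, 4)
```

Note that the posterior is lower than the 53% prior: a college degree is more common among those who voted against Walker, so observing one shifts the probability away from him.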

Problem 5. Books on a bookshelf The table below shows the distribution of books on a bookcase based on whether they are nonfiction or fiction and hardcover or paperback.

mymat2=matrix(c(13,59,15,8),nrow=2,byrow=TRUE)
colnames(mymat2)=c("hard","paper")
rownames(mymat2)=c("fiction","nonfiction")


mymat2
##            hard paper
## fiction      13    59
## nonfiction   15     8
book <- 13+59+15+8 # total number of books
hard <- 13+15 # number of hardcover books
paper <- 59+8 # number of paperback books
fiction <- 13+59 # number of fiction books
nonfiction <- 15+8 # number of nonfiction books

# (a) Find the probability of drawing a hardcover book first then a paperback fiction book second when drawing without replacement.
paste("Probability of drawing a hardcover book first then a paperback fiction book second without replacement is", round(hard/book*59/(book-1), 4))
## [1] "Probability of drawing a hardcover book first then a paperback fiction book second without replacement is 0.185"
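# (b) Determine the probability of drawing a fiction book first and then a hardcover book second, when drawing without replacement.
# The first fiction book may itself be a hardcover, which leaves 27 hardcovers instead of 28, so condition on the first draw:
# P = P(hardcover fiction first)*27/94 + P(paperback fiction first)*28/94
paste("Probability of drawing a fiction book first then a hardcover book second without replacement is", round(13/book*(hard-1)/(book-1) + 59/book*hard/(book-1), 4))
## [1] "Probability of drawing a fiction book first then a hardcover book second without replacement is 0.2243"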
# (c) Calculate the probability of the scenario in part (b), except this time complete the calculations under the scenario where the first book is placed back on the bookcase before randomly drawing the second book.
paste("Probability of drawing a fiction book first then a hardcover book second with replacement is", round(fiction/book*hard/book, 4))
## [1] "Probability of drawing a fiction book first then a hardcover book second with replacement is 0.2234"
# (d) The final answers to parts (b) and (c) are very similar. Explain why this is the case.
paste("Final answers to parts (b) and (c) are very similar because removing one book out of", book, "barely changes the proportion of hardcovers left for the second draw")
## [1] "Final answers to parts (b) and (c) are very similar because removing one book out of 95 barely changes the proportion of hardcovers left for the second draw"
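
Two-draw problems like parts (b) and (c) are easy to get subtly wrong, so a brute-force check over every ordered pair of books can be reassuring. The sketch below assumes each book is equally likely to be drawn; the names `books`, `draws` and `hit` are illustrative.

```r
# One row per book on the shelf
counts <- c(13, 59, 15, 8)
books <- data.frame(
  genre = rep(c("fiction", "fiction", "nonfiction", "nonfiction"), times = counts),
  cover = rep(c("hard", "paper", "hard", "paper"), times = counts)
)
n <- nrow(books)

# Every ordered (first, second) pair of book indices
draws <- expand.grid(first = 1:n, second = 1:n)

# Event: fiction on the first draw, hardcover on the second
hit <- books$genre[draws$first] == "fiction" & books$cover[draws$second] == "hard"

# With replacement: all n*n pairs are possible
p_with_replacement <- mean(hit)

# Without replacement: the same book cannot be drawn twice
p_without_replacement <- mean(hit[draws$first != draws$second])

round(c(without = p_without_replacement, with = p_with_replacement), 4)
```

Dropping the pairs where `first == second` removes exactly the 13 cases in which the same hardcover fiction book would be counted for both draws.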

Problem 6. Is it worth it? Andy is always looking for ways to make money fast. Lately, he has been trying to make money by gambling. Here is the game he is considering playing: The game costs 2 dollars to play. He draws a card from a deck. If he gets a number card (2-10), he wins nothing. For any face card (jack, queen or king), he wins 3 dollars. For any ace, he wins 5 dollars and he wins an extra $20 if he draws the ace of clubs.

# (a) Create a probability model and find Andy's expected profit per game.
deck <- 52 # a deck has 52 cards
numCard <- 9*4/deck # prob of drawing a number card (2-10)
faceCard <- 3*4/deck # prob of drawing a face card
aceCard <- 3/deck # prob of drawing an ace except for ace of clubs
clubAce <- 1/deck # prob of drawing the ace of clubs

profit <- 0*numCard + 3*faceCard + 5*aceCard + (5+20)*clubAce - 2
paste("Andy's expected profit per game is $", round(profit,2))
## [1] "Andy's expected profit per game is $ -0.54"
# (b) Would you recommend this game to Andy as a good way to make money? Explain.
ifelse(profit>0, "I'd recommend this game to Andy because its expected profit is positive, so he would make money in the long run.", "I'd not recommend this game to Andy because its expected profit is negative, so he would lose money in the long run.")
## [1] "I'd not recommend this game to Andy because its expected profit is negative, so he would lose money in the long run."
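
Part (a) asks for a probability model, which can be laid out explicitly as a table of outcomes, net profits and probabilities. The sketch below is one possible layout; the data frame `model` and the extra standard deviation are additions for illustration and were not requested by the problem.

```r
# Probability model for one play: net profit = winnings minus the $2 entry fee
model <- data.frame(
  outcome = c("number card (2-10)", "face card", "ace (not clubs)", "ace of clubs"),
  profit  = c(0, 3, 5, 5 + 20) - 2,
  prob    = c(36, 12, 3, 1) / 52
)
stopifnot(isTRUE(all.equal(sum(model$prob), 1)))  # outcomes cover the whole deck

# Expected profit: probability-weighted average of the profits
expected_profit <- sum(model$profit * model$prob)

# Standard deviation of the profit, to show how much a single game can swing
sd_profit <- sqrt(sum((model$profit - expected_profit)^2 * model$prob))

model
round(expected_profit, 2)
```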

Problem 7. Scooping ice cream. Ice cream usually comes in 1.5 quart boxes (48 fluid ounces), and ice cream scoops hold about 2 ounces. However, there is some variability in the amount of ice cream in a box as well as the amount of ice cream scooped out. We represent the amount of ice cream in the box as X and the amount scooped out as Y . Suppose these random variables have the following means, standard deviations, and variances:

mymat3=matrix(c(48,1,1, 2,.25,.0625), nrow=2, byrow=TRUE)
colnames(mymat3)=c("mean", "SD", "Var")
rownames(mymat3)=c("X, In Box","Y, Scooped")
mymat3
##            mean   SD    Var
## X, In Box    48 1.00 1.0000
## Y, Scooped    2 0.25 0.0625
# (a) An entire box of ice cream, plus 3 scoops from a second box is served at a party. How much ice cream do you expect to have been served at this party? What is the standard deviation of the amount of ice cream served?
paste("I expect", 48+3*2, "ounces of ice cream have been served at this party.")
## [1] "I expect 54 ounces of ice cream have been served at this party."
paste("The standard deviation of the amount of ice cream served is", round(sqrt(1+.0625*3),4))
## [1] "The standard deviation of the amount of ice cream served is 1.0897"
# (b) How much ice cream would you expect to be left in the box after scooping out one scoop of ice cream? That is, find the expected value of X - Y. What is the standard deviation of the amount left in the box?
paste("I expect", 48-2, "ounces of ice cream left in the box after scooping out one scoop.")
## [1] "I expect 46 ounces of ice cream left in the box after scooping out one scoop."
paste("The standard deviation of the amount left in the box is", round(sqrt(1+0.0625),4))
## [1] "The standard deviation of the amount left in the box is 1.0308"
# (c) Using the context of this exercise, explain why we add variances when we subtract one random variable from another.
print("We add variances when we subtract Y from X because, assuming the amount in the box and the amount scooped vary independently, the uncertainty in the scoop adds to the uncertainty about what is left in the box rather than cancelling it.")
## [1] "We add variances when we subtract Y from X because, assuming the amount in the box and the amount scooped vary independently, the uncertainty in the scoop adds to the uncertainty about what is left in the box rather than cancelling it."
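
A quick simulation can make the variance rule in part (c) concrete. The sketch below assumes X and Y are independent and normally distributed; the problem only gives means and standard deviations, so the normal shape is an assumption made purely for illustration.

```r
set.seed(1)  # for reproducibility
n <- 100000

# ASSUMPTION: independent normal draws with the stated means and SDs
x <- rnorm(n, mean = 48, sd = 1)     # ounces in the box
y <- rnorm(n, mean = 2,  sd = 0.25)  # ounces in one scoop

left <- x - y  # amount left after one scoop

# Simulated mean and variance of X - Y next to the theoretical values
c(sim_mean = mean(left), theory_mean = 48 - 2)
c(sim_var = var(left), theory_var = 1 + 0.0625)
```

The simulated variance of `left` should land near 1.0625 rather than 1 - 0.0625, since subtracting a variable that fluctuates on its own can only widen the spread of the result.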