Problem 1. Dice Rolls

If you roll a pair of fair dice, what is the probability of...

  1. getting a sum of 1?
  2. getting a sum of 5?
  3. getting a sum of 12?

Known values: With two dice showing 6 possible values each, the total number of possible outcomes is 6 * 6 = 36.

## [1] 1a. With two dice the minimum sum is 2, so a sum of 1 is impossible = 0/36 = 0%
## [1] 1b. 4 possible rolls generate a total of 5 = 4/36 = 11.11%
## [1] 1c. 1 possible roll generates a total of 12 = 1/36 = 2.78%
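
The counts above can be checked by listing all 36 outcomes. This is a minimal sketch; the names `sums` and `p_sum` are introduced here for illustration and are not part of the original solution.

```r
# All 36 equally likely outcomes for two fair six-sided dice
sums <- outer(1:6, 1:6, "+")

# Probability of a given sum = (number of outcomes with that sum) / 36
p_sum <- function(s) sum(sums == s) / length(sums)

p_sum(1)   # no outcome sums to 1
p_sum(5)   # (1,4), (2,3), (3,2), (4,1)
p_sum(12)  # only (6,6)
```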

Problem 2. School absences

Data collected at elementary schools in DeKalb County, GA suggest that each year roughly 25% of students miss exactly one day of school, 15% miss 2 days, and 28% miss 3 or more days due to sickness.

(a) What is the probability that a student chosen at random doesn’t miss any days of school due to sickness this year?

Q2_one <- .25
Q2_two <- .15
Q2_3p <- .28

Q2_a<- 1-(Q2_one + Q2_two + Q2_3p)
Q2_a<- paste(round(Q2_a*100,digits=1),"%")
print(Q2_a)
## [1] "32 %"

(b) What is the probability that a student chosen at random misses no more than one day?

Q2_b <- 1-(Q2_two + Q2_3p)
Q2_b<- paste(round(Q2_b*100,digits=1),"%")
print(Q2_b)
## [1] "57 %"

(c) What is the probability that a student chosen at random misses at least one day?

Q2_c <- (Q2_one + Q2_two + Q2_3p)
Q2_c<- paste(round(Q2_c*100,digits=1),"%")
print(Q2_c)
## [1] "68 %"
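
Parts (a) to (c) all follow from one probability distribution over the number of days missed. The sketch below stores that distribution in a named vector; the vector `absences` and the `p_*` names are illustrative assumptions, not part of the original solution.

```r
# Absence distribution from the problem statement
absences <- c("1" = 0.25, "2" = 0.15, "3+" = 0.28)
# "0" days is filled in by the complement rule
absences <- c("0" = 1 - sum(absences), absences)

p_none         <- absences[["0"]]                    # part (a)
p_at_most_one  <- absences[["0"]] + absences[["1"]]  # part (b)
p_at_least_one <- 1 - absences[["0"]]                # part (c)

round(c(a = p_none, b = p_at_most_one, c = p_at_least_one), 2)
```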

(d) If a parent has two kids at a DeKalb County elementary school, what is the probability that neither kid will miss any school? Note any assumption you must make to answer this question.

P(child1 = no miss n child2 = no miss) = P(child1 = no miss) x P(child2 = no miss). Assume the two children's absences are independent, so the answer is the answer to 2a squared.

Q2_da<- 1-(Q2_one + Q2_two + Q2_3p)
Q2_d<- Q2_da * Q2_da
Q2_d<- paste(round(Q2_d*100,digits=2),"%")
print(Q2_d)
## [1] "10.24 %"

(e) If a parent has two kids at a DeKalb County elementary school, what is the probability that both kids will miss some school, i.e. at least one day? Note any assumption you make.

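P(Child1 miss >= 1 n Child2 miss >= 1) = P(Child1 miss >= 1) x P(Child2 miss >= 1). Assume the two children's absences are independent, so the answer is the answer to 2c squared.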

Q2_ea<-(Q2_one + Q2_two + Q2_3p)
Q2_e<- Q2_ea * Q2_ea
Q2_e<- paste(round(Q2_e*100,digits=2),"%")
print(Q2_e)
## [1] "46.24 %"

(f) If you made an assumption in part (d) or (e), do you think it was reasonable? If you didn’t make any assumptions, double check your earlier answers.

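Parts (d) and (e) both assume that the two children's absences are independent, which is why each answer is the square of the single-child probability. This is probably not a reasonable assumption: siblings live in the same household, so if one child is sick the other is more likely to be sick as well.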

Problem 3. Health coverage, relative frequencies

The Behavioral Risk Factor Surveillance System (BRFSS) is an annual telephone survey designed to identify risk factors in the adult population and report emerging health trends. The following table displays the distribution of health status of respondents to this survey (excellent, very good, good, fair, poor) and whether or not they have health insurance.

mat=matrix(c(.023, 0.0364, 0.0427, 0.0192, 0.0050,0.2099, 0.3123 ,0.2410 ,0.0817,0.0289), byrow=TRUE, nrow=2)
colnames(mat)=c("Excellent", "Very Good","Good", "Fair","Poor")
rownames(mat)=c("No Coverage","Coverage")
mat
##             Excellent Very Good   Good   Fair   Poor
## No Coverage    0.0230    0.0364 0.0427 0.0192 0.0050
## Coverage       0.2099    0.3123 0.2410 0.0817 0.0289

(a) Are being in excellent health and having health coverage mutually exclusive?

No, they are not mutually exclusive. The table shows that 20.99% of respondents are both in excellent health and have health coverage, so the two events can occur together.

(b) What is the probability that a randomly chosen individual has excellent health?

Q3b = (.0230 + 0.2099)/1
Q3b<- paste(round(Q3b*100,digits=2),"%")
print(Q3b)
## [1] "23.29 %"

(c) What is the probability that a randomly chosen individual has excellent health given that he has health coverage?

E = Excellent Health
H = Health Coverage (sum of all with coverage)

P(E|H) = P(H n E)/P(H)

rowSums(mat)
## No Coverage    Coverage 
##      0.1263      0.8738
Q3c <- .2099/.8738
Q3c<- paste(round(Q3c*100,digits=2),"%")
print(Q3c)
## [1] "24.02 %"

(d) What is the probability that a randomly chosen individual has excellent health given that he doesn’t have health coverage?

E = Excellent Health
Nc = No Health Coverage (sum of all without coverage)

P(E|Nc) = P(Nc n E)/P(Nc)

rowSums(mat)
## No Coverage    Coverage 
##      0.1263      0.8738
Q3d <- 0.0230/0.1263
Q3d<- paste(round(Q3d*100,digits=2),"%")
print(Q3d)
## [1] "18.21 %"

(e) Do having excellent health and having health coverage appear to be independent?

Based on the table, they appear to be dependent. P(excellent | coverage) = 24.02% differs from P(excellent | no coverage) = 18.21%, so you are more likely to have excellent health if you have coverage.
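
Another way to look at independence is to compare the joint probability P(E n H) with the product P(E) x P(H); under independence the two would be equal. The sketch below rebuilds the table under the hypothetical name `brfss`. It only compares the reported proportions and does not test whether the gap exceeds sampling variability.

```r
# Joint distribution from the problem statement (same values as the table above)
brfss <- matrix(c(0.0230, 0.0364, 0.0427, 0.0192, 0.0050,
                  0.2099, 0.3123, 0.2410, 0.0817, 0.0289),
                nrow = 2, byrow = TRUE,
                dimnames = list(c("No Coverage", "Coverage"),
                                c("Excellent", "Very Good", "Good", "Fair", "Poor")))

p_excellent <- sum(brfss[, "Excellent"])       # P(E)
p_coverage  <- sum(brfss["Coverage", ])        # P(H)
p_joint     <- brfss["Coverage", "Excellent"]  # P(E n H)

# Under independence these two numbers would be equal
c(joint = p_joint, product = p_excellent * p_coverage)
```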

Problem 4. Exit Poll.

Edison Research gathered exit poll results from several sources for the Wisconsin recall election of Scott Walker. They found that 53% of the respondents voted in favor of Scott Walker. Additionally, they estimated that of those who did vote in favor for Scott Walker, 37% had a college degree, while 44% of those who voted against Scott Walker had a college degree. Suppose we randomly sampled a person who participated in the exit poll and found that he had a college degree. What is the probability that he voted in favor of Scott Walker?

Known Variables:
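P(SW) = voted in favor (SW) = 53% (.53)
P(ASW) = voted against (ASW) = 47% (.47), i.e. 1 - P(SW), by the assumption below
P(C|SW) = college degree given voted in favor (FC) = 37% (.37)
P(C|ASW) = college degree given voted against (AC) = 44% (.44)

Assumptions:
100% of respondents cast a vote either for or against Scott Walker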

SW = .53
ASW = .47
FC = .37
AC = .44

Assigning values to variables. Voted for Scott Walker and has a college degree: P(SW n C) = P(SW) * P(C|SW)

SW_C <- (SW * FC)/1

Voted against Scott Walker and has a college degree: P(ASW n C) = P(ASW) * P(C|ASW)

ASW_C <- (ASW * AC)/1 

Voted for Scott Walker given a college degree (Bayes' theorem): P(SW|C) = P(SW n C) / (P(SW n C) + P(ASW n C))

Q4a <- SW_C/(SW_C+ASW_C)
Q4a<- paste(round(Q4a*100,digits=2),"%")
print(paste("probability that he voted in favor of Scott Walker is", Q4a))
## [1] "probability that he voted in favor of Scott Walker is 48.67 %"
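
The same calculation can be wrapped in a small reusable function. This is a sketch only; `bayes_posterior` and its argument names are hypothetical helpers introduced here, and it assumes the two groups (A and not A) cover everyone.

```r
# Hypothetical helper: posterior P(A | B) from P(A), P(B | A), P(B | not A)
bayes_posterior <- function(p_a, p_b_given_a, p_b_given_not_a) {
  joint_a     <- p_a * p_b_given_a              # P(A n B)
  joint_not_a <- (1 - p_a) * p_b_given_not_a    # P(not A n B)
  joint_a / (joint_a + joint_not_a)             # P(A | B)
}

# A = voted for Walker, B = has a college degree
p_walker_given_degree <- bayes_posterior(0.53, 0.37, 0.44)
round(p_walker_given_degree, 4)
```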

Problem 5. Books on a bookshelf

The table below shows the distribution of books on a bookcase based on whether they are nonfiction or fiction and hardcover or paperback.

mymat2=matrix(c(13,59,15,8),nrow=2,byrow=TRUE)
colnames(mymat2)=c("hard","paper")
rownames(mymat2)=c("fiction","nonfiction")
mymat2
##            hard paper
## fiction      13    59
## nonfiction   15     8
colSums(mymat2)
##  hard paper 
##    28    67
rowSums(mymat2)
##    fiction nonfiction 
##         72         23

(a) Find the probability of drawing a hardcover book first then a paperback fiction book second when drawing without replacement.

Atotal= 95
Ftotal = 72
Htotal = 28
Ptotal = 67
Pftotal = 59

Probability of a hardcover book on the first draw

HC_F = Htotal/Atotal

Probability of a paperback fiction book on the second draw, given that a hardcover book was removed first (one fewer book on the shelf)

PB_F<- Pftotal/(Atotal-1)

Answer is

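Q5a <- HC_F*PB_F
Q5a<- paste(round(Q5a*100,digits=2),"%")
print(paste("Probability of pulling a hardcover book followed by a paperback fiction book is", Q5a))
## [1] "Probability of pulling a hardcover book followed by a paperback fiction book is 18.5 %"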

(b) Determine the probability of drawing a fiction book first and then a hardcover book second, when drawing without replacement.

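# The first fiction book may itself be a hardcover (13 of them), which leaves one fewer hardcover:
# P = P(hard fiction first) * P(hard second) + P(paper fiction first) * P(hard second)
Q5b <-(13/Atotal)*((Htotal-1)/(Atotal-1)) + (Pftotal/Atotal)*(Htotal/(Atotal-1))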
Q5b<- paste(round(Q5b*100,digits=2),"%")
print(paste ("probability of drawing a fiction book first and then a hardcover book second (with no replacement) is", Q5b))
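
One way to check an answer to part (b) is to enumerate every ordered pair of distinct books and count the pairs where the first is fiction and the second is hardcover. The sketch below does this by brute force; the `books` data frame and the other names are illustrative and not part of the original solution.

```r
# Build the 95 books from the table: cover type and genre for each book
counts <- c(13, 59, 15, 8)
books <- data.frame(
  cover = rep(c("hard", "paper", "hard", "paper"), times = counts),
  genre = rep(c("fiction", "fiction", "nonfiction", "nonfiction"), times = counts)
)
n <- nrow(books)

# All ordered pairs (first, second) of distinct books = draws without replacement
pairs <- expand.grid(first = seq_len(n), second = seq_len(n))
pairs <- pairs[pairs$first != pairs$second, ]

# Event: first book is fiction AND second book is hardcover
hit <- books$genre[pairs$first] == "fiction" & books$cover[pairs$second] == "hard"
p_enum <- mean(hit)
round(p_enum, 4)
```

Because the first fiction book is sometimes a hardcover, the count of favorable pairs is 13 * 27 + 59 * 28 rather than 72 * 28.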

(c) Calculate the probability of the scenario in part (b), except this time complete the calculations under the scenario where the first book is placed back on the bookcase before randomly drawing the second book.

Q5c <- (Ftotal/Atotal)*(Htotal/Atotal)
Q5c<- paste(round(Q5c*100,digits=2),"%")
## [1] "probability of drawing a fiction book first and then a hardcover book second (with replacement) is 22.34 %"

(d) The final answers to parts (b) and (c) are very similar. Explain why this is the case.

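With 95 books on the shelf, removing one book barely changes the proportions available for the second draw, so drawing with and without replacement give nearly the same answer. The smaller the pool, the larger the effect of removing a single item; the larger the pool, the smaller the effect.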
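
The effect of pool size can be illustrated by scaling the shelf up while keeping the same proportions. This is a hypothetical sketch: the function `compare_draws` and the `scale` factor are assumptions introduced for illustration.

```r
# P(fiction first, hardcover second) for a shelf `scale` times as large,
# keeping the same proportions as the original table (hypothetical scaling)
compare_draws <- function(scale = 1) {
  fh <- 13 * scale; fp <- 59 * scale   # hardcover fiction / paperback fiction
  h  <- 28 * scale; n  <- 95 * scale   # all hardcover / all books
  without_repl <- (fh / n) * ((h - 1) / (n - 1)) + (fp / n) * (h / (n - 1))
  with_repl    <- ((fh + fp) / n) * (h / n)
  c(without = without_repl, with = with_repl,
    diff = abs(without_repl - with_repl))
}

# Columns correspond to shelves of 95, 950 and 9500 books
result <- sapply(c(1, 10, 100), compare_draws)
result
```

The `diff` row shrinks as the shelf grows, which is the pattern described above.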

Problem 6. Is it worth it?

Andy is always looking for ways to make money fast. Lately, he has been trying to make money by gambling. Here is the game he is considering playing: The game costs 2 dollars to play. He draws a card from a deck. If he gets a number card (2-10), he wins nothing. For any face card (jack, queen or king), he wins 3 dollars. For any ace, he wins 5 dollars and he wins an extra $20 if he draws the ace of clubs.

Known Variables: Cost to play = 2.00
Card 2-10 = wins 0.00 = -2.00 profit (a loss)
36 possible cards (36/52)
Card (J,Q,K) = wins 3.00 = 1.00 profit
12 possible cards (12/52)
Card (Ace) = wins 5.00 = 3.00 profit
4 possible cards (4/52)
Card (Ace of clubs) = wins an extra 20.00 (25.00 in total) = 23.00 profit
1 possible card (1/52), already counted among the 4 aces

(a) Create a probability model and find Andy’s expected profit per game.

num_card = 36/52
num_profit = 0
face_card = 12/52
face_profit = 3
ace_card = 4/52
ace_profit = 5
ace_club = 1/52
club_profit = 20
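# expected winnings: every ace pays 5, and the ace of clubs pays an extra 20
Q6a <- (num_card*num_profit)+(face_card*face_profit)+(ace_card*ace_profit)+(ace_club*club_profit)
## [1] "Andy’s expected winnings per game (before the 2.00 cost) are 1.46153846153846"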

(b) Would you recommend this game to Andy as a good way to make money? Explain.

Andy's expected winnings are about $1.46 per play, but each play costs $2.00, so his expected profit is 1.46 - 2.00 = -$0.54. On average he loses about 54 cents, or roughly 27% of the cost of the game, every time he plays.

I would not recommend this game.
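
The probability model for part (a) can also be written as a table with one row per disjoint outcome. This is a sketch under one stated assumption: the ace of clubs pays the $5 for an ace plus the extra $20, for $25 in total. The data frame `model` and its column names are illustrative.

```r
# Probability model: one row per disjoint outcome of a single draw
model <- data.frame(
  outcome  = c("2-10", "J/Q/K", "ace (not clubs)", "ace of clubs"),
  prob     = c(36, 12, 3, 1) / 52,
  winnings = c(0, 3, 5, 25)         # ASSUMPTION: ace of clubs = 5 + extra 20
)
model$profit <- model$winnings - 2  # subtract the $2 cost to play

expected_profit <- sum(model$prob * model$profit)
sd_profit <- sqrt(sum(model$prob * (model$profit - expected_profit)^2))
round(c(expected = expected_profit, sd = sd_profit), 2)
```

A negative expected profit means the game loses money on average, however large an occasional win may be.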

Problem 7. Scooping ice cream.

Ice cream usually comes in 1.5 quart boxes (48 fluid ounces), and ice cream scoops hold about 2 ounces. However, there is some variability in the amount of ice cream in a box as well as the amount of ice cream scooped out. We represent the amount of ice cream in the box as X and the amount scooped out as Y. Suppose these random variables have the following means, standard deviations, and variances:

Known Variables:
48 oz per box
scoop = 2 oz

mymat3=matrix(c(48,1,1, 2,.25,.0625), nrow=2, byrow=TRUE)
colnames(mymat3)=c("mean", "SD", "Var")
rownames(mymat3)=c("X, In Box","Y, Scooped")
mymat3
##            mean   SD    Var
## X, In Box    48 1.00 1.0000
## Y, Scooped    2 0.25 0.0625
box_oz = 48
scoop_oz = 2

(a) An entire box of ice cream, plus 3 scoops from a second box is served at a party. How much ice cream do you expect to have been served at this party? What is the standard deviation of the amount of ice cream served?

Q7a_1 <- box_oz+(3*scoop_oz)
var_7a = 1+(3*.0625)   # variances add: Var(X) + 3 * Var(Y), using the variance 0.0625, not the SD 0.25
sd_7a = sqrt(var_7a)
## [1] "The amount served at the party would be 54 oz"
## [1] "The SD of the served at the party would be 1.08972473588517"

(b) How much ice cream would you expect to be left in the box after scooping out one scoop of ice cream? That is, find the expected value of X - Y. What is the standard deviation of the amount left in the box?

Q7_b1 <- box_oz-scoop_oz
q7_b2 <- sqrt(1+.0625)   # variances add even when the variables are subtracted
## [1] "The amount remaining would be 46 oz"
## [1] "The arithmetic mean would remain at 2"
## [1] "The SD for the amount left would be 1.03077640640442"

(c) Using the context of this exercise, explain why we add variances when we subtract one random variable from another.

Assuming X and Y are independent, each brings its own variability: the box does not always hold exactly 48 oz, and the server does not always scoop exactly 2 oz. Subtracting the scoop does not cancel the uncertainty about the box; it adds a second source of uncertainty. The amount remaining is therefore more variable than the amount we started with, which is why Var(X - Y) = Var(X) + Var(Y).
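
A short simulation can illustrate this. The problem only gives means and standard deviations, so the sketch below assumes X and Y are independent and normally distributed; that distributional choice, the seed, and the variable names are assumptions made for illustration.

```r
set.seed(1)      # reproducible sketch
n_sim <- 100000

# ASSUMPTION: independent normal draws; the problem gives only means and SDs
x <- rnorm(n_sim, mean = 48, sd = 1)     # ounces in the box
y <- rnorm(n_sim, mean = 2,  sd = 0.25)  # ounces in one scoop

sim_var    <- var(x - y)        # variance of the amount left, from simulation
theory_var <- 1^2 + 0.25^2      # Var(X) + Var(Y) = 1.0625

c(simulated = sim_var, theoretical = theory_var)
```

The simulated variance should land close to 1.0625 rather than 1 - 0.0625 = 0.9375.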