This is an R Markdown document. Markdown is a simple formatting syntax for authoring HTML, PDF, and MS Word documents. For more details on using R Markdown see http://rmarkdown.rstudio.com.
When you click the Knit button a document will be generated that includes both content as well as the output of any embedded R code chunks within the document. You can embed an R code chunk like this:
#Problem 1. Dice Rolls
#If you roll a pair of fair dice, what is the probability of..
#(a) getting a sum of 1?
#The minimum value on a single die is 1, so the smallest possible sum of two dice is 2. A sum of 1 is impossible.
sum1<-0
sum1
## [1] 0
#(b) getting a sum of 5?
#The combinations that sum to 5 are (1,4), (2,3), (3,2), (4,1). Each ordered pair has probability (1/6)*(1/6).
sum2<-((1/6)*(1/6))*4
sum2
## [1] 0.1111111
#(c) getting a sum of 12?
#The only combination that sums to 12 is (6,6).
sum3<-(1/6*1/6)
sum3
## [1] 0.02777778
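The three answers above can be checked by listing every outcome instead of counting combinations by hand. This is a minimal sketch; the names `rolls` and `prob_sum` are made up for illustration and are not part of the original solution.

```r
# Enumerate all 36 equally likely outcomes for two fair dice.
rolls <- expand.grid(die1 = 1:6, die2 = 1:6)
rolls$total <- rolls$die1 + rolls$die2

# Probability of a given sum = share of the 36 outcomes with that sum.
prob_sum <- function(s) mean(rolls$total == s)

prob_sum(1)   # no outcome sums to 1
prob_sum(5)   # 4 of 36 outcomes
prob_sum(12)  # 1 of 36 outcomes
```

Because every ordered pair is equally likely, the share of matching rows is the probability.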
#Problem 2. School absences
#Data collected at elementary schools in DeKalb County, GA suggest that each year roughly 25% of students miss exactly one day of school, 15% miss 2 days, and 28% miss 3 or more days due to sickness.
#(a) What is the probability that a student chosen at random doesn’t miss any days of school due to sickness this year?
p0=(1-0.25-0.15-0.28)
p0
## [1] 0.32
#(b) What is the probability that a student chosen at random misses no more than one day?
p1=p0+0.25
p1
## [1] 0.57
#(c) What is the probability that a student chosen at random misses at least one day?
p2=1-p0
p2
## [1] 0.68
#(d) If a parent has two kids at a DeKalb County elementary school, what is the probability that neither kid will miss any school? Note any assumption you must make to answer this question.
#Assumption: the two kids' absences are independent.
p3=p0*p0
p3
## [1] 0.1024
#(e) If a parent has two kids at a DeKalb County elementary school, what is the probability that both kids will miss some school, i.e. at least one day? Note any assumption you make.
#Assumption: the two kids' absences are independent.
p4=p2*p2
p4
## [1] 0.4624
#(f) If you made an assumption in part (d) or (e), do you think it was reasonable? If you didn’t make any assumptions, double check your earlier answers.
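#Answer: Parts (d) and (e) multiply the individual probabilities, which assumes the two kids' absences are independent. That is probably not reasonable: siblings live together and tend to catch the same illnesses, so one kid missing school makes it more likely the other does too. If the events are dependent, the products in (d) and (e) are only approximations.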
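The whole absence distribution can also be kept in one named vector, which makes it easy to confirm that the probabilities sum to 1 before using them. This is only a sketch; the vector name `absence` and the other variable names are hypothetical.

```r
# Distribution of sick days missed, taken from the problem statement.
absence <- c(one = 0.25, two = 0.15, three_plus = 0.28)
absence <- c(none = 1 - sum(absence), absence)

# A valid probability distribution must sum to 1.
stopifnot(isTRUE(all.equal(sum(absence), 1)))

at_most_one  <- sum(absence[c("none", "one")])
at_least_one <- 1 - absence[["none"]]

# Two-kid answers: valid only if the kids' absences are independent.
neither_misses <- absence[["none"]]^2
both_miss      <- at_least_one^2
```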
#Problem 3. Health coverage, relative frequencies
#The Behavioral Risk Factor Surveillance System (BRFSS) is an annual telephone survey designed to identify risk factors in the adult population and report emerging health trends. The following table displays the distribution of health status of respondents to this survey (excellent, very good, good, fair, poor) and whether or not they have health insurance.
mat=matrix(c(.023, 0.0364, 0.0427, 0.0192, 0.0050,0.2099, 0.3123 ,0.2410 ,0.0817,0.0289), byrow=TRUE, nrow=2)
colnames(mat)=c("Excellent", "Very Good","Good", "Fair","Poor")
rownames(mat)=c("No Coverage","Coverage")
mat
##             Excellent Very Good   Good   Fair   Poor
## No Coverage    0.0230    0.0364 0.0427 0.0192 0.0050
## Coverage       0.2099    0.3123 0.2410 0.0817 0.0289
#(a) Are being in excellent health and having health coverage mutually exclusive?
#No. Mutually exclusive events cannot occur at the same time, that is, their intersection is empty. Here 20.99% of respondents are in excellent health and have health coverage, so the intersection is not empty and the events are not mutually exclusive.
#(b) What is the probability that a randomly chosen individual has excellent health?
pb<-.0230+.2099
pb
## [1] 0.2329
#(c) What is the probability that a randomly chosen individual has excellent health given that he has health coverage?
pc=0.2099/(0.2099+0.3123+0.2410+0.0817+0.0289)
pc
## [1] 0.2402152
#(d) What is the probability that a randomly chosen individual has excellent health given that he doesn’t have health coverage?
pd=0.0230/(0.0230+0.0364+0.0427+0.0192+0.0050)
pd
## [1] 0.1821061
#(e) Do having excellent health and having health coverage appear to be independent?
e="No. P(excellent health | coverage) is 24.02%, while P(excellent health | no coverage) is 18.21%. If the two events were independent these conditional probabilities would be equal, so health status and coverage appear to be dependent."
e
## [1] "No. P(excellent health | coverage) is 24.02%, while P(excellent health | no coverage) is 18.21%. If the two events were independent these conditional probabilities would be equal, so health status and coverage appear to be dependent."
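The marginal and conditional probabilities in parts (b) to (d) can be read off the joint table with `colSums()` and `prop.table()` instead of retyping the cells. A minimal sketch, rebuilding the table under the hypothetical name `joint` so the block runs on its own:

```r
# Joint distribution from the problem statement (rows: coverage, columns: health).
joint <- matrix(c(0.0230, 0.0364, 0.0427, 0.0192, 0.0050,
                  0.2099, 0.3123, 0.2410, 0.0817, 0.0289),
                nrow = 2, byrow = TRUE,
                dimnames = list(c("No Coverage", "Coverage"),
                                c("Excellent", "Very Good", "Good", "Fair", "Poor")))

health_marginal <- colSums(joint)                 # P(health status)
given_coverage  <- prop.table(joint, margin = 1)  # each row: P(health status | coverage status)

health_marginal["Excellent"]                # (b) P(excellent)
given_coverage["Coverage", "Excellent"]     # (c) P(excellent | coverage)
given_coverage["No Coverage", "Excellent"]  # (d) P(excellent | no coverage)
```

With `margin = 1`, `prop.table()` divides each row by its own total, which is exactly the conditioning done by hand in (c) and (d).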
#Problem 4. Exit Poll.
#Edison Research gathered exit poll results from several sources for the Wisconsin recall election of Scott Walker. They found that 53% of the respondents voted in favor of Scott Walker. Additionally, they estimated that of those who did vote in favor for Scott Walker, 37% had a college degree, while 44% of those who voted against Scott Walker had a college degree. Suppose we randomly sampled a person who participated in the exit poll and found that he had a college degree. What is the probability that he voted in favor of Scott Walker?
#P(voted against Scott Walker)
PNot=1-0.53
PNot
## [1] 0.47
#P(voted for Scott Walker and has a college degree)
PCD=0.53*0.37
PCD
## [1] 0.1961
#P(voted against Scott Walker and has a college degree)
PCA=PNot*0.44
PCA
## [1] 0.2068
#P(voted for Scott Walker given a college degree), by Bayes' theorem
Result=PCD/(PCD+PCA)
Result
## [1] 0.4867213
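The same calculation can be wrapped in a small helper for Bayes' rule with two groups. This is a sketch; the function name `bayes_posterior` and its argument names are hypothetical, not from any package.

```r
# Bayes' rule for a binary event A and evidence B:
# P(A | B) = P(B | A) P(A) / [P(B | A) P(A) + P(B | not A) P(not A)]
bayes_posterior <- function(prior, lik_a, lik_not_a) {
  joint_a     <- prior * lik_a            # P(A and B)
  joint_not_a <- (1 - prior) * lik_not_a  # P(not A and B)
  joint_a / (joint_a + joint_not_a)
}

# A = voted for Walker, B = has a college degree.
posterior <- bayes_posterior(prior = 0.53, lik_a = 0.37, lik_not_a = 0.44)
posterior
```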
#Problem 5. Books on a bookshelf
#The table below shows the distribution of books on a bookcase based on whether they are nonfiction or fiction and hardcover or paperback.
mymat2=matrix(c(13,59,15,8),nrow=2,byrow=TRUE)
colnames(mymat2)=c("hard","paper")
rownames(mymat2)=c("fiction","nonfiction")
mymat2
##            hard paper
## fiction      13    59
## nonfiction   15     8
#(a) Find the probability of drawing a hardcover book first then a paperback fiction book second when drawing without replacement.
#The total number of books
total=13+15+59+8
total
## [1] 95
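#The draws are dependent (no replacement), so multiply P(hardcover first) by P(paperback fiction second | hardcover first); all 59 paperback fiction books remain among the 94 left.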
Pa=((13+15)/total)*(59/(total-1))
Pa
## [1] 0.1849944
#(b) Determine the probability of drawing a fiction book first and then a hardcover book second,when drawing without replacement.
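#The number of hardcovers left for the second draw depends on the first book, so split on its cover:
#fiction hardcover first (13/95), then hardcover second (27/94), or
#fiction paperback first (59/95), then hardcover second (28/94)
#total books=13+15+59+8=95
pb=(13/95)*(27/94)+(59/95)*(28/94)
pb
## [1] 0.2243001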
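A brute-force enumeration gives an independent check on part (b): list every ordered pair of two different books and count the pairs where the first is fiction and the second is hardcover. This is a sketch with hypothetical names (`books`, `draws`, `hit`).

```r
# One row per book, with counts from the bookcase table.
counts <- c(13, 59, 15, 8)
books <- data.frame(
  genre = rep(c("fiction", "fiction", "nonfiction", "nonfiction"), times = counts),
  cover = rep(c("hard", "paper", "hard", "paper"), times = counts)
)
n_books <- nrow(books)

# All ordered pairs of two different books (drawing without replacement).
draws <- expand.grid(first = seq_len(n_books), second = seq_len(n_books))
draws <- draws[draws$first != draws$second, ]

# (b) fiction first, then hardcover second.
hit <- books$genre[draws$first] == "fiction" & books$cover[draws$second] == "hard"
mean(hit)

# Same value from the law of total probability, splitting on the first book's cover.
(13 / 95) * (27 / 94) + (59 / 95) * (28 / 94)
```

There are 95 * 94 = 8930 ordered pairs, and 72 * 28 - 13 = 2003 of them qualify; the 13 subtracted cases are the ones where the same fiction hardcover would have to be drawn twice.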
#(c) Calculate the probability of the scenario in part (b), except this time complete the calculations under the scenario where the first book is placed back on the bookcase before randomly drawing the second book.
pc=(72/95)*(28/95)
pc
## [1] 0.2233795
#(d) The final answers to parts (b) and (c) are very similar. Explain why this is the case.
d="Only one book is removed out of 95, so the make-up of the shelf barely changes between the first and second draw. Without replacement the second draw is from 94 books, with replacement it is from 95, so the two probabilities differ only slightly."
d
## [1] "Only one book is removed out of 95, so the make-up of the shelf barely changes between the first and second draw. Without replacement the second draw is from 94 books, with replacement it is from 95, so the two probabilities differ only slightly."
#Problem 6. Is it worth it?
#Andy is always looking for ways to make money fast. Lately, he has been trying to make money by gambling. Here is the game he is considering playing: The game costs 2 dollars to play. He draws a card from a deck. If he gets a number card (2-10), he wins nothing. For any face card (jack, queen or king), he wins 3 dollars. For any ace, he wins 5 dollars and he wins an extra $20 if he draws the ace of clubs.
#(a) Create a probability model and find Andy’s expected profit per game.
#probability of number
PN=(9*4)/52
PN
## [1] 0.6923077
#probability of face Card
PFC=(4*3)/52
PFC
## [1] 0.2307692
#probability of A
PA=4/52
PA
## [1] 0.07692308
#probability of the ace of clubs
PAC=1/52
PAC
## [1] 0.01923077
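#Expected winnings per game, before subtracting the $2 cost (the extra $20 applies only to the ace of clubs)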
profit=0*PN+3*PFC+5*PA+20*PAC
profit
## [1] 1.461538
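Part (a) asks for a probability model, which can be written as a small table with one row per outcome. In this sketch the ace of clubs is its own row (5 + 20 = 25 dollars) and the $2 cost is subtracted to get profit; the names `model`, `cost` and `expected_profit` are hypothetical.

```r
# One row per outcome of a single draw; winnings in dollars, before the cost.
model <- data.frame(
  outcome  = c("number card", "face card", "ace (not clubs)", "ace of clubs"),
  prob     = c(36, 12, 3, 1) / 52,
  winnings = c(0, 3, 5, 25)
)

# The outcomes cover the whole deck, so the probabilities must sum to 1.
stopifnot(isTRUE(all.equal(sum(model$prob), 1)))

cost <- 2
model$profit <- model$winnings - cost

expected_winnings <- sum(model$prob * model$winnings)  # 76/52
expected_profit   <- sum(model$prob * model$profit)    # 76/52 - 2
expected_profit
```

A negative expected profit means Andy loses money on average each time he plays.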
#(b) Would you recommend this game to Andy as a good way to make money? Explain.
#Answer: I don't recommend it. Andy pays $2 per game but wins only about $1.46 on average, so his expected profit is about -$0.54 per game, and he should expect to lose money over many plays.
#Problem 7. Scooping ice cream.
#Ice cream usually comes in 1.5 quart boxes (48 fluid ounces), and ice cream scoops hold about 2 ounces. However, there is some variability in the amount of ice cream in a box as well as the amount of ice cream scooped out. We represent the amount of ice cream in the box as X and the amount scooped out as Y . Suppose these random variables have the following means, standard deviations, and variances:
mymat3=matrix(c(48,1,1, 2,.25,.0625), nrow=2, byrow=TRUE)
colnames(mymat3)=c("mean", "SD", "Var")
rownames(mymat3)=c("X, In Box","Y, Scooped")
mymat3
##            mean   SD    Var
## X, In Box    48 1.00 1.0000
## Y, Scooped    2 0.25 0.0625
#(a) An entire box of ice cream, plus 3 scoops from a second box is served at a party. How much ice cream do you expect to have been served at this party? What is the standard deviation of the amount of ice cream served?
#expected amount of ice cream served at this party
expect=48+2*3
expect
## [1] 54
#standard deviation of the amount served: the box and the three separate scoops are independent, so their variances add (1 + 3*0.0625)
SD=sqrt(1+3*0.0625)
SD
## [1] 1.089725
#(b) How much ice cream would you expect to be left in the box after scooping out one scoop of ice cream? That is, find the expected value of X - Y. What is the standard deviation of the amount left in the box?
#How much ice cream would you expect to be left in the box after scooping out one scoop of ice cream
left=48-2
left
## [1] 46
#standard deviation of the amount left in the box
SD2=sqrt(1^2*1+1^2*0.0625)
SD2
## [1] 1.030776
#(c) Using the context of this exercise, explain why we add variances when we subtract one random variable from another.
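#Answer:
c="Subtracting the scoop does not cancel its uncertainty. The amount in the box varies and the amount scooped varies, so the amount left is uncertain from both sources. Assuming X and Y are independent, the variance of X - Y is Var(X) + Var(Y)."
c
## [1] "Subtracting the scoop does not cancel its uncertainty. The amount in the box varies and the amount scooped varies, so the amount left is uncertain from both sources. Assuming X and Y are independent, the variance of X - Y is Var(X) + Var(Y)."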
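A quick simulation can illustrate why the variances add for X - Y. The exercise gives only means and standard deviations, so the normal distributions and the independence of X and Y below are assumptions made for this sketch, and the names `n_sim`, `x_sim`, `y_sim` and `left_sim` are hypothetical.

```r
set.seed(1)      # arbitrary seed so the run is repeatable
n_sim <- 100000

# Assumption: X and Y are independent and normally distributed.
x_sim <- rnorm(n_sim, mean = 48, sd = 1)     # ounces in the box
y_sim <- rnorm(n_sim, mean = 2,  sd = 0.25)  # ounces in one scoop

left_sim <- x_sim - y_sim
mean(left_sim)  # should be close to 48 - 2 = 46
sd(left_sim)    # should be close to sqrt(1 + 0.0625), about 1.03
```

If the variances were subtracted instead, the simulated spread of the amount left would have to be smaller than the spread of the box alone, which is not what the simulation shows.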