Underage drinking, Part I. (4.17) Data collected by the Substance Abuse and Mental Health Services Administration (SAMHSA) suggests that 69.7% of 18-20 year olds consumed alcoholic beverages in any given year.
(a) Suppose a random sample of ten 18-20 year olds is taken. Is the use of the binomial distribution appropriate for calculating the probability that exactly six consumed alcoholic beverages? Explain.
To check whether the binomial model is appropriate, we verify the following conditions:
The trials are independent, since the ten 18-20 year olds are a random sample.
We have a fixed number of trials (n = 10).
Each outcome is a success or a failure: the person either consumed or did not consume alcohol.
The probability of a success is the same for each trial, since the individuals are a random sample (p = 0.697 if we define a “success” as having consumed alcohol).
All four conditions are met, so the binomial distribution is appropriate.
(b) Calculate the probability that exactly 6 out of 10 randomly sampled 18-20 year olds consumed an alcoholic drink.
BINOMIAL FORMULA
# P(k = 6) = choose(10, 6) * (.697)^6 * (1 - .697)^4, where choose(10, 6) = 210
p_success_6 <- 210 * (.697)^6 * (1-.697)^4
p_success_6
## [1] 0.2029488
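As a cross-check, R's built-in `dbinom` evaluates the same binomial probability without hand-coding the coefficient. This is a minimal sketch; the variable names `n`, `k`, `p`, `by_formula`, and `by_dbinom` are illustrative and not part of the original solution.

```r
# Illustrative names; p = 0.697 is the rate given in the exercise
n <- 10      # number of trials (people sampled)
k <- 6       # number of successes (people who consumed alcohol)
p <- 0.697   # probability that one person consumed alcohol

# Binomial formula written out, with choose() computing the coefficient
by_formula <- choose(n, k) * p^k * (1 - p)^(n - k)

# Same probability from the built-in binomial mass function
by_dbinom <- dbinom(k, size = n, prob = p)

# TRUE when both routes agree up to floating-point tolerance
isTRUE(all.equal(by_formula, by_dbinom))
```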
(c) What is the probability that exactly four out of ten 18-20 year olds have not consumed an alcoholic beverage?
#p(not consumed alcohol) = (1-.697)= 0.303
1-.697
## [1] 0.303
# P(k = 4 not consumed) = choose(10, 4) * (.303)^4 * (1 - .303)^6, where choose(10, 4) = 210
p_notconsumed_4 <- 210 * (.303)^4 * (.697)^6
p_notconsumed_4
## [1] 0.2029488
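Parts (b) and (c) give the same number because, in a sample of ten, "exactly four did not consume" and "exactly six consumed" describe the same event. A short sketch of that equivalence, using illustrative variable names that are not in the original:

```r
p <- 0.697  # probability that one person consumed alcohol (from the exercise)

# Count the people who consumed alcohol: exactly 6 of 10
p_six_consumed <- dbinom(6, size = 10, prob = p)

# Count the people who did NOT consume alcohol: exactly 4 of 10
p_four_abstained <- dbinom(4, size = 10, prob = 1 - p)

# TRUE when the two descriptions yield the same probability
isTRUE(all.equal(p_six_consumed, p_four_abstained))
```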
(d) What is the probability that at most 2 out of 5 randomly sampled 18-20 year olds have consumed alcoholic beverages?
# P(0, 1, or 2 consumed alcohol) = P(k = 0) + P(k = 1) + P(k = 2)
p_success_atmost2 <- (.697)^0 * (.303)^5 + 5 * .697 * (.303)^4 + 10 * (.697)^2 * (.303)^3
p_success_atmost2
## [1] 0.1670716
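The "at most 2" sum is a cumulative probability, so it can also be checked with `pbinom`, which returns P(X <= q). The sketch below assumes the same n = 5 and p = 0.697; the variable names are illustrative.

```r
n <- 5       # people sampled
p <- 0.697   # probability that one person consumed alcohol

# Sum of the individual terms P(k = 0) + P(k = 1) + P(k = 2)
by_sum <- sum(dbinom(0:2, size = n, prob = p))

# Cumulative distribution function: P(X <= 2)
by_cdf <- pbinom(2, size = n, prob = p)

# TRUE when the term-by-term sum matches the cumulative probability
isTRUE(all.equal(by_sum, by_cdf))
```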
(e) What is the probability that at least 1 out of 5 randomly sampled 18-20 year olds have consumed alcoholic beverages?
# P(k = 0), the probability that none of the 5 consumed alcohol
p_none <- (.697)^0 * (.303)^5
ans <- 1-p_none
ans
## [1] 0.997446
# or, equivalently, P(k = 1) + P(k = 2) + P(k = 3) + P(k = 4) + P(k = 5)
ans_1 <- 5* .697*(.303)^4 + 10* (.697)^2 * (.303)^3 + 10* (.697)^3 * (.303)^2 + 5 * (.697)^4 * (.303)^1 + (.697)^5
ans_1
## [1] 0.997446
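Both routes above amount to an upper-tail probability, P(X >= 1) = P(X > 0). One way to sketch this is with `pbinom` and `lower.tail = FALSE`, which returns P(X > q); the variable names here are illustrative.

```r
n <- 5       # people sampled
p <- 0.697   # probability that one person consumed alcohol

# Complement rule: 1 - P(X = 0)
by_complement <- 1 - dbinom(0, size = n, prob = p)

# Upper tail: P(X > 0), which equals P(X >= 1) for a count
by_upper_tail <- pbinom(0, size = n, prob = p, lower.tail = FALSE)

# TRUE when the complement and the upper tail agree
isTRUE(all.equal(by_complement, by_upper_tail))
```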
4.19 Underage drinking, Part II. We learned in Exercise 4.17 that about 70% of 18-20 year olds consumed alcoholic beverages in any given year. We now consider a random sample of fifty 18-20 year olds.
(a) How many people would you expect to have consumed alcoholic beverages? And with what standard deviation?
# n = 50
# p_success = .697
# mean = np
# sd = sqrt(np(1-p))
# To use the normal approximation to the binomial distribution, np and n(1-p) must both be at least 10.
# np = 34.85 is computed below as the mean; here we check n(1-p) = 50 * (1 - .697).
check <- 50*(1-.697)
check
## [1] 15.15
mean<- 50*.697
sd <- sqrt(mean * (1-.697))
mean
## [1] 34.85
sd
## [1] 3.249546
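The calculation for part (a) can also be sketched in one place, with both success-failure checks made explicit. The names `expected_count`, `sd_count`, and `normal_ok` are illustrative and were chosen to avoid masking base R's `mean` and `sd`.

```r
n <- 50      # people sampled
p <- 0.697   # probability that one person consumed alcohol

# Expected number of people who consumed alcohol and its standard deviation
expected_count <- n * p
sd_count <- sqrt(n * p * (1 - p))

# Success-failure condition for the normal approximation:
# both expected successes and expected failures should be at least 10
normal_ok <- (n * p >= 10) && (n * (1 - p) >= 10)

c(expected = expected_count, sd = sd_count)
normal_ok
```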
(b) Would you be surprised if there were 45 or more people who have consumed alcoholic beverages?
Yes. 45 is more than 3 standard deviations above the mean (mean + 3 * sd is about 44.6), so by the 68-95-99.7 rule such a count would be unusual and we would be surprised.
#n=50
#psuccess = .697
#mean = np
#sd = sqrt(np(1-p))
sd*3+mean
## [1] 44.59864
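The same comparison can be expressed as a z-score: how many standard deviations 45 lies above the expected count. A minimal sketch, with illustrative variable names:

```r
n <- 50      # people sampled
p <- 0.697   # probability that one person consumed alcohol

expected_count <- n * p
sd_count <- sqrt(n * p * (1 - p))

# Number of standard deviations that 45 lies above the expected count
z_45 <- (45 - expected_count) / sd_count

# TRUE when 45 is more than 3 standard deviations above the mean
z_45 > 3
```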
(c) What is the probability that 45 or more people in this sample have consumed alcoholic beverages? How does this probability relate to your answer to part (b)?
Since np and n(1-p) are both at least 10, we can use the normal approximation to the binomial distribution. The approximation improves when a continuity correction is applied to the cutoff: the lower end of the shaded region is reduced by 0.5, from 45 to 44.5, and pnorm then gives a probability of about 0.0015. The corrected cutoff of 44.5 falls just inside 3 standard deviations above the mean (about 44.6), while 45 falls just outside. Either way the probability is very small, which agrees with part (b): observing 45 or more would be surprising.
z <- (45 - 34.85)/3.25
z
## [1] 3.123077
# subtract 0.5 from 45 to apply the continuity correction
z2 <- (44.5 - 34.85)/3.25
z2
## [1] 2.969231
#pnorm using z score
pnorm(3.12,0,1,lower.tail=F)
## [1] 0.0009042552
pnorm(z2,0,1,lower.tail = F)
## [1] 0.001492731
#pnorm using observation
pnorm(45,34.85,3.25,lower.tail = F)
## [1] 0.0008948548
pnorm(44.5,34.85,3.25,lower.tail = F)
## [1] 0.001492731
# shade the upper tail beyond z = 3.12: 45 or more people, without the continuity correction
DATA606::normalPlot(mean=0,sd=1, bounds = c(3.12, 5))
value <- .000904
# with the 0.5 continuity correction (cutoff 44.5, z = 2.97), the upper-tail area for 45 or more people is about 0.00149
DATA606::normalPlot(mean=0,sd=1, bounds = c(2.97, 5))
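Since the underlying model is binomial, the normal approximation can be compared with the exact binomial tail. The sketch below is an optional check rather than part of the original solution, and its variable names are illustrative. It relies on P(X >= 45) = P(X > 44) for a count.

```r
n <- 50      # people sampled
p <- 0.697   # probability that one person consumed alcohol

mu <- n * p
sigma <- sqrt(n * p * (1 - p))

# Exact binomial tail: P(X >= 45) = P(X > 44)
exact <- pbinom(44, size = n, prob = p, lower.tail = FALSE)

# Normal approximation without and with the 0.5 continuity correction
approx_plain <- pnorm(45, mean = mu, sd = sigma, lower.tail = FALSE)
approx_corrected <- pnorm(44.5, mean = mu, sd = sigma, lower.tail = FALSE)

c(exact = exact, plain = approx_plain, corrected = approx_corrected)
```

Because this sketch uses the unrounded standard deviation rather than 3.25, its normal-approximation values may differ slightly from the outputs shown earlier.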