3.6.4 Binomial distribution

Exercise 3.27

Underage drinking, Part I. The Substance Abuse and Mental Health Services Administration estimated that 70% of 18-20 year olds consumed alcoholic beverages in 2008.
(a) Suppose a random sample of ten 18-20 year olds is taken. Is the use of the binomial distribution appropriate for calculating the probability that exactly six consumed alcoholic beverages? Explain.
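1. The trials are independent, because the sample is a simple random sample (SRS).
2. The number of trials is fixed at 10.
3. Each outcome is either drank or did not drink. Success is that they drank.
4. The probability of success is the same for every trial (p = 0.7).

Yes, all four conditions are met, so the binomial distribution is appropriate.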

(b) Calculate the probability that exactly 6 out of 10 randomly sampled 18-20 year olds consumed an alcoholic drink.
dbinom(6,10,.7)
## [1] 0.2001209
(c) What is the probability that exactly four out of the ten 18-20 year olds have not consumed an alcoholic beverage?
dbinom(4,10,.3)
## [1] 0.2001209
(d) What is the probability that at most 2 out of 5 randomly sampled 18-20 year olds have consumed alcoholic beverages?
pbinom(2,5,0.7)
## [1] 0.16308
(e) What is the probability that at least 1 out of 5 randomly sampled 18-20 year olds have consumed alcoholic beverages?
1-dbinom(0,5,0.7)
## [1] 0.99757
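
As a cross-check on the dbinom() calls in this exercise, the binomial probability can also be written out from its formula, P(X = k) = choose(n, k) * p^k * (1 - p)^(n - k). The helper name binom_pmf below is hypothetical and only for illustration; this is a minimal sketch, not part of the exercise.

```r
# Hypothetical helper: binomial probability written out from the formula
binom_pmf <- function(k, n, p) {
  choose(n, k) * p^k * (1 - p)^(n - k)
}

# Exactly 6 of 10 drank (p = 0.7): formula versus built-in dbinom()
by_formula <- binom_pmf(6, 10, 0.7)
by_dbinom  <- dbinom(6, 10, 0.7)
print(all.equal(by_formula, by_dbinom))

# "Exactly 4 did not drink" is the same event as "exactly 6 drank",
# which is why the two dbinom() results in this exercise match
same_event <- all.equal(dbinom(4, 10, 0.3), dbinom(6, 10, 0.7))
print(same_event)
```
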
Exercise 3.28

Chickenpox, Part I. The National Vaccine Information Center estimates that 90% of Americans have had chickenpox by the time they reach adulthood.
(a) Suppose we take a random sample of 100 American adults. Is the use of the binomial distribution appropriate for calculating the probability that exactly 97 had chickenpox before they reached adulthood? Explain.
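1. The trials are independent, because the sample is a simple random sample (SRS).
2. The number of trials is fixed at 100.
3. Each outcome is either had chickenpox or did not. Success is that they had chickenpox.
4. The probability of success is the same for every trial (p = 0.9).

Yes, all four conditions are met, so the binomial distribution is appropriate.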

(b) Calculate the probability that exactly 97 out of 100 randomly sampled American adults had chickenpox during childhood.
dbinom(97,100,.9)
## [1] 0.005891602
(c) What is the probability that exactly 3 out of a new sample of 100 American adults have not had chickenpox in their childhood?
dbinom(3,100,0.1)
## [1] 0.005891602
(d) What is the probability that at least 1 out of 10 randomly sampled American adults have had chickenpox?
1-dbinom(0,10,.9)
## [1] 1
(e) What is the probability that at most 3 out of 10 randomly sampled American adults have not had chickenpox?
pbinom(3,10,.1)
## [1] 0.9872048
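
The last two parts rely on two identities: a cumulative probability is a sum of point probabilities, and "at least 1" is the complement of "none". A short sketch of both, with variable names chosen here only for illustration:

```r
n <- 10
p_not <- 0.1   # probability that an adult has NOT had chickenpox

# P(X <= 3) as an explicit sum of point probabilities, versus pbinom()
at_most_3 <- sum(dbinom(0:3, n, p_not))
print(all.equal(at_most_3, pbinom(3, n, p_not)))

# P(at least 1 had chickenpox) = 1 - P(none had chickenpox), with p = 0.9
at_least_1 <- 1 - dbinom(0, n, 0.9)

# The default 7 significant digits round this to 1; asking for more
# digits shows that it is slightly below 1
print(format(at_least_1, digits = 12))
```
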
Exercise 3.29

Underage drinking, Part II. We learned in Exercise 3.27 that about 70% of 18-20 year olds consumed alcoholic beverages in 2008. We now consider a random sample of fifty 18-20 year olds.

(a) How many people would you expect to have consumed alcoholic beverages? And with what standard deviation?
50*0.7
## [1] 35
sqrt(50*.7*.3)
## [1] 3.24037

We would expect about 35 to drink with a standard deviation of 3.24.

(b) Would you be surprised if there were 45 or more people who have consumed alcoholic beverages?
(45-35)/3.24
## [1] 3.08642

Yes. A count of 45 is more than 3 standard deviations above the expected count of 35, so it would be unusual.

(c) What is the probability that 45 or more people in this sample have consumed alcoholic beverages? How does this probability relate to your answer to part (b)?
1-pbinom(44,50,.7)
## [1] 0.0007228617

The probability is about 0.07%. Such a small probability agrees with part (b): observing 45 or more would be surprising.
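
The pieces of this exercise can be collected in one place. The sketch below recomputes the expected count, the standard deviation, the z-score for 45, and the upper-tail probability; it uses pbinom() with lower.tail = FALSE, which is an alternative to subtracting from 1. Variable names are illustrative.

```r
n <- 50
p <- 0.7

mu    <- n * p                      # expected count
sigma <- sqrt(n * p * (1 - p))      # standard deviation of the count

# How many standard deviations above the mean is 45?
z <- (45 - mu) / sigma

# P(X >= 45) = P(X > 44); lower.tail = FALSE returns the upper tail directly
upper_tail <- pbinom(44, n, p, lower.tail = FALSE)

print(c(mean = mu, sd = sigma, z = z, p_45_or_more = upper_tail))
```
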

Exercise 3.30

Chickenpox, Part II. We learned in Exercise 3.28 that about 90% of American adults had chickenpox before adulthood. We now consider a random sample of 120 American adults.
(a) How many people in this sample would you expect to have had chickenpox in their childhood? And with what standard deviation?

120*.9
## [1] 108
sqrt(120*.9*.1)
## [1] 3.286335
(b) Would you be surprised if there were 105 people who have had chickenpox in their childhood?

No. A count of 105 is less than one standard deviation (about 3.29) below the expected count of 108.

(c) What is the probability that 105 or fewer people in this sample have had chickenpox in their childhood? How does this probability relate to your answer to part (b)?
pbinom(105,120,.9)
## [1] 0.2181634

The probability is about 22%. An outcome this likely is not unusual, which agrees with part (b).

Exercise 3.32

Survey response rate. Pew Research reported in 2012 that the typical response rate to their surveys is only 9%. If for a particular survey 15,000 households are contacted, what is the probability that at least 1,500 will agree to respond?

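1-pbinom(1499,15000,.09)

"At least 1,500" means P(X >= 1500) = 1 - P(X <= 1499). The expected number of responses is 15000 * 0.09 = 1350 with a standard deviation of about 35.05, so 1,500 is more than 4 standard deviations above the mean and the probability is very small (well under 0.01%).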
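
For "at least 1,500" the relevant quantity is the upper tail P(X >= 1500) = P(X > 1499), not a single point probability. The sketch below computes that tail from the binomial distribution and, as a rough check, from a normal approximation without continuity correction. The normal approximation is an assumption added here for comparison, and the variable names are illustrative.

```r
n <- 15000
p <- 0.09

mu    <- n * p                      # expected responders: 1350
sigma <- sqrt(n * p * (1 - p))      # about 35.05

# Distance of 1500 from the mean, in standard deviations
z <- (1500 - mu) / sigma

# Exact binomial upper tail: P(X >= 1500) = P(X > 1499)
exact_tail <- pbinom(1499, n, p, lower.tail = FALSE)

# Normal approximation to the same tail (no continuity correction)
normal_tail <- pnorm(z, lower.tail = FALSE)

print(c(z = z, exact = exact_tail, normal = normal_tail))
```
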
Exercise 3.33

Game of dreidel. A dreidel is a four-sided spinning top with the Hebrew letters nun, gimel, hei, and shin, one on each side. Each side is equally likely to come up in a single spin of the dreidel. Suppose you spin a dreidel three times. Calculate the probability of getting

(a) at least one nun?
37/64
## [1] 0.578125

1 - (3/4)^3 = 37/64, about 58%

(b) exactly 2 nuns?
9/64
## [1] 0.140625

3 * (1/4)^2 * (3/4) = 9/64, about 14%

(c) exactly 1 hei?
27/64
## [1] 0.421875

3 * (1/4) * (3/4)^2 = 27/64, about 42%

(d) at most 2 gimels?
63/64
## [1] 0.984375

1 - (1/4)^3 = 63/64, about 98%
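
Each spin is an independent trial in which any given letter comes up with probability 1/4, so the number of times a letter appears in three spins follows a binomial distribution with n = 3 and p = 1/4. A minimal sketch of the four probabilities using dbinom() and pbinom(), with variable names chosen for illustration:

```r
n <- 3       # number of spins
p <- 1 / 4   # probability of any one letter on a single spin

at_least_one_nun   <- 1 - dbinom(0, n, p)   # complement of "no nun": 37/64
exactly_two_nuns   <- dbinom(2, n, p)       # 9/64
exactly_one_hei    <- dbinom(1, n, p)       # 27/64
at_most_two_gimels <- pbinom(2, n, p)       # complement of "three gimels": 63/64

print(c(at_least_one_nun, exactly_two_nuns, exactly_one_hei, at_most_two_gimels))
```
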

Exercise 3.34

Arachnophobia. A 2005 Gallup Poll found that 7% of teenagers (ages 13 to 17) suffer from arachnophobia and are extremely afraid of spiders. At a summer camp there are 10 teenagers sleeping in each tent. Assume that these 10 teenagers are independent of each other.

(a) Calculate the probability that at least one of them suffers from arachnophobia.
1-dbinom(0,10,.07)
## [1] 0.5160177

There is about a 52% chance that at least 1 suffers from arachnophobia.

(b) Calculate the probability that exactly 2 of them suffer from arachnophobia?
dbinom(2,10,.07)
## [1] 0.1233878

There is a 12% chance that exactly 2 of them suffer from arachnophobia.

(c) Calculate the probability that at most 1 of them suffers from arachnophobia?
pbinom(1,10,.07)
## [1] 0.8482701

There is an 85% chance that at most one of them suffers from arachnophobia.

(d) If the camp counselor wants to make sure no more than 1 teenager in each tent is afraid of spiders, does it seem reasonable for him to randomly assign teenagers to tents?

Probably not. Under random assignment there is about a 15% chance (1 - 0.848) that a given tent has 2 or more arachnophobic teenagers, so random assignment cannot make sure that no more than 1 teenager per tent is afraid of spiders.
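
One way to get a feel for random tent assignment is to simulate many tents and count how often a tent ends up with 2 or more arachnophobic teenagers. The sketch below assumes, as the exercise does, that the 10 teenagers in a tent are independent with a 7% chance each. The seed and the number of simulated tents are arbitrary choices made here.

```r
set.seed(1)   # arbitrary seed, used only so the simulation is reproducible

# Number of arachnophobic teenagers in each of 100,000 simulated tents of 10
tents <- rbinom(100000, size = 10, prob = 0.07)

# Share of simulated tents with 2 or more, next to the exact probability
sim_two_or_more   <- mean(tents >= 2)
exact_two_or_more <- 1 - pbinom(1, 10, 0.07)

print(c(simulated = sim_two_or_more, exact = exact_two_or_more))
```
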

Exercise 4.2

Identify the parameter, Part II. For each of the following situations, state whether the parameter of interest is a mean or a proportion.

(a) A poll shows that 64% of Americans personally worry a great deal about federal spending and the budget deficit.

Proportion. Each respondent either worries a great deal or does not, which is a categorical outcome, and 64% is the share who do.

(b) A survey reports that local TV news has shown a 17% increase in revenue between 2009 and 2011 while newspaper revenues decreased by 6.4% during this time period.

Mean. Revenue is a numerical variable, and the percentages describe how it changed over time.

(c) In a survey, high school and college students are asked whether or not they use geolocation services on their smart phones.

Proportion, because a yes or no response is categorical.

(d) In a survey, internet users are asked whether or not they purchased any Groupon coupons.

Proportion, for the same reason as part (c): purchased or not is categorical.

(e) In a survey, internet users are asked how many Groupon coupons they purchased over the last year.

Mean, because the number of coupons purchased is numerical, not categorical.

Exercise 4.3

College credits. A college counselor is interested in estimating how many credits a student typically enrolls in each semester. The counselor decides to randomly sample 100 students by using the registrar’s database of students. The histogram below shows the distribution of the number of credits taken by these students. Sample statistics for this distribution are also provided.

(a) What is the point estimate for the average number of credits taken per semester by students at this college? What about the median?

(b) What is the point estimate for the standard deviation of the number of credits taken per semester by students at this college? What about the IQR?

(c) Is a load of 16 credits unusually high for this college? What about 18 credits? Explain your reasoning. Hint: Observations farther than two standard deviations from the mean are usually considered to be unusual.

(16-13.65)/1.91
## [1] 1.230366
(18-13.65)/1.91
## [1] 2.277487

A load of 16 credits would not be unusual, since it is only about 1.23 standard deviations above the mean. A load of 18 credits would be unusual, since it is about 2.28 standard deviations above the mean.
(d) The college counselor takes another random sample of 100 students and this time finds a sample mean of 14.02 units. Should she be surprised that this sample statistic is slightly different than the one from the original sample? Explain your reasoning.

No, there is natural variability in the sample statistic from one sample to the next. We would be more surprised if the two sample means were exactly the same.

(e) The sample means given above are point estimates for the mean number of credits taken by all students at that college. What measures do we use to quantify the variability of this estimate? Compute this quantity using the data from the original sample.

We use the standard error of the sample mean: SE = s/sqrt(n) = 1.91/sqrt(100) = 0.191.

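
The two-standard-deviation rule from the hint can be applied to both credit loads at once. The sketch below uses the sample mean (13.65) and standard deviation (1.91) that appear in the calculations for this exercise; the helper name z_score is hypothetical.

```r
# Hypothetical helper: number of standard deviations x lies from the mean
z_score <- function(x, mean, sd) {
  (x - mean) / sd
}

loads <- c(16, 18)
z <- z_score(loads, mean = 13.65, sd = 1.91)

# Flag loads that are more than 2 standard deviations from the mean
unusual <- abs(z) > 2

print(data.frame(loads, z = round(z, 2), unusual))
```
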
Exercise 4.4

Heights of adults. Researchers studying anthropometry collected body girth measurements and skeletal diameter measurements, as well as age, weight, height and gender, for 507 physically active individuals. The histogram below shows the sample distribution of heights in centimeters.

(a) What is the point estimate for the average height of active individuals? What about the median?

Mean: 171.1 cm. Median: 170.3 cm.

(b) What is the point estimate for the standard deviation of the heights of active individuals? What about the IQR?

Standard deviation: 9.4 cm. IQR: 14 cm.

(c) Is a person who is 1m 80cm (180 cm) tall considered unusually tall? And is a person who is 1m 55cm (155cm) considered unusually short? Explain your reasoning.

(180-171.1)/9.4
## [1] 0.9468085
(155-171.1)/9.4
## [1] -1.712766

A height of 180 cm would not be unusually tall, since it is less than 1 standard deviation above the mean. A height of 155 cm would not be considered unusually short either, since it is within 2 standard deviations of the mean, but it is a little short.

(d) The researchers take another random sample of physically active individuals. Would you expect the mean and the standard deviation of this new sample to be the ones given above? Explain your reasoning.

No, I would not expect another sample to give exactly the same values. Sample statistics vary from sample to sample, so the new mean and standard deviation would likely be close to, but not the same as, 171.1 and 9.4.

(e) The sample means obtained are point estimates for the mean height of all active individuals, if the sample of individuals is equivalent to a simple random sample. What measure do we use to quantify the variability of such an estimate? Compute this quantity using the data from the original sample under the condition that the data are a simple random sample.

We would use the standard error of the sample mean, which is the sample standard deviation divided by the square root of the sample size.

9.4/sqrt(507)
## [1] 0.4174687

The standard error is about 0.42 cm.
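
The standard error of a sample mean divides the sample standard deviation, not the sample mean, by the square root of the sample size. A minimal sketch with the height data, where the helper name std_error is hypothetical:

```r
# Hypothetical helper: standard error of a sample mean
std_error <- function(s, n) {
  s / sqrt(n)
}

s <- 9.4    # sample standard deviation of heights, in cm
n <- 507    # sample size

se_height <- std_error(s, n)
print(se_height)
```
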

Exercise 4.5

Wireless routers. John is shopping for wireless routers and is overwhelmed by the number of available options. In order to get a feel for the average price, he takes a random sample of 75 routers and finds that the average price for this sample is $75 and the standard deviation is $25.

(a) Based on this information, how much variability should he expect to see in the mean prices of repeated samples, each containing 75 randomly selected wireless routers?
25/sqrt(75)
## [1] 2.886751
(b) A consumer website claims that the average price of routers is $80. Is a true average of $80 consistent with John’s sample?
(75-80)/(25/sqrt(75))
## [1] -1.732051

Yes. John’s sample mean of $75 is only about 1.73 standard errors below $80, which is within 2 standard errors, so his sample would not be unusual if the true average were $80.
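
Another way to look at part (b) is to ask whether $80 falls within two standard errors of John's sample mean. The sketch below uses the same rough two-standard-error rule as the z-score above; the variable names are illustrative.

```r
x_bar <- 75   # sample mean price, in dollars
s     <- 25   # sample standard deviation, in dollars
n     <- 75   # number of routers sampled

se <- s / sqrt(n)

# Range of values within two standard errors of the sample mean
lower <- x_bar - 2 * se
upper <- x_bar + 2 * se

claimed <- 80
consistent <- claimed >= lower && claimed <= upper

print(c(lower = lower, upper = upper))
print(consistent)
```
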

Exercise 4.6

Chocolate chip cookies. Students are asked to count the number of chocolate chips in 22 cookies for a class activity. They found that the cookies on average had 14.77 chocolate chips with a standard deviation of 4.37 chocolate chips.

(a) Based on this information, about how much variability should they expect to see in the mean number of chocolate chips in random samples of 22 chocolate chip cookies?
4.37/sqrt(22)
## [1] 0.9316871

The standard error is about 0.93, or roughly 1 chocolate chip.

(b) The packaging for these cookies claims that there are at least 20 chocolate chips per cookie. One student thinks this number is unreasonably high since the average they found is much lower. Another student claims the difference might be due to chance. What do you think?
(14.77-20)/(4.37/sqrt(22))
## [1] -5.613472

The sample mean is more than 5 standard errors below 20, which is far more than chance alone would plausibly explain. So yes, the claim of at least 20 chips per cookie appears unreasonably high.