Distributions (Use R for all Problems. Show your Code).
These questions will help you build an understanding of the Normal, Binomial, Hypergeometric and Poisson distributions.
It is very helpful to plot a distribution before calculating probabilities from it, so begin by reading up on the plot() function.
You will be using the probability density (or mass) function, the cumulative distribution function and the quantile function in this assignment.
Q1-Q3 are about the Binomial and Poisson distributions. For binomial problems, we need to specify N (the number of trials) and p (the probability of success). For Poisson problems, we need to specify lambda, the mean number of events per interval.
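As a minimal sketch of how the three function families relate, the block below uses a binomial with N = 20 and p = 0.5 and a Poisson with lambda = 4.2 (values borrowed from Q1 and Q3; the variable names are illustrative). dbinom() gives P(X = x), pbinom() gives P(X <= x), and qbinom() inverts pbinom(). The plot() calls draw each mass function as vertical spikes.

```r
# Illustrative parameters (assumed for this sketch)
N <- 20
p <- 0.5
lambda <- 4.2

x <- 0:N
pmf.binom <- dbinom(x, size = N, prob = p)   # P(X = x)
cdf.binom <- pbinom(x, size = N, prob = p)   # P(X <= x)

# The cumulative function is the running sum of the mass function
all.equal(cumsum(pmf.binom), cdf.binom)

# qbinom() returns the smallest x whose cumulative probability reaches 0.5
median.binom <- qbinom(0.5, size = N, prob = p)
median.binom

# Spike plots of the two mass functions
plot(x, pmf.binom, type = "h", xlab = "Successes", ylab = "Probability",
     main = "Binomial(20, 0.5)")
plot(x, dpois(x, lambda), type = "h", xlab = "Events", ylab = "Probability",
     main = "Poisson(4.2)")
```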
1. A researcher wishes to conduct a study of the color preferences of new car buyers. Suppose that 50% of this population prefers the color red. If 20 buyers are randomly selected, what is the probability that between 9 and 12 (both inclusive) buyers would prefer red?
Round your answer to four decimal places. Use the round() function in R.
# N = 20 buyers are randomly selected
N.buyers<-20
# p = 0.5 because 50% of this population prefers red
p.red<-0.5
q1.prob<-round(sum(dbinom(9:12, size=N.buyers, prob=p.red)),4)
q1.prob
## [1] 0.6167
0.6167
If 20 buyers are randomly selected, there is a 61.67% probability that between 9 and 12 (both inclusive) buyers would prefer red.
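The same probability can be cross-checked with the cumulative function, since P(9 <= X <= 12) = P(X <= 12) - P(X <= 8). Note that 8, not 9, is subtracted because 9 is part of the range. A short sketch (the names via.pmf and via.cdf are illustrative):

```r
N.buyers <- 20
p.red <- 0.5

# Sum of the individual probabilities for 9, 10, 11 and 12
via.pmf <- sum(dbinom(9:12, size = N.buyers, prob = p.red))

# Difference of two cumulative probabilities
via.cdf <- pbinom(12, size = N.buyers, prob = p.red) -
  pbinom(8, size = N.buyers, prob = p.red)

round(c(via.pmf, via.cdf), 4)
```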
2. A quality control inspector has drawn a sample of 13 light bulbs from a recent production lot. Suppose 20% of the bulbs in the lot are defective. What is the probability that less than 6 but more than 3 bulbs from the sample are defective?
Round your answer to four decimal places.
# N = 13 light bulbs in the sample
N.bulbs<-13
# p = 0.2 as we suppose 20% of the bulbs are defective
p.defective<-0.2
q2.prob<- round(sum(dbinom(4:5, size=N.bulbs, prob=p.defective)),4)
q2.prob
## [1] 0.2226
0.2226
There is a 22.26% probability that more than 3 but fewer than 6 bulbs from the sample (that is, 4 or 5) are defective.
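A simulation offers a rough sanity check on the strict inequalities. The sketch below draws many samples of 13 bulbs with rbinom() and counts how often the number of defectives is 4 or 5. The seed and the number of replications are arbitrary choices, so the estimate should agree with the exact answer only up to sampling noise.

```r
set.seed(1)          # arbitrary seed, for reproducibility only
n.sims <- 100000     # arbitrary number of simulated samples

# Each draw is the number of defective bulbs in one sample of 13
defectives <- rbinom(n.sims, size = 13, prob = 0.2)

# Share of simulated samples with more than 3 but fewer than 6 defectives
sim.prob <- mean(defectives > 3 & defectives < 6)
exact.prob <- sum(dbinom(4:5, size = 13, prob = 0.2))

c(simulated = sim.prob, exact = exact.prob)
```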
3. The auto parts department of an automotive dealership sends out a mean of 4.2 special orders daily. What is the probability that, for any day, the number of special orders sent out will be no more than 3?
Round your answer to four decimal places.
# lambda = 4.2 special orders sent out per day
lamda.orders<-4.2
q3.prob<-round(ppois(3,lamda.orders),4)
q3.prob
## [1] 0.3954
0.3954
There is a 39.54% probability that, for any day, the number of special orders sent out will be 3 or less.
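ppois(3, lambda) is the sum of the Poisson mass function over 0, 1, 2 and 3. As a sketch, the same number can be built directly from the formula P(X = k) = exp(-lambda) * lambda^k / k! (the variable names are illustrative):

```r
lambda <- 4.2
k <- 0:3

by.formula <- sum(exp(-lambda) * lambda^k / factorial(k))  # written out by hand
by.dpois <- sum(dpois(k, lambda))                          # mass function summed
by.ppois <- ppois(3, lambda)                               # cumulative function

round(c(by.formula, by.dpois, by.ppois), 4)
```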
Q4 and Q5 are about the hypergeometric distribution.
We need to find:
m, number of successes in the population
n, number of failures in the population
k, number of items drawn in the sample
x, number of observed successes in the sample
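In R these four quantities map onto the arguments of dhyper(x, m, n, k) and phyper(x, m, n, k). The sketch below uses made-up counts (4 successes and 6 failures in the population, 3 items drawn) purely to show that dhyper() matches the counting formula choose(m, x) * choose(n, k - x) / choose(m + n, k).

```r
# Hypothetical values, for illustration only
m <- 4      # successes in the population
n <- 6      # failures in the population
k <- 3      # items drawn
x <- 0:k    # every possible number of successes in the sample

by.counting <- choose(m, x) * choose(n, k - x) / choose(m + n, k)
by.dhyper <- dhyper(x, m, n, k)
all.equal(by.counting, by.dhyper)

# phyper() accumulates dhyper() from 0 up to x
all.equal(cumsum(by.dhyper), phyper(x, m, n, k))
```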
4. A pharmacist receives a shipment of 17 bottles of a drug and has 3 of the bottles tested. If 6 of the 17 bottles are contaminated, what is the probability that less than 2 of the tested bottles are contaminated?
Round your answer to four decimal places.
# Successes here are contaminated bottles
# m = 6 (of the 17 bottles are contaminated)
m.bottles<-6
# n = 11 (17-6=11 clean bottles)
n.bottles<-11
# k = 3 bottles in the test sample
k.bottles<-3
# x = 0 or 1, the values less than 2 contaminated bottles in sample
# use phyper() to find the cumulative probability P(X <= x)
x.bottles<-1
q4.prob<-round(phyper(x.bottles,m.bottles,n.bottles,k.bottles),4)
q4.prob
## [1] 0.7279
0.7279
There is a 72.79% probability that less than 2 of the tested bottles are contaminated.
5. A town recently dismissed 6 employees in order to meet their new budget reductions. The town had 6 employees over 50 years of age and 19 under 50. If the dismissed employees were selected at random, what is the probability that more than 1 employee was over 50?
Round your answer to four decimal places.
# Successes here are employees over 50
# m = 6 (of the 25 employees are over 50)
m.employees<-6
# n = 19 of the employees are under 50
n.employees<-19
# k = 6 employees in the dismissed sample
k.employees<- 6
# x = 1, as we want more than 1 (that is, 2 or more) in our sample
#The probability we want is 1 minus the probability that we get 1 or less
x.employees<-1
q5.prob<-round(1-phyper(x.employees,m.employees,n.employees,k.employees),4)
q5.prob
## [1] 0.4529
0.4529
There is a 45.29% probability that more than 1 of the dismissed employees is over 50, if they were selected at random.
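The upper tail can also be requested directly: phyper() accepts lower.tail = FALSE, which returns P(X > x) instead of P(X <= x). A brief sketch comparing the two routes (upper.tail and by.complement are illustrative names):

```r
m.employees <- 6    # employees over 50
n.employees <- 19   # employees under 50
k.employees <- 6    # employees dismissed

# P(X > 1) asked for directly
upper.tail <- phyper(1, m.employees, n.employees, k.employees,
                     lower.tail = FALSE)

# P(X > 1) as 1 - P(X <= 1)
by.complement <- 1 - phyper(1, m.employees, n.employees, k.employees)

round(c(upper.tail, by.complement), 4)
```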
Q6-Q9 are about the normal distribution, which is a continuous distribution and thus easier to handle than a discrete distribution. The mean and the standard deviation must be specified.
6. The weights of steers in a herd are distributed normally. The variance is 90,000 and the mean steer weight is 800 lbs. Find the probability that the weight of a randomly selected steer is between 1040 and 1460 lbs.
Round your answer to four decimal places.
#The mean steer weight is 800 lbs
mean.steer<-800
#The standard deviation is the square root of the variance
sd.steer<-sqrt(90000)
#take the difference to find probability of falling within range
q6.prob<-round(pnorm(1460,mean.steer,sd.steer)-pnorm(1040,mean.steer,sd.steer),4)
q6.prob
## [1] 0.198
0.1980
There is a 19.80% probability that the weight of a randomly selected steer is between 1040 and 1460 lbs.
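Standardizing gives the same answer: with z = (x - mean) / sd, the bounds 1040 and 1460 become z = 0.8 and z = 2.2 on the standard normal. A sketch (by.z and by.raw are illustrative names):

```r
mean.steer <- 800
sd.steer <- sqrt(90000)   # 300 lbs

# Convert both bounds to z-scores
z <- (c(1040, 1460) - mean.steer) / sd.steer
z

# Probability from the standard normal versus from the original scale
by.z <- pnorm(z[2]) - pnorm(z[1])
by.raw <- diff(pnorm(c(1040, 1460), mean.steer, sd.steer))

round(c(by.z, by.raw), 4)
```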
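7. The diameters of bearings are normally distributed with a mean of 106 millimeters and a standard deviation of 4 millimeters. Find the probability that the diameter of a selected bearing is between 103 and 111 millimeters.
Round your answer to four decimal places.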
#The mean diameter is 106 millimeters
mean.diameter<-106
#The standard deviation is 4 millimeters
sd.diameter<-4
#take the difference to find probability of falling within range
q7.prob<-round(pnorm(111,mean.diameter,sd.diameter)-pnorm(103,mean.diameter,sd.diameter),4)
q7.prob
## [1] 0.6677
0.6677
The probability that the diameter of a selected bearing is between 103 and 111 millimeters is 66.77%.
8. The lengths of nails produced in a factory are normally distributed with a mean of 3.34 centimeters and a standard deviation of 0.07 centimeters. Find the two lengths that separate the top 3% and the bottom 3%. These lengths could serve as limits used to identify which nails should be rejected.
Round your answer to the nearest hundredth (2 decimal places), if necessary. You will have to use the quantile function, qnorm(), here. In fact, we have seen a little bit of quantiles already when we talked about the median and boxplots.
#The mean length of nails produced is 3.34 centimeters
mean.length<-3.34
#The standard deviation is 0.07 centimeters
sd.length<-0.07
# cutoffs for bottom and top 3%
lower <- round(qnorm(0.03, mean.length, sd.length),2)
upper <- round(qnorm(0.97, mean.length, sd.length),2)
lower
## [1] 3.21
upper
## [1] 3.47
The lengths that could serve as limits used to identify which nails should be rejected are 3.21 cm on the lower end and 3.47 cm on the upper end.
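One way to sanity-check quantile cutoffs is to feed them back into pnorm(), which should recover the tail probabilities. The sketch below does this with the unrounded cutoffs and also confirms that they sit symmetrically around the mean.

```r
mean.length <- 3.34
sd.length <- 0.07

# Unrounded cutoffs for the bottom 3% and the top 3%
cutoffs <- qnorm(c(0.03, 0.97), mean.length, sd.length)

# pnorm() undoes qnorm(), so this recovers 0.03 and 0.97
recovered <- pnorm(cutoffs, mean.length, sd.length)
recovered

# The normal distribution is symmetric, so the midpoint is the mean
mean(cutoffs)
```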
9. A psychology professor assigns letter grades on a test according to the following scheme.
A: Top 9% of scores
B: Scores below the top 9% and above the bottom 63%
C: Scores below the top 37% and above the bottom 17%
D: Scores below the top 83% and above the bottom 8%
F: Bottom 8% of scores
Scores on the test are normally distributed with a mean of 75.8 and a standard deviation of 8.1. Find the minimum score required for an A grade.
Round your answer to the nearest whole number, if necessary.
#The mean test score is 75.8
mean.test.score<-75.8
#The standard deviation is 8.1
sd.test.score<-8.1
# cutoff for top 9%
GradeA<-round(qnorm(0.91,mean.test.score,sd.test.score))
GradeA
## [1] 87
The minimum score required for an A grade on this test is an 87.
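Because qnorm() is vectorized, the lower boundary of every grade band can be found in one call. The sketch below reads the scheme as cumulative proportions from the bottom: D starts above the bottom 8%, C above 17%, B above 63% and A above 91%. The vector name lower.bounds is illustrative.

```r
mean.test.score <- 75.8
sd.test.score <- 8.1

# Cumulative proportion of scores lying below each grade band
lower.bounds <- c(D = 0.08, C = 0.17, B = 0.63, A = 0.91)

# Minimum score for each grade, rounded to the nearest whole number
cutoffs <- round(qnorm(lower.bounds, mean.test.score, sd.test.score))
names(cutoffs) <- names(lower.bounds)
cutoffs
```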
10. Consider the probability that exactly 96 out of 155 computers will not crash in a day. Assume the probability that a given computer will not crash in a day is 61%. Approximate the (binomial) probability using the normal distribution.
Round your answer to four decimal places.
#For the binomial, we need N and p
# N = 155 computers in the sample
N.comp<-155
# probability of not crashing = 61%
p.nocrash<-0.61
#For normal, we need the mean and the sd
mean.nocrashes<- N.comp * p.nocrash
mean.nocrashes
## [1] 94.55
#For sd, take the square root of formula n*p*(1-p)
sd.q10<-sqrt(N.comp*p.nocrash*(1-p.nocrash))
#Because we are looking for exactly 96, not a range on the continuous distribution, we have to apply a continuity correction, taking the difference between the cumulative probabilities at 96.5 and 95.5
q10.prob<-pnorm(96.5,mean.nocrashes,sd.q10)-pnorm(95.5,mean.nocrashes,sd.q10)
round(q10.prob,4)
## [1] 0.0638
Using the normal approximation, there is about a 6.38% chance that exactly 96 out of 155 computers will not crash in a day.
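Since R can evaluate the binomial directly, the quality of the normal approximation can be checked against dbinom(). A sketch comparing the two (mu, sigma, approx.prob and exact.prob are illustrative names):

```r
N.comp <- 155
p.nocrash <- 0.61

# Mean and standard deviation of the matching normal distribution
mu <- N.comp * p.nocrash
sigma <- sqrt(N.comp * p.nocrash * (1 - p.nocrash))

# Normal approximation with continuity correction
approx.prob <- pnorm(96.5, mu, sigma) - pnorm(95.5, mu, sigma)

# Exact binomial probability of exactly 96
exact.prob <- dbinom(96, size = N.comp, prob = p.nocrash)

round(c(normal.approx = approx.prob, exact = exact.prob), 4)
```

With N this large and p not close to 0 or 1, the two values should be close, which is why the approximation is reasonable here.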