n <- 20 #Sample size
p <- 0.5 #Probability of red preference
kseq <- (0:n)
barplot(dbinom(kseq, n, p), names.arg = kseq, ylab = "Probability", xlab = "# of Buyers Preferring Red", cex.names = 0.8)
Due to the small sample size, and relatively high rate of red preference (which we will treat like a success), the Binomial distribution will be more appropriate than the Poisson distribution. We can find the probability of 9-12 buyers preferring red by taking the CDF of k = 12 and k = 9 individually, then subtracting them from each other.
k1 <- 8 #Since the range is inclusive, we enter 8 instead of 9.
k2 <- 12
ck1 <- pbinom(k1, n, p) #CDF with 9 buyers preferring red
ck2 <- pbinom(k2, n, p) #CDF with 12 buyers preferring red
round(ck2-ck1, 4)
## [1] 0.6167
The probability that between 9 and 12 (both inclusive) buyers would prefer red is 61.67%. We can double check the math by adding up the PDF results for k = 9, 10, 11, and 12.
round(dbinom(9, n, p) + dbinom(10, n, p) + dbinom(11, n, p) + dbinom(12, n, p),4)
## [1] 0.6167
n <- 13 #Sample size
p <- 0.2 #Probability of defect
kseq <- (0:n)
barplot(dbinom(kseq, n, p), names.arg = kseq, ylab = "Probability", xlab = "# of Defective Bulbs", cex.names = 0.8)
Again, due to the small sample size and relatively high rate of failure, we will use the Binomial distribution in a way very similar to the last problem.
k1 <- 3
k2 <- 5 #Since range is exclusive, we enter 5 instead of 6
ck1 <- pbinom(k1, n, p) #CDF with 3 defective bulbs
ck2 <- pbinom(k2, n, p) #CDF with 6 defective bulbs
round(ck2-ck1, 4)
## [1] 0.2226
The probability of there being more than 3 but less than 6 defective bulbs is 22.26%. Again we will check the result by adding up the appropriate PDF results.
round(dbinom(4, n, p) + dbinom(5, n, p), 4)
## [1] 0.2226
lambda <- 4.2 #rate of occurrence
kseq <- (0:15)
barplot(dpois(kseq, lambda), names.arg = kseq, ylab = "Probability", xlab = "# of Special Orders", cex.names = 0.8)
Since we are just given a rate of occurrence in absolute terms, and no population size, our best bet is to use the Poisson distribution here.
k <- 3 #Since we are just looking for the area to the left of k = 3, we only need this one point.
ck <- ppois(k, lambda) #CDF with 3 or less special orders
round(ck, 4)
## [1] 0.3954
The probability of there being no more than 3 sepcial orders per day is 39.54%. We will again double check the results by adding up the relevant PDF results.
round(dpois(0, lambda) + dpois(1, lambda) + dpois(2, lambda) + dpois(3, lambda), 4)
## [1] 0.3954
contam <- 6
clean <- 11
sample <- 3
kseq <- (0:3)
barplot(dhyper(kseq, contam, clean, sample), names.arg = kseq, ylab = "Probability", xlab = "# of Contaminated Bottles", cex.names = 0.8)
Here we will have to use the hypergeometric distribution, as we will be drawing from a finite population without replacement.
k <- 1 #Since we are looking for 2 or less, we will take the CDF of k = 1
ck <- phyper(k, contam, clean, sample)
round(ck, 4)
## [1] 0.7279
The probability of the sample of 3 has less than 2 contaminated bottles is 72.79%. Double checking the result by adding up the PDF results…
round(dhyper(0, contam, clean, sample) + dhyper(1, contam, clean, sample), 4)
## [1] 0.7279
over <- 6
under <- 19
sample <- 6
kseq <- (0:6)
barplot(dhyper(kseq, over, under, sample), names.arg = kseq, ylab = "Probability", xlab = "# of Employees Over 50", cex.names = 0.8)
Again we are sampling without replacement, so we will use the hypergeometic distrobution.
k <- 1 #find the area to the left of k = 1
ck <- phyper(k, over, under, sample)
round(1 - ck, 4) #To get the area to the right, subtract our earlier result from 1
## [1] 0.4529
The probability of more than 1 of the 6 fired employee being over 50 is 45.29%. Double checking our results by adding up the PDF results gives us the same result.
round(dhyper(2, over, under, sample) + dhyper(3, over, under, sample) + dhyper(4, over, under, sample)
+ dhyper(5, over, under, sample) + dhyper(6, over, under, sample), 4)
## [1] 0.4529
mean <- 800
variance <- 90000
sd <- sqrt(variance)
curve(dnorm(x, mean = mean, sd = sd), from = 0, to = 1600, xlab = "Weight (lbs)", ylab = "Probability")
Since we are now working with a continuous curve as opposed to discrete “buckets” it is a bit simpler to compute the cumulative distribution functions needed.
k1 <- 1040 #lower bound
k2 <- 1460 #upper bound
ck1 <- pnorm(k1, mean = mean, sd = sd) #take CDF function up to lower bound
ck2 <- pnorm(k2, mean = mean, sd = sd) #take CDF function up to upper bound
round(ck2-ck1, 4) #find the area between
## [1] 0.198
The probability that the weight of a random steer is between 1040 and 1460 lbs is about 19.80%.
mean <- 106
sd <- 4
curve(dnorm(x, mean = mean, sd = sd), from = 90, to = 122, xlab = "Diameter (mm)", ylab = "Probability")
Again we will be using a normal distribution:
k1 <- 103 #lower bound
k2 <- 111 #upper bound
ck1 <- pnorm(k1, mean = mean, sd = sd) #take CDF function up to lower bound
ck2 <- pnorm(k2, mean = mean, sd = sd) #take CDF function up to upper bound
round(ck2-ck1, 4) #find the area between
## [1] 0.6677
The probability that the diameter of a randomly selected bearing is between 103 and 111 mm is about 66.77%
mean <- 3.34
sd <- 0.07
curve(dnorm(x, mean = mean, sd = sd), from = 3, to = 3.7, xlab = "Length (cm)", ylab = "Probability")
Next we can use the normal quantile function to see what values mark the 3rd and 97th percentile of length.
round(qnorm(0.03, mean = mean, sd = sd), 2) #find the value below which, 3% of nails fall
## [1] 3.21
round(qnorm(0.97, mean = mean, sd = sd), 2) #find the value below which, 97% of nails fall
## [1] 3.47
mean <- 75.8
sd <- 8.1
curve(dnorm(x, mean = mean, sd = sd), from = 0, to = 100, xlab = "Test Score", ylab = "Probability")
If one must be in the top 9% of scores to get an A, then we can find the minimum score required for an A by identifying the 91st percentile. If you are above this number, you are in the top 9% and will get an A. We can easily find it using the quantile function from the last problem.
round(qnorm(0.91, mean = mean, sd = sd), 4) #find the value above which is the top 9% of scores
## [1] 86.6601
If one scores and 86.66 (87 with rounding) or higher on the test, they should get an A.
n <- 155 #sample size
p <- 0.61 #probability of not crashing
#Calculate parameters for normal distribution
mean <- n*p
sd <- sqrt(n*p*(1-p))
kseq <- (75:115)
plot(kseq, dbinom(kseq, n, p), type = "h", lwd = 5, ylab = "Probability", xlab = "# of Non-Crashing Computers",
main = "Normal Approximation of Binonmial Distribution")
lines(kseq, dnorm(kseq, mean = mean, sd = sd), lwd = 2)
With the sample size large enough, we are able to perform a reasonable normal approximation of the binomial distribution. Below we calculate the probability of exactly 96 out 155 computers not crashing using both the normal and binomial density functions.
round(dnorm(96, mean = mean, sd = sd), 4)
## [1] 0.0639
round(dbinom(96, n, p), 4)
## [1] 0.064
The 2 results are nearly identical with only a difference of 0.01%