Multiple Births: The numbers of various multiple births in the United States for the past 10 years are listed. Find the mean, median, range, variance, and standard deviation of each data set. Find the value that corresponds to the \(92^{nd}\) percentile. Find the \(1^{st}\), \(2^{nd}\), and \(3^{rd}\) quartiles. Which set of data is the most variable?
| Triplets | Quadruplets | Quintuplets |
|---|---|---|
| 5877 7110 5937 6898 6118 6885 6208 6742 6750 6742 | 345 468 369 434 355 501 418 506 439 512 | 46 85 91 69 67 85 68 77 86 67 |
triplets <- c(5877, 7110, 5937, 6898, 6118, 6885, 6208, 6742, 6750, 6742)
quadruplets <- c(345, 468, 369, 434, 355, 501, 418, 506, 439, 512)
quintuplets <- c(46, 85, 91, 69, 67, 85, 68, 77, 86, 67)
# Triplets Descriptive Statistics
# mean(triplets)
# median(triplets)
# range(triplets)
# var(triplets)
# sd(triplets)
# scale(triplets)
quantile(triplets, 0.92)
## 92%
## 6957.36
# quantile(triplets, c(0.25, 0.50, 0.75))
# fivenum(triplets)
# IQR(triplets)
# Quadruplets Descriptive Statistics
# mean(quadruplets)
# median(quadruplets)
# range(quadruplets)
# var(quadruplets)
# sd(quadruplets)
# scale(quadruplets)
quantile(quadruplets, 0.92)
## 92%
## 507.68
# quantile(quadruplets, c(0.25, 0.50, 0.75))
# fivenum(quadruplets)
# IQR(quadruplets)
# Quintuplets Descriptive Statistics
# mean(quintuplets)
# median(quintuplets)
# range(quintuplets)
# var(quintuplets)
# sd(quintuplets)
# scale(quintuplets)
quantile(quintuplets, 0.92)
## 92%
## 87.4
# quantile(quintuplets, c(0.25, 0.50, 0.75))
# fivenum(quintuplets)
# IQR(quintuplets)
triplets.sum <- summary(triplets)
quadruplets.sum <- summary(quadruplets)
quintuplets.sum <- summary(quintuplets)
summ <- cbind(triplets.sum, quadruplets.sum, quintuplets.sum)
summ
## triplets.sum quadruplets.sum quintuplets.sum
## Min. 5877.00 345.00 46.00
## 1st Qu. 6140.50 381.25 67.25
## Median 6742.00 436.50 73.00
## Mean 6526.70 434.70 74.10
## 3rd Qu. 6851.25 492.75 85.00
## Max. 7110.00 512.00 91.00
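The exercise also asks for the mean, median, range, variance, and standard deviation, which the listing keeps commented out. A minimal sketch that collects them in one table follows; the helper name `describe` is illustrative and not part of the original, and the range is assumed to mean maximum minus minimum.

```r
triplets    <- c(5877, 7110, 5937, 6898, 6118, 6885, 6208, 6742, 6750, 6742)
quadruplets <- c(345, 468, 369, 434, 355, 501, 418, 506, 439, 512)
quintuplets <- c(46, 85, 91, 69, 67, 85, 68, 77, 86, 67)

# Hypothetical helper: one named vector of descriptive statistics per data set
describe <- function(x) {
  c(mean     = mean(x),
    median   = median(x),
    range    = diff(range(x)),  # max - min
    variance = var(x),          # sample variance (n - 1 denominator)
    sd       = sd(x))
}

# One column per data set, one row per statistic
stats <- sapply(list(triplets    = triplets,
                     quadruplets = quadruplets,
                     quintuplets = quintuplets),
                describe)
round(stats, 2)
```

Both `var()` and `sd()` use the sample (n - 1) formulas, which is appropriate if the 10 years are treated as a sample.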
# The data sets have very different means, so compare variability with the
# coefficient of variation (sd / mean); the largest CV marks the most variable set
triplets.cv <- sd(triplets)/mean(triplets)
quadruplets.cv <- sd(quadruplets)/mean(quadruplets)
quintuplets.cv <- sd(quintuplets)/mean(quintuplets)
cvsum <- cbind(triplets.cv, quadruplets.cv, quintuplets.cv)
cvsum
## triplets.cv quadruplets.cv quintuplets.cv
## [1,] 0.06828257 0.1446333 0.1814433
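To read the answer off programmatically, the coefficients of variation can be stored in a named vector and the largest one selected. This is a small sketch under the same data; the names `cv` and `most_variable` are illustrative.

```r
triplets    <- c(5877, 7110, 5937, 6898, 6118, 6885, 6208, 6742, 6750, 6742)
quadruplets <- c(345, 468, 369, 434, 355, 501, 418, 506, 439, 512)
quintuplets <- c(46, 85, 91, 69, 67, 85, 68, 77, 86, 67)

# Coefficient of variation (sd / mean) for each data set
cv <- c(triplets    = sd(triplets) / mean(triplets),
        quadruplets = sd(quadruplets) / mean(quadruplets),
        quintuplets = sd(quintuplets) / mean(quintuplets))

# Name of the data set with the largest relative spread
most_variable <- names(which.max(cv))
most_variable
```

With the values printed above, the quintuplets have the largest coefficient of variation and are therefore the most variable relative to their mean.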
Favorite Coffee Flavor: A survey asked people for their favorite flavor of coffee drink. The responses were V = Vanilla, C = Caramel, M = Mocha, H = Hazelnut, and P = Plain. Construct a categorical frequency distribution for the data. Which class has the most data values and which class has the fewest data values?
coffee <- c("V","C", "P", "P", "M", "M", "P", "P", "M", "C",
"M", "M", "V", "M", "M", "M", "V", "M", "M", "M",
"P", "V", "C", "M", "V", "M", "C", "P", "M", "P",
"M", "M", "M", "P", "M", "M", "C", "V", "M", "C",
"C", "P", "M", "P", "M", "H", "H", "P", "H", "P")
# Count each flavor, then add relative (%) and cumulative frequencies
FreqTable <- as.data.frame(table(coffee))
FreqTable$RelFreq <- prop.table(FreqTable$Freq)*100
FreqTable$CumFreq <- cumsum(FreqTable$Freq)
FreqTable
## coffee Freq RelFreq CumFreq
## 1 C 7 14 7
## 2 H 3 6 10
## 3 M 22 44 32
## 4 P 12 24 44
## 5 V 6 12 50
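The classes with the most and fewest values can be picked out of the counts directly. The sketch below compares against the maximum and minimum so that ties, if any, would all be reported; the names `counts`, `most`, and `fewest` are illustrative.

```r
coffee <- c("V", "C", "P", "P", "M", "M", "P", "P", "M", "C",
            "M", "M", "V", "M", "M", "M", "V", "M", "M", "M",
            "P", "V", "C", "M", "V", "M", "C", "P", "M", "P",
            "M", "M", "M", "P", "M", "M", "C", "V", "M", "C",
            "C", "P", "M", "P", "M", "H", "H", "P", "H", "P")

counts <- table(coffee)

# Classes whose frequency equals the largest / smallest count
most   <- names(counts)[counts == max(counts)]
fewest <- names(counts)[counts == min(counts)]

most
fewest
```

For this survey, Mocha (M, 22 responses) has the most data values and Hazelnut (H, 3 responses) has the fewest, matching the frequency table.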
Stories in the World’s Tallest Buildings: The number of stories in each of a sample of the world’s 30 tallest buildings follows. Construct a histogram and a box plot.
stories <- c(88,88, 110, 88, 80, 69, 102, 78, 70, 55,
79, 85, 80, 100, 60, 90, 77, 55, 75, 55,
54, 60, 75, 64, 105, 56, 71, 70, 65, 72)
hs <- hist(stories,
           xlab = "Stories per building",
           ylab = "Frequency",
           col = "cadetblue4",
           main = "Number of Stories for 30 Tallest Buildings")
# A single box is enough: plotting an identical copy of the data adds nothing
bp <- boxplot(stories,
              horizontal = TRUE,
              col = "blue",
              xlab = "Number of stories",
              main = "Number of Stories for 30 Tallest Buildings")
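The box in a box plot is drawn from the five-number summary, and points beyond 1.5 times the box length from the hinges are flagged as outliers. A short sketch showing those numbers for the same data, using base R's `fivenum()` and `boxplot.stats()`; the variable names `five` and `outliers` are illustrative.

```r
stories <- c(88, 88, 110, 88, 80, 69, 102, 78, 70, 55,
             79, 85, 80, 100, 60, 90, 77, 55, 75, 55,
             54, 60, 75, 64, 105, 56, 71, 70, 65, 72)

# Minimum, lower hinge, median, upper hinge, maximum
five <- fivenum(stories)
five

# Values that boxplot() would draw as individual outlier points
outliers <- boxplot.stats(stories)$out
outliers
```

An empty `outliers` vector means the whiskers extend all the way to the minimum and maximum.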
A motorist claims that the South Boro Police issue an average of 60 speeding tickets per day. These data show the number of speeding tickets issued each day for a randomly selected period of 30 days. Assume \(\sigma= 13.42\). Is there enough evidence to reject the motorist’s claim at \(\alpha = 0.05\)? Use the P-value method.
speeding <- c(72, 45, 36, 68, 69, 71, 57, 60,
83, 26, 60, 72, 58, 87, 48, 59,
60, 56, 64, 68, 42, 57, 57,
58, 63, 49, 73, 75, 42, 63)
sigma <- 13.42
mu <- 60
n <- length(speeding)
# 95% confidence interval (matches the two-tailed test at alpha = 0.05)
x_bar <- mean(speeding)
z <- qnorm(0.025, lower.tail = FALSE)
E <- z*sigma/sqrt(n)
LL <- x_bar-E
UL <- x_bar+E
CL <- cbind(LL, UL)
CL
## LL UL
## [1,] 55.13114 64.73553
# Hypothesis Testing
z_test <- (mean(speeding)-mu)/(sigma/sqrt(n))
p_value <- 2*pnorm(abs(z_test), lower.tail = FALSE)
p_value
## [1] 0.9782928
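The P-value method ends with a comparison against \(\alpha\): reject \(H_0: \mu = 60\) when the P-value is at most \(\alpha\), otherwise fail to reject. A minimal sketch of that last step, assuming a two-tailed test with known \(\sigma\); the names `mu0`, `alpha`, and `decision` are illustrative.

```r
speeding <- c(72, 45, 36, 68, 69, 71, 57, 60,
              83, 26, 60, 72, 58, 87, 48, 59,
              60, 56, 64, 68, 42, 57, 57,
              58, 63, 49, 73, 75, 42, 63)
sigma <- 13.42   # population standard deviation (given)
mu0   <- 60      # claimed mean under H0
alpha <- 0.05

# z statistic and two-tailed P-value
z_test  <- (mean(speeding) - mu0) / (sigma / sqrt(length(speeding)))
p_value <- 2 * pnorm(abs(z_test), lower.tail = FALSE)

# Decision rule of the P-value method
decision <- if (p_value <= alpha) "reject H0" else "fail to reject H0"
decision
```

Because the P-value (about 0.978) is far above 0.05, there is not enough evidence to reject the motorist's claim of 60 tickets per day.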
Is there a linear relationship between the monthly average temperatures and the number of homicides committed during the month?
If so, how strong is the relationship between the average monthly temperature and the number of homicides committed?
If a relationship exists, can it be said that an increase in temperatures will cause an increase in the number of homicides occurring in that city?
month <- c("January", "February", "March", "April", "May", "June", "July", "August", "September", "October", "November", "December")
temperature <- c(32,38,47,59, 70,80,84,83,76,64,49,37)
homicides <- c(32,20,35,35, 49,49,53,56,62,29,36,32)
# attach() is not needed: the vectors already exist in the global environment
data <- data.frame(month, temperature, homicides)
cor(temperature, homicides)
## [1] 0.8357474
cor.test(temperature, homicides)
##
## Pearson's product-moment correlation
##
## data: temperature and homicides
## t = 4.813, df = 10, p-value = 0.0007097
## alternative hypothesis: true correlation is not equal to 0
## 95 percent confidence interval:
## 0.5031981 0.9526994
## sample estimates:
## cor
## 0.8357474
model <- lm(homicides ~ temperature)
plot(temperature, homicides)
abline(model)
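The strength of the linear relationship can also be read from the fitted model: for a simple linear regression, the coefficient of determination equals the squared correlation. The sketch below extracts it along with the fitted slope; the names `r` and `r_squared` are illustrative.

```r
temperature <- c(32, 38, 47, 59, 70, 80, 84, 83, 76, 64, 49, 37)
homicides   <- c(32, 20, 35, 35, 49, 49, 53, 56, 62, 29, 36, 32)

model <- lm(homicides ~ temperature)

r         <- cor(temperature, homicides)   # strength and direction
r_squared <- summary(model)$r.squared      # share of variance explained

coef(model)                                # intercept and slope
c(r = r, r_squared = r_squared)
```

A strong positive correlation from observational monthly data like this does not by itself show that higher temperatures cause more homicides; other factors that vary with the seasons could drive both.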
A vending machine automatically pours soft drinks into cups. The amount of soft drink dispensed into a cup is normally distributed with mean of 7.6 oz and standard deviation of 0.4 oz.
(a) What is the probability that the machine will overflow an 8 oz cup?
(b) What is the probability that the amount dispensed by the machine is between 7.2 and 8.0 oz?
(c) What is the probability the average amount dispensed in a random sample of 9 cups is less than 7.4 oz?
(d) Use the normal approximation to the binomial to find the probability that, in a random sample of 19 cups, 5 or more will not overflow an 8 oz cup.
Let \(X\) = amount of soft drink dispensed into a cup (oz).
\[\mu=7.6,~~ \sigma=0.4\]
(a) What is the probability that the machine will overflow an 8 oz cup?
By Hand:
\[P(X>8)=1-P(X<8)=1-P\left(Z<\frac{8-7.6}{0.4}\right)=1-P(Z<1)=1-0.8413=0.1587\]
By R:
a <- pnorm(8, 7.6, 0.4, lower.tail = FALSE)
cat(" the probability that the machine will overflow an 8 oz cup is: ", a)
## the probability that the machine will overflow an 8 oz cup is: 0.1586553
(b) What is the probability that the amount dispensed by the machine is between 7.2 and 8.0 oz?
By Hand:
\[P(7.2<X<8.0)=P(X<8)-P(X<7.2)=P\left(Z<\frac{8-7.6}{0.4}\right)-P\left(Z<\frac{7.2-7.6}{0.4}\right)=P(Z<1)-P(Z<-1)=0.6827\]
By R:
b <- pnorm(8, 7.6, 0.4, lower.tail = TRUE)-pnorm(7.2, 7.6, 0.4, lower.tail = TRUE)
cat(" the probability that the amount dispensed by the machine is between 7.2 and 8.0 oz is: ", b)
## the probability that the amount dispensed by the machine is between 7.2 and 8.0 oz is: 0.6826895
(c) What is the probability the average amount dispensed in a random sample of 9 cups is less than 7.4 oz?
By Hand:
\[n=9,~~ \sigma_{\bar{x}}=\frac{\sigma}{\sqrt{n}}=\frac{0.4}{\sqrt{9}}=0.1333 \longrightarrow P(\bar{X}<7.4) = P\left(Z<\frac{7.4-7.6}{0.1333}\right) = P(Z<-1.5) = 0.0668\]
By R:
c <- pnorm(7.4, 7.6, 0.4/sqrt(9), lower.tail = TRUE)
cat(" the probability the average amount dispensed in a random sample of 9 cups is less than 7.4 oz is: ", c)
## the probability the average amount dispensed in a random sample of 9 cups is less than 7.4 oz is: 0.0668072
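(d) Use the normal approximation to the binomial to find the probability that, in a random sample of 19 cups, 5 or more will not overflow an 8 oz cup.
By Hand:
Let \(Y\) = number of cups out of 19 that do not overflow, with \(p = P(X \le 8) = 0.8413\) and \(q = 1-p = 0.1587\).
\[n=19,~~ p=0.8413 \longrightarrow \mu=np=15.9847, ~~ \sigma = \sqrt{npq}=1.5927\]
With the continuity correction, "5 or more" corresponds to values above 4.5:
\[P(Y\ge5)\approx 1-P\left(Z < \frac{4.5-15.9847}{1.5927}\right)=1-P(Z<-7.21)\approx 1\]
By R:
d <- pnorm(4.5, 15.9847, 1.5927, lower.tail = FALSE)
cat(" the probability that in a random sample of 19 cups 5 or more will not overflow an 8 oz cup is: ", d)
## the probability that in a random sample of 19 cups 5 or more will not overflow an 8 oz cup is: 1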
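As a cross-check on the approximation, the exact binomial probability can be computed with `pbinom()`. The sketch below assumes each cup is an independent trial with success probability \(P(X \le 8)\), and it applies a continuity correction (4.5 instead of 5) in the normal approximation, which is a choice of this sketch; the names `p_no_overflow`, `exact`, and `normal_approx` are illustrative.

```r
n <- 19

# Probability that a single cup does not overflow: P(X <= 8)
p_no_overflow <- pnorm(8, mean = 7.6, sd = 0.4)

# Exact binomial: P(Y >= 5) = 1 - P(Y <= 4)
exact <- pbinom(4, size = n, prob = p_no_overflow, lower.tail = FALSE)

# Normal approximation with continuity correction: P(Y > 4.5)
normal_approx <- pnorm(4.5,
                       mean = n * p_no_overflow,
                       sd   = sqrt(n * p_no_overflow * (1 - p_no_overflow)),
                       lower.tail = FALSE)

c(exact = exact, normal_approx = normal_approx)
```

Both values are essentially 1. Note that \(nq \approx 3\) here, below the usual rule of thumb of 5, so the normal approximation is rough in general even though it agrees with the exact answer this far out in the tail.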