Case 1: Testing the difference between the means of two groups when the sample size >= 30 and population standard deviations are known.
Example: A survey found that the average hotel room rate in New Orleans is $88.42 and the average room rate in Phoenix is $80.61. Assume that the data were obtained from two samples of 50 hotels each and that the standard deviations of the populations are $5.62 and $4.83, respectively. At α = 0.05, can it be concluded that there is a significant difference in the rates?
Step 1: State the hypotheses
Null Hypothesis: H0: μ1 = μ2 (No difference in the rates between New Orleans and Phoenix)
Alternate Hypothesis: H1: μ1 ≠ μ2 (The rates in New Orleans and Phoenix are not the same)
Step 2: State the test statistic and the significance level
We will use the z-test statistic with significance level α = 0.05 (since the sample sizes are ≥ 30 and the population standard deviations are known)
Step 3: State the decision rule
We will reject the null hypothesis if the p-value < α = 0.05
Step 4: Compute the test statistics.
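library(BSDA) # provides z.test()
set.seed(40)
# Only summary statistics are given, so the raw samples are simulated around the stated means
no_rates <- rnorm(50, mean=88.42) # simulated rates for 50 New Orleans hotels (rnorm's default sd = 1)
ph_rates <- rnorm(50, mean=80.61) # simulated rates for 50 Phoenix hotels (rnorm's default sd = 1)
no_sd <- 5.62 # known population standard deviation, New Orleans
ph_sd <- 4.83 # known population standard deviation, Phoenix
z.test(x=no_rates, y=ph_rates, alternative = "two.sided", mu=0, sigma.x = no_sd, sigma.y = ph_sd)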
##
## Two-sample z-Test
##
## data: no_rates and ph_rates
## z = 7.3694, p-value = 1.714e-13
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
## 5.668988 9.777001
## sample estimates:
## mean of x mean of y
## 88.44284 80.71985
Step 5: State conclusion
As the p-value is less than α = 0.05, we reject the null hypothesis.
Hence, there is significant evidence that the average hotel room rates in New Orleans and Phoenix are not the same.
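Because the example supplies only summary statistics, the z statistic can also be computed directly from them, without simulating any data. The following is a minimal sketch in base R; the variable names (`mean_no`, `se_diff`, and so on) are illustrative and not part of the original analysis.

```r
# Summary statistics taken from the example
mean_no <- 88.42; mean_ph <- 80.61   # sample means
sd_no   <- 5.62;  sd_ph   <- 4.83    # known population standard deviations
n_no    <- 50;    n_ph    <- 50      # sample sizes

# Standard error of the difference in means
se_diff <- sqrt(sd_no^2 / n_no + sd_ph^2 / n_ph)

# z statistic under H0: mu1 - mu2 = 0
z <- (mean_no - mean_ph) / se_diff

# Two-sided p-value from the standard normal distribution
p_value <- 2 * pnorm(-abs(z))

z                # roughly 7.45
p_value < 0.05   # TRUE, so H0 is rejected
```

This value differs slightly from the z = 7.3694 in the output above because the simulated sample means (88.44 and 80.72) are not exactly equal to the stated means. The conclusion is the same either way.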
Case 2: Testing the difference between the means of two groups when the sample size < 30 and the sample standard deviations are known
Example: The average size of a farm in Indiana County, Pennsylvania, is 191 acres. The average size of a farm in Greene County, Pennsylvania, is 199 acres. Assume the data were obtained from two samples with standard deviations of 38 and 12 acres, respectively, and sample sizes of 8 and 10, respectively. Can it be concluded at α = 0.05 that the average size of the farms in the two counties is different? Assume the populations are normally distributed.
Step 1: State the hypotheses
Null Hypothesis: H0: μ1 = μ2 (No difference in the average size of farms between Indiana County and Greene County)
Alternate Hypothesis: H1: μ1 ≠ μ2 (The average sizes of the farms are not the same)
Step 2: State the test statistic and the significance level
We will use the t-test statistic with significance level α = 0.05 (since the sample sizes are < 30 and only the sample standard deviations are known)
Step 3: State the decision rule
We will reject the null hypothesis if the p-value < α = 0.05
Step 4: Compute the t-test statistics.
set.seed(40)
indiana <- rnorm(8, mean=191, sd=38) # simulated sample of 8 farms for Indiana County (mean 191, sd 38)
greene <- rnorm(10, mean=199, sd=12) # simulated sample of 10 farms for Greene County (mean 199, sd 12)
# t.test() does not assume equal variances by default, so this runs Welch's t-test
stats::t.test(x=indiana, y=greene, alternative = "two.sided", mu=0)
##
## Welch Two Sample t-test
##
## data: indiana and greene
## t = -1.1568, df = 7.6071, p-value = 0.2824
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
## -51.93106 17.44325
## sample estimates:
## mean of x mean of y
## 181.4210 198.6649
Step 5: State conclusion
As the p-value is greater than α = 0.05, we cannot reject the null hypothesis.
Hence, there is no significant evidence that the average farm size in Indiana County differs from that in Greene County.
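As with Case 1, the example gives only summary statistics, so the Welch t statistic can be worked out directly from them. The sketch below uses base R only; the variable names are illustrative, and the degrees of freedom use the Welch-Satterthwaite approximation, which is what `t.test` applies when equal variances are not assumed.

```r
# Summary statistics taken from the example
mean_in <- 191; mean_gr <- 199   # sample means (acres)
sd_in   <- 38;  sd_gr   <- 12    # sample standard deviations
n_in    <- 8;   n_gr    <- 10    # sample sizes

# Variance of each sample mean
v_in <- sd_in^2 / n_in
v_gr <- sd_gr^2 / n_gr

# Welch t statistic (variances not assumed equal)
t_stat <- (mean_in - mean_gr) / sqrt(v_in + v_gr)

# Welch-Satterthwaite degrees of freedom
df_welch <- (v_in + v_gr)^2 / (v_in^2 / (n_in - 1) + v_gr^2 / (n_gr - 1))

# Two-sided p-value from the t distribution
p_value <- 2 * pt(-abs(t_stat), df_welch)

t_stat           # roughly -0.57
df_welch         # roughly 8.1
p_value > 0.05   # TRUE, so H0 is not rejected
```

The numbers differ from the simulated output above because a random sample of 8 or 10 values rarely reproduces the stated mean and standard deviation exactly, but the decision is unchanged.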
Case 3: Testing the difference between the means of two groups when the sample size < 30, the sample standard deviations are known, and the samples are dependent or paired.
Same as the two-sample t-test, except that you also need to pass the argument paired = TRUE.
Example: t.test(x=indiana, y=greene, alternative = "two.sided", paired = TRUE, mu=0)
Note - The sample size must be the same for both groups. The indiana and greene vectors from Case 2 have 8 and 10 observations, so they would have to be replaced with equal-length paired samples for this call to run.
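A paired t-test is equivalent to a one-sample t-test on the within-pair differences. The sketch below illustrates this with made-up data: `before` and `after` are hypothetical equal-length vectors invented for this illustration, not data from any of the examples.

```r
set.seed(40)
# Hypothetical paired measurements on the same 10 subjects
before <- rnorm(10, mean = 100, sd = 10)
after  <- before + rnorm(10, mean = 2, sd = 5)

# Paired t-test: x and y must have the same length
paired_res <- t.test(x = before, y = after, alternative = "two.sided",
                     paired = TRUE, mu = 0)

# Equivalent one-sample t-test on the pairwise differences
diff_res <- t.test(before - after, alternative = "two.sided", mu = 0)

# Both calls give the same t statistic and p-value
paired_res$statistic
diff_res$statistic
```

Pairing matters because it removes subject-to-subject variation from the comparison, so the order of observations in the two vectors must correspond.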
Case 4: Testing the difference between the two proportions when the sample size > 30
Example: In a nursing home study, the researchers found that 12 out of 34 small nursing homes had a resident vaccination rate of less than 80%, while 17 out of 24 large nursing homes had a vaccination rate of less than 80%. At α = 0.05, test the claim that there is no difference in the proportions of the small and large nursing homes with a resident vaccination rate of less than 80%.
Step 1: State the hypotheses
Null Hypothesis: H0: p1 = p2 (The proportions for small nursing homes and large nursing homes are the same)
Alternate Hypothesis: H1: p1 ≠ p2 (The proportions are not the same)
Step 2: State the test statistic and the significance level
We will use the two-proportion z-test statistic with significance level α = 0.05. (With correct = FALSE, R's prop.test reports X-squared, which is the square of this z statistic.)
Step 3: State the decision rule
We will reject the null hypothesis if the p-value < α = 0.05
Step 4: Compute the z-test statistics.
n1 <- 34 # number of small nursing homes sampled
x1 <- 12 # small nursing homes with a vaccination rate below 80%
x2 <- 17 # large nursing homes with a vaccination rate below 80%
n2 <- 24 # number of large nursing homes sampled
prop.test(x=c(x1,x2), n=c(n1,n2), alternative = "two.sided", correct = FALSE)
##
## 2-sample test for equality of proportions without continuity correction
##
## data: c(x1, x2) out of c(n1, n2)
## X-squared = 7.1078, df = 1, p-value = 0.007675
## alternative hypothesis: two.sided
## 95 percent confidence interval:
## -0.5980250 -0.1127593
## sample estimates:
## prop 1 prop 2
## 0.3529412 0.7083333
Step 5: State conclusion
As the p-value is less than α = 0.05, we reject the null hypothesis.
Hence, there is significant evidence that the proportions of small and large nursing homes with a resident vaccination rate of less than 80% are not the same.
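The output above reports X-squared rather than z. To make the link to the two-proportion z-test explicit, the statistic can be computed by hand with the pooled proportion. This is a minimal base R sketch; `p_pool` and `se` are names introduced here for illustration.

```r
x1 <- 12; n1 <- 34   # small nursing homes: count below 80%, sample size
x2 <- 17; n2 <- 24   # large nursing homes: count below 80%, sample size

# Sample proportions
p1 <- x1 / n1
p2 <- x2 / n2

# Pooled proportion under H0: p1 = p2
p_pool <- (x1 + x2) / (n1 + n2)

# Standard error of the difference under H0
se <- sqrt(p_pool * (1 - p_pool) * (1 / n1 + 1 / n2))

# z statistic and two-sided p-value
z <- (p1 - p2) / se
p_value <- 2 * pnorm(-abs(z))

z        # roughly -2.67
z^2      # roughly 7.11, matching X-squared in the prop.test output
p_value  # roughly 0.0077, matching the prop.test p-value
```

The negative sign of z shows the direction of the difference: the sample proportion is lower for small nursing homes than for large ones.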