rm(list = ls()) # Clear all objects from your environment
gc() # Clear unused memory
## used (Mb) gc trigger (Mb) limit (Mb) max used (Mb)
## Ncells 526117 28.1 1168481 62.5 NA 669420 35.8
## Vcells 974036 7.5 8388608 64.0 16384 1851931 14.2
cat("\f") # Clear the console
graphics.off() # Clear all graphs
Using traditional methods, it takes 109 hours to receive a basic driving license. A new license training method using Computer Aided Instruction (CAI) has been proposed. A researcher used the technique with 190 students and observed that they had a mean of 110 hours. Assume the standard deviation is known to be 6. A level of significance of 0.05 will be used to determine if the technique performs differently than the traditional method. Make a decision to reject or fail to reject the null hypothesis. Show all work in R.
Null Hypothesis: it takes 109 hours to receive a basic driving license
Alternative Hypothesis: it does not take 109 hours to receive a basic driving license
Two tailed test: can be lower or higher than 109
The population standard deviation is known and the sample is greater than 30, so we can use a Z test
_______________________________
Given
Population mean: 109
Sample size: 190
Sample mean: 110
Standard deviation: 6
Level of significance: 0.05
p_mean <- 109 # Population mean
s_mean <- 110 # Sample mean, X bar
s_size <- 190 # Sample size
sd <- 6 # Standard deviation
alpha <- 0.05 # level of significance (Alpha)
# Calculate critical Z-values for a two-tailed test
critical_Z <- qnorm(c(alpha/2,
1 - alpha/2))
critical_Z
## [1] -1.959964 1.959964
# We can use normal distribution and get Z score
z_value <- (s_mean - p_mean) / (sd / sqrt(s_size))
print(z_value)
## [1] 2.297341
# Decision rule: reject the null if Z < lower critical value or Z > upper critical value
if (z_value < critical_Z[1] |
z_value > critical_Z[2]) {
print("Reject the null hypothesis")} else {
print("Fail to reject null hypothesis")}
## [1] "Reject the null hypothesis"
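The same decision can be reached with a p-value instead of critical values. The sketch below recomputes the Z statistic from the figures in the problem statement and assumes a two-tailed test; the names `z_check` and `p_value_z` are illustrative and not part of the original solution.

```r
# Two-tailed p-value for the Z test (inputs taken from the problem statement)
z_check <- (110 - 109) / (6 / sqrt(190))  # same statistic as z_value above
p_value_z <- 2 * pnorm(-abs(z_check))     # area in both tails beyond |z|
print(p_value_z)

# Reject when the p-value is below the significance level
if (p_value_z < 0.05) {
  print("Reject the null hypothesis")
} else {
  print("Fail to reject the null hypothesis")
}
```

Comparing the p-value with alpha and comparing Z with the critical values always lead to the same conclusion for a given test.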
Our environment is very sensitive to the amount of ozone in the upper atmosphere. The level of ozone normally found is 5.3 parts/million (ppm). A researcher believes that the current ozone level is at an insufficient level. The mean of 5 samples is 5.0 ppm with a standard deviation of 1.1. Does the data support the claim at the 0.05 level? Assume the population distribution is approximately normal.
Null Hypothesis: The level of ozone is 5.3 parts/million (ppm)
Alternative Hypothesis: The level of ozone is below 5.3 parts/million (ppm)
One tailed test: Has to be lower than 5.3 to reject null, testing left side
Sample less than 30 and we don't have the population SD, so we have to use Student's t
_______________________________
Given
Population mean: 5.3
Sample size: 5
Sample mean: 5.0
Sample Standard deviation: 1.1
Level of significance: 0.05
Degrees of Freedom: 4
p_mean2 <- 5.3 # Population mean
s_mean2 <- 5 # Sample mean, X bar
s_size2 <- 5 # Sample size
s_sd2 <- 1.1 # Standard deviation
alpha2 <- 0.05 # Level of significance (Alpha)
dof2 <- s_size2 - 1 # Degrees of Freedom
# Calculate critical t-value for a one-tailed test on the left of the curve
critical_t2 <- qt(alpha2, dof2)
print(critical_t2)
## [1] -2.131847
# Calculate t-Value
t_value2 <- (s_mean2 - p_mean2) / (s_sd2 / sqrt(s_size2))
print(t_value2)
## [1] -0.6098367
# Use the T value to get the P Value
p_value2 <- pt(t_value2, dof2)
print(p_value2)
## [1] 0.2874568
# Reject or accept null hypothesis based on the results
if (t_value2 < critical_t2) {
cat("Reject the null hypothesis")
} else {
cat("Fail to reject the null hypothesis")}
## Fail to reject the null hypothesis
if (p_value2 < alpha2) {
cat("Reject the null hypothesis")
} else {
cat(" Fail to reject the null hypothesis")}
## Fail to reject the null hypothesis
Our T value (-0.61) is not below the critical T value (-2.13), so it falls outside the rejection region
Our P Value (0.29) was greater than the Alpha (0.05)
Extension: Our environment is very sensitive to the amount of ozone in the upper atmosphere. The level of ozone normally found is 5.3 parts/million (ppm). A researcher believes that the current ozone level is not 5.3 parts/million (ppm). The mean of 5 samples is 5.0 parts per million (ppm) with a standard deviation of 1.1. Does the data support the claim at the 0.05 level? Assume the population distribution is approximately normal.
Null Hypothesis: The level of ozone is 5.3 parts/million (ppm)
Alternative Hypothesis: The level of ozone is not 5.3 parts/million (ppm)
Two tailed test as it can be lower or higher than 5.3
# Calculate critical t-value for a two-tailed test
critical_t_2 <- c(qt(alpha2 / 2, dof2),
qt(1-alpha2 / 2, dof2))
print(critical_t_2)
## [1] -2.776445 2.776445
# Calculate t-Value
t_value_2 <- (s_mean2 - p_mean2) / (s_sd2 / sqrt(s_size2))
print(t_value_2)
## [1] -0.6098367
# Use the T value to get the P Value
p_value_2 <- 2 * pt(-abs(t_value_2), dof2)
print(p_value_2)
## [1] 0.5749137
# Reject or accept null hypothesis based on the results
if (t_value_2 < critical_t_2[1] |
t_value_2 > critical_t_2[2]) {
print("Reject the null hypothesis")
} else {
print("Fail to reject the null hypothesis")}
## [1] "Fail to reject the null hypothesis"
if (p_value_2 < alpha2) {
cat("Reject the null hypothesis")
} else {
cat("Fail to reject the null hypothesis")}
## Fail to reject the null hypothesis
Our T value (-0.61) is between the two critical T values (-2.78 and 2.78)
Our P Value (0.57) was greater than the Alpha (0.05)
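Both ozone tests above follow the same recipe and differ only in which tail is used. A minimal sketch of a helper that works from summary statistics follows; `t_from_summary` and its arguments are hypothetical names introduced here for illustration, not part of the original solution.

```r
# One-sample t test from summary statistics (hypothetical helper)
t_from_summary <- function(x_bar, mu0, s, n,
                           alternative = c("two.sided", "less", "greater")) {
  alternative <- match.arg(alternative)
  df <- n - 1
  t_stat <- (x_bar - mu0) / (s / sqrt(n))
  # Pick the tail area that matches the alternative hypothesis
  p <- switch(alternative,
              less      = pt(t_stat, df),
              greater   = pt(t_stat, df, lower.tail = FALSE),
              two.sided = 2 * pt(-abs(t_stat), df))
  list(statistic = t_stat, df = df, p.value = p)
}

res_left <- t_from_summary(5.0, 5.3, 1.1, 5, alternative = "less")
res_two  <- t_from_summary(5.0, 5.3, 1.1, 5, alternative = "two.sided")
print(res_left$p.value)
print(res_two$p.value)
```

The two-sided p-value is exactly twice the left-tailed one here because the t statistic is negative.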
Our environment is very sensitive to the amount of ozone in the upper atmosphere. The level of ozone normally found is 7.3 parts/million (ppm). A researcher believes that the current ozone level is not at a normal level. The mean of 51 samples is 7.1 ppm with a variance of 0.49. Assume the population is normally distributed. A level of significance of 0.01 will be used. Show all work and hypothesis testing steps
Null Hypothesis: The level of ozone is 7.3 parts/million (ppm)
Alternative Hypothesis: The level of ozone is not 7.3 parts/million (ppm)
Two tailed test as it can be lower or higher than 7.3
Sample greater than 30 but we don't have the population SD, so we have to use Student's t
_______________________________
Given
Population mean: 7.3
Sample size: 51
Sample mean: 7.1
Sample Standard Deviation: Sqrt(0.49)
Level of significance: 0.01
Degrees of Freedom: 50
p_mean3 <- 7.3 # Population mean
s_mean3 <- 7.1 # Sample mean, X bar
s_size3 <- 51 # Sample size
s_sd3 <- sqrt(0.49) # Standard deviation
alpha3 <- 0.01 # Level of significance (Alpha)
dof3 <- s_size3 - 1 # Degrees of Freedom
# Calculate critical t-value for a two-tailed test
critical_t3 <- c(qt(alpha3 / 2, dof3),
qt(1-alpha3 / 2, dof3))
print(critical_t3)
## [1] -2.677793 2.677793
# Calculate t-Value
t_value3 <- (s_mean3 - p_mean3) / (s_sd3 / sqrt(s_size3))
print(t_value3)
## [1] -2.040408
# Use the T value to get the P Value
p_value3 <- 2 * pt(-abs(t_value3), dof3)
print(p_value3)
## [1] 0.04660827
# Reject or accept null hypothesis based on the results
if (t_value3 < critical_t3[1] |
t_value3 > critical_t3[2]) {
print("Reject the null hypothesis")
} else {
print("Fail to reject the null hypothesis")}
## [1] "Fail to reject the null hypothesis"
if (p_value3 < alpha3) {
cat("Reject the null hypothesis")
} else {
cat(" Fail to reject the null hypothesis")}
## Fail to reject the null hypothesis
Our T value (-2.04) is between the two critical T values (-2.68 and 2.68)
Our P Value (0.047) was greater than the Alpha (0.01)
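A two-tailed test at level 0.01 is equivalent to checking whether the hypothesized mean lies inside the 99% confidence interval. The sketch below builds that interval from the summary statistics in the problem; the variable names are illustrative assumptions.

```r
# 99% confidence interval for the mean, from the problem's summary statistics
x_bar <- 7.1
s <- sqrt(0.49)
n <- 51
conf <- 0.99

t_crit <- qt(1 - (1 - conf) / 2, df = n - 1)  # positive critical value
margin <- t_crit * s / sqrt(n)
ci_mean <- c(lower = x_bar - margin, upper = x_bar + margin)
print(ci_mean)

# If 7.3 is inside the interval, we fail to reject the null hypothesis
contains_null <- ci_mean[["lower"]] < 7.3 && 7.3 < ci_mean[["upper"]]
print(contains_null)
```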
A publisher reports that 36% of their readers own a laptop. A marketing executive wants to test the claim that the percentage is actually less than the reported percentage. A random sample of 100 found that 29% of the readers owned a laptop. Is there sufficient evidence at the 0.02 level to support the executive's claim? Show all work and hypothesis testing steps.
Null Hypothesis: 36% of readers own a laptop
Alternative Hypothesis: Less than 36% of readers own a laptop
One tailed test as they are testing whether less than 36% own a laptop
Sample is large (n*p and n*(1-p) are both well above 5), so we can use a Z test for a proportion; the standard error comes from the hypothesized proportion
_______________________________
Given
Population proportion: 0.36
Sample size: 100
Sample proportion: 0.29
Level of significance: 0.02
p_mean4 <- 0.36 # Population proportion (hypothesized)
s_mean4 <- 0.29 # Sample proportion, p hat
s_size4 <- 100 # Sample size
alpha4 <- 0.02 # Level of significance (Alpha)
# Calculate critical Z-value for a one-tailed test
critical_z4 <- qnorm(alpha4)
print(critical_z4)
## [1] -2.053749
# Standard Error
Stand_error4 <- sqrt((p_mean4 * (1 - p_mean4)) / s_size4)
# Calculate Z-Value, use standard error above
z_value4 <- (s_mean4 - p_mean4) / Stand_error4
print(z_value4)
## [1] -1.458333
# Reject or accept null hypothesis based on the results
if (z_value4 < critical_z4) {
cat("Reject the null hypothesis")
} else {
cat("Fail to reject the null hypothesis")}
## Fail to reject the null hypothesis
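As a cross-check, base R's `prop.test()` can run the same one-sample proportion test. This is a sketch that assumes the 29% corresponds to 29 successes out of 100 and turns off the continuity correction so the result matches the hand calculation; the object name `pt4` is illustrative.

```r
# One-sample proportion test, left-tailed, without continuity correction
pt4 <- prop.test(x = 29, n = 100, p = 0.36,
                 alternative = "less", correct = FALSE)

# prop.test reports a chi-square statistic, which equals Z squared here
z_from_prop <- -sqrt(unname(pt4$statistic))  # negative because 0.29 < 0.36
print(z_from_prop)
print(pt4$p.value)

if (pt4$p.value < 0.02) {
  cat("Reject the null hypothesis")
} else {
  cat("Fail to reject the null hypothesis")
}
```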
A hospital director is told that 31% of the treated patients are uninsured. The director wants to test the claim that the percentage of uninsured patients is less than the expected percentage. A sample of 380 patients found that 95 were uninsured. Make the decision to reject or fail to reject the null hypothesis at the 0.05 level. Show all work and hypothesis testing steps.
Null Hypothesis: 31% of the treated patients are uninsured
Alternative Hypothesis: Less than 31% of the treated patients are uninsured
One tailed test as they are testing whether less than 31% of the treated patients are uninsured
Sample is large (n*p and n*(1-p) are both well above 5), so we can use a Z test for a proportion; the sample proportion is calculated from the counts
_______________________________
Given
Population proportion: 0.31
Sample size: 380
Sample proportion: 95/380 = 0.25
Level of significance: 0.05
p_mean5 <- 0.31 # Population proportion (hypothesized)
s_mean5 <- 95/380 # Sample proportion, p hat
s_size5 <- 380 # Sample size
alpha5 <- 0.05 # Level of significance (Alpha)
# Calculate critical Z-value for a one-tailed test
critical_z5 <- qnorm(alpha5)
print(critical_z5)
## [1] -1.644854
# Standard Error
Stand_error5 <- sqrt((p_mean5 * (1 - p_mean5)) / s_size5)
# Calculate Z-Value, use standard error above
z_value5 <- (s_mean5 - p_mean5) / Stand_error5
print(z_value5)
## [1] -2.528935
# Reject or accept null hypothesis based on the results
if (z_value5 < critical_z5) {
cat("Reject the null hypothesis")
} else {
cat("Fail to reject the null hypothesis")}
## Reject the null hypothesis
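The Z test for a proportion relies on the normal approximation to the binomial, which is usually considered reasonable when the expected numbers of successes and failures are both at least 10 (some texts use 5). The sketch below checks that condition and computes the left-tailed p-value; the threshold and variable names are assumptions made for illustration.

```r
# Check the normal-approximation condition and compute the left-tailed p-value
n5 <- 380
p0 <- 0.31

expected_successes <- n5 * p0        # 117.8
expected_failures  <- n5 * (1 - p0)  # 262.2
approx_ok <- expected_successes >= 10 && expected_failures >= 10
print(approx_ok)

z5 <- (95 / 380 - p0) / sqrt(p0 * (1 - p0) / n5)
p_value5_check <- pnorm(z5)          # left tail only
print(p_value5_check)
```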
Null Hypothesis: The Standard deviation has remained the same at 24
Alternative Hypothesis: The Standard deviation has decreased below 24
One tailed test as they are testing if the standard deviation has decreased below 24
We are testing a standard deviation (variance), so we use the Chi Square test
_______________________________
Given
Population mean: 112
Population Standard Deviation: 24
Sample size: 22
Sample mean: 102
Sample Standard Deviation: 15.4387
Level of significance: 0.1
p_mean6 <- 112 # Population mean
p_sd6 <- 24 # Population Standard deviation
s_mean6 <- 102 # Sample mean, X bar
s_size6 <- 22 # Sample size
s_sd6 <- 15.4387 # Sample Standard deviation
alpha6 <- 0.1 # Level of significance (Alpha)
dof6 <- s_size6 - 1 # Degrees of Freedom
# Critical Value using Chi Square (lower tail, because the test is left-tailed)
critical_X6 <- qchisq(alpha6, dof6)
print(critical_X6)
## [1] 13.2396
# Calculate the test statistic X, convert SD to Variances
X6 <- (dof6*s_sd6^2)/p_sd6^2
print(X6)
## [1] 8.68997
# Check for significance: left-tailed test, so reject when X is below the critical value
if (X6 < critical_X6) {
cat("Reject the null hypothesis")
} else {
cat("Fail to reject the null hypothesis")}
## Reject the null hypothesis
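Because the alternative is that the standard deviation has decreased, the rejection region lies in the lower tail of the chi-square distribution. A minimal sketch of the equivalent p-value check follows, assuming the 0.1 significance level used in the code above; the variable names are illustrative.

```r
# Left-tailed chi-square test for a standard deviation, via the p-value
n6 <- 22
sigma0 <- 24          # hypothesized population standard deviation
s6 <- 15.4387         # sample standard deviation
alpha_chi <- 0.1      # assumed level, matching alpha6 in the code above

chi_stat <- (n6 - 1) * s6^2 / sigma0^2
p_value6 <- pchisq(chi_stat, df = n6 - 1)  # lower-tail area
print(chi_stat)
print(p_value6)

# A small statistic (small p-value) supports a decrease in spread
reject6 <- p_value6 < alpha_chi
print(reject6)
```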
A medical researcher wants to compare the pulse rates of smokers and non-smokers. He believes that the pulse rate for smokers and non-smokers is different and wants to test this claim at the 0.1 level of significance. The researcher checks 32 smokers and finds that they have a mean pulse rate of 87, and 31 non-smokers have a mean pulse rate of 84. The standard deviation of the pulse rates is found to be 9 for smokers and 10 for non-smokers. Let μ1 be the true mean pulse rate for smokers and μ2 be the true mean pulse rate for non-smokers. Show all work and hypothesis testing steps.
Null Hypothesis: Smokers and Non Smokers have the same average pulse
Alternative Hypothesis: Smokers and Non Smokers have a different average pulse
Two tailed test as they are testing whether the mean pulse rates are different (higher or lower)
We can use a 2 sample independent T test
_______________________________
Given
Smokers size: 32
Smokers mean: 87
Smoker Standard Deviation: 9
Non Smokers size: 31
Non Smoker mean: 84
Non Smoker Standard Deviation: 10
Level of significance: 0.1
sm_size1 <- 32 # Smoker sample size
ns_size2 <- 31 # Non Smoker sample size
sm_mean1 <- 87 # Smoker mean
ns_mean2 <- 84 # Non Smoker mean
sm_sd1 <- 9 # Smoker Standard Deviation
ns_sd2 <- 10 # Non Smoker Standard Deviation
alpha7 <- 0.1 # Alpha
dof7 <- sm_size1 + ns_size2 - 2 # Degrees of Freedom
# Calculate the pooled standard deviation (shown for reference; it is not used in the test below)
pooled_SD <- sqrt(((sm_size1 - 1) * sm_sd1^2 + (ns_size2 - 1) * ns_sd2^2) / (sm_size1 + ns_size2 - 2))
# Calculate the standard error from the separate (unpooled) sample variances
SE7 <- sqrt(sm_sd1^2/sm_size1 + ns_sd2^2/ns_size2)
# Calculate critical t-values for a two-tailed test
critical_t7 <- c(qt(alpha7 / 2, dof7),
qt(1-alpha7 / 2, dof7))
print(critical_t7)
## [1] -1.670219 1.670219
# Calculate the t-value
t_value7 <- (sm_mean1 - ns_mean2) / SE7
print(t_value7)
## [1] 1.25032
# Reject or accept null hypothesis based on the results
if (t_value7 < critical_t7[1] |
t_value7 > critical_t7[2]) {
print("Reject the null hypothesis")
} else {
print("Fail to reject the null hypothesis")}
## [1] "Fail to reject the null hypothesis"
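The standard error above uses the two sample variances separately, which is the form used by Welch's t test. A hedged sketch of the matching Welch-Satterthwaite degrees of freedom and p-value is shown below as an alternative to the pooled `n1 + n2 - 2` degrees of freedom; the variable names are illustrative.

```r
# Welch's two-sample t test from summary statistics
n1 <- 32; n2 <- 31
s1 <- 9;  s2 <- 10

v1 <- s1^2 / n1   # variance of the smoker sample mean
v2 <- s2^2 / n2   # variance of the non-smoker sample mean

# Welch-Satterthwaite approximation to the degrees of freedom
welch_df <- (v1 + v2)^2 / (v1^2 / (n1 - 1) + v2^2 / (n2 - 1))
t_welch <- (87 - 84) / sqrt(v1 + v2)
p_welch <- 2 * pt(-abs(t_welch), welch_df)  # two-tailed

print(welch_df)
print(p_welch)
```

With these sample sizes the Welch degrees of freedom come out close to 61, so the conclusion at the 0.1 level is unchanged.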
Given two independent random samples with the following results:
n1 = 11
x̄1 = 127
s1 = 33
n2 = 18
x̄2 = 157
s2 = 27
Use this data to find the 95% confidence interval for the true difference between the population means. Assume that the population variances are not equal and that the two populations are normally distributed.
Use this formula for Degrees of Freedom: df = min(n1 - 1, n2 - 1)
sizeA <- 11 # A sample size
sizeB <- 18 # B sample size
meanA <- 127 # A mean
meanB <- 157 # B mean
sdA <- 33 # A Standard Deviation
sdB <- 27 # B Standard Deviation
alpha8 <- 0.05 # Alpha
dof8 <- min(sizeA-1, sizeB-1) # Degrees of Freedom
# Calculate the standard error
SE8 <- sqrt((sdA^2 / sizeA) + (sdB^2 / sizeB))
print(SE8)
## [1] 11.81101
# Calculate the t-value for a 95% confidence interval (upper-tail quantile, so it is positive)
critical_t8 <- qt(1 - alpha8 / 2, dof8)
print(critical_t8)
## [1] 2.228139
# Calculate the margin of error
margin_error8 <- critical_t8 * SE8
print(margin_error8)
## [1] 26.31657
# Calculate the confidence interval
conf_lower8 <- (meanA - meanB) - margin_error8
conf_upper8 <- (meanA - meanB) + margin_error8
print(conf_lower8)
## [1] -56.31657
print(conf_upper8)
## [1] -3.683426
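The interval can be packaged in a small helper that always uses the positive critical value, so the lower bound is guaranteed to come out below the upper bound. `ci_diff_means` is a hypothetical function written for this illustration; it assumes unequal variances and the conservative `min(n1 - 1, n2 - 1)` degrees of freedom given in the problem.

```r
# Confidence interval for a difference of means from summary statistics
ci_diff_means <- function(m1, s1, n1, m2, s2, n2, conf = 0.95) {
  se <- sqrt(s1^2 / n1 + s2^2 / n2)       # unpooled standard error
  df <- min(n1 - 1, n2 - 1)               # conservative degrees of freedom
  t_crit <- qt(1 - (1 - conf) / 2, df)    # positive critical value
  diff <- m1 - m2
  c(lower = diff - t_crit * se, upper = diff + t_crit * se)
}

ci8 <- ci_diff_means(127, 33, 11, 157, 27, 18, conf = 0.95)
print(ci8)
```

Since the whole interval lies below zero, the data suggest the first population mean is smaller than the second.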
Two men, A and B, who usually commute to work together decide to conduct an experiment to see whether one route is faster than the other. The men feel that their driving habits are approximately the same, so each morning for two weeks one driver is assigned to route I and the other to route II. The times, recorded to the nearest minute, are shown in the following table.
- Using this data, find the 98% confidence interval for the true mean difference between the average travel time for route I and the average travel time for route II.
- Let d = (route I travel time) - (route II travel time).
- Assume that the populations of travel times are normally distributed for both routes. Show all work and hypothesis testing steps.
Null Hypothesis: both routes are the same speed
Alternative Hypothesis: both routes are not the same speed
route1 <- c(32, 27, 34, 24, 31, 25, 30, 23, 27, 35)
route2 <- c(28, 28, 33, 25, 26, 29, 33, 27, 25, 33)
difference9 <- route1 - route2
size_rA <- 10
size_rB <- 10
alpha9 <- 0.02 # 98% confidence level
VarA9 <- sd(route1)^2
VarB9 <- sd(route2)^2
numdf9 = (VarA9/10 + VarB9/10)^2
dendf9 = (VarA9/10)^2 / 9 + (VarB9/10)^2 / 9
dof9 <- numdf9 / dendf9 # Degrees of Freedom
# Calculate the mean and standard deviation of the differences
mean_diff <- mean(difference9)
sd_diff <- sd(difference9)
# Calculate the standard error of the mean difference
SE9 <- sqrt((var(route1)/10)+(var(route2)/10))
print(SE9)
## [1] 1.678955
# Calculate the t-value for a 98% confidence interval (upper-tail quantile, so it is positive)
critical_t9 <- qt(1 - alpha9 / 2, dof9)
print(critical_t9)
## [1] 2.568883
# Calculate the margin of error
margin_error9 <- critical_t9 * SE9
# Calculate the confidence interval for the mean difference
conf_lower9 <- mean_diff - margin_error9
conf_upper9 <- mean_diff + margin_error9
print(conf_lower9)
## [1] -4.213039
print(conf_upper9)
## [1] 4.413039
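Because the problem defines d as the per-day difference between the two routes, the measurements can also be treated as paired. The sketch below is an alternative to the two-sample calculation above, not a restatement of it: it uses base R's `t.test()` with `paired = TRUE`, which works from the standard deviation of the differences with n - 1 = 9 degrees of freedom.

```r
# Paired 98% confidence interval for the mean of d = route I - route II
route1 <- c(32, 27, 34, 24, 31, 25, 30, 23, 27, 35)
route2 <- c(28, 28, 33, 25, 26, 29, 33, 27, 25, 33)

paired9 <- t.test(route1, route2, paired = TRUE, conf.level = 0.98)
ci_paired <- paired9$conf.int
print(unname(paired9$estimate))  # mean of the differences
print(ci_paired)

# The same interval by hand, from the differences
d <- route1 - route2
margin_paired <- qt(0.99, length(d) - 1) * sd(d) / sqrt(length(d))
print(c(mean(d) - margin_paired, mean(d) + margin_paired))
```

The paired interval is narrower than the two-sample one because day-to-day variation common to both routes cancels out in the differences; either way the interval contains zero.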
Null Hypothesis: The percentage of registered voters among employed workers is equal to or less than the percentage among unemployed workers
Alternative Hypothesis: The percentage of registered voters among employed workers exceeds the percentage among unemployed workers
sizeE <- 391
voteE <- 195
sizeUE <- 510
voteUE <- 193
alpha10 <- 0.05
# Calculate sample proportions
p1_Emp <- voteE / sizeE
p2_unEmp <- voteUE / sizeUE
# Calculate pooled proportion
p_pooled <- (voteE + voteUE) / (sizeE + sizeUE)
# Calculate standard error
SE10 <- sqrt(p_pooled * (1 - p_pooled) * (1 / sizeE + 1 / sizeUE))
print(SE10)
## [1] 0.03328424
# Find the critical z-value for a one-tailed test, right tailed
critical_z10 <- qnorm(1-alpha10)
print(critical_z10)
## [1] 1.644854
# Calculate the test statistic (z-score)
z_value10 <- (p1_Emp - p2_unEmp) / SE10
print(z_value10)
## [1] 3.614018
# Make the decision
if (z_value10 > critical_z10) {
cat("Reject the null hypothesis")
} else {
cat("Fail to reject the null hypothesis")}
## Reject the null hypothesis
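The pooled two-proportion Z test can be cross-checked with base R's `prop.test()`. This sketch assumes the counts used in the code above (195 of 391 employed, 193 of 510 unemployed) and disables the continuity correction so the chi-square statistic equals Z squared; the object name `pt10` is illustrative.

```r
# Two-sample proportion test, right-tailed, without continuity correction
pt10 <- prop.test(x = c(195, 193), n = c(391, 510),
                  alternative = "greater", correct = FALSE)

z_from_prop10 <- sqrt(unname(pt10$statistic))  # positive because p1 > p2
print(z_from_prop10)
print(pt10$p.value)

if (pt10$p.value < 0.05) {
  cat("Reject the null hypothesis")
} else {
  cat("Fail to reject the null hypothesis")
}
```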