rm(list = ls())      # Clear all objects from the environment
gc()                 # Free unused memory
##          used (Mb) gc trigger (Mb) limit (Mb) max used (Mb)
## Ncells 526117 28.1    1168481 62.5         NA   669420 35.8
## Vcells 974036  7.5    8388608 64.0      16384  1851931 14.2
cat("\f")            # Clear the console
graphics.off()       # Close all graphics devices

Part 1:

Using traditional methods, it takes 109 hours to receive a basic driving license. A new license training method using Computer Aided Instruction (CAI) has been proposed. A researcher used the technique with 190 students and observed that they had a mean of 110 hours. Assume the standard deviation is known to be 6. A level of significance of 0.05 will be used to determine if the technique performs differently than the traditional method. Make a decision to reject or fail to reject the null hypothesis. Show all work in R.

Null Hypothesis: the mean time to receive a basic driving license is 109 hours

Alternative Hypothesis: the mean time to receive a basic driving license is not 109 hours

Two tailed test: the mean can be lower or higher than 109

The population standard deviation is known and the sample is greater than 30, so we can use a Z test

_______________________________

Given

Population mean: 109

Sample size: 190

Sample mean: 110

Standard deviation: 6

Level of significance: 0.05

p_mean <- 109 # Population mean
s_mean <- 110 # Sample mean, X bar
s_size <- 190 # Sample size
sd <- 6       # Standard deviation
alpha <- 0.05    # level of significance (Alpha)

# Calculate critical Z-values for a two-tailed test
critical_Z <- qnorm(c(alpha/2, 
                      1 - alpha/2))

critical_Z
## [1] -1.959964  1.959964
# We can use normal distribution and get Z score
z_value <- (s_mean - p_mean) / (sd / sqrt(s_size))

print(z_value)
## [1] 2.297341
# Reject the null if Z < lower critical value or Z > upper critical value; otherwise fail to reject
if (z_value < critical_Z[1] | 
    z_value > critical_Z[2]) {
  print("Reject the null hypothesis")} else {
  print("Fail to reject null hypothesis")}
## [1] "Reject the null hypothesis"

Here we can see we reject the Null Hypothesis at the 0.05 significance level:

  • Our Z value (2.30) is greater than the upper critical value (1.96), so it falls in the rejection region
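
The same decision can be reached with a p-value instead of critical values. The lines below are a minimal sketch that recomputes the Z statistic from the given figures; the names `z_stat`, `p_two_sided` and `reject` are introduced here for illustration and are not part of the original work.

```r
# Two-tailed p-value for the Part 1 Z test (illustrative names)
z_stat <- (110 - 109) / (6 / sqrt(190))   # same statistic as z_value above
p_two_sided <- 2 * pnorm(-abs(z_stat))    # area in both tails beyond |z|
reject <- p_two_sided < 0.05              # TRUE means reject the null
print(p_two_sided)
print(reject)
```

A p-value below alpha leads to the same conclusion as a Z value outside the critical range.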

Part 2: Section 1

Our environment is very sensitive to the amount of ozone in the upper atmosphere. The level of ozone normally found is 5.3 parts/million (ppm). A researcher believes that the current ozone level is at an insufficient level. The mean of 5 samples is 5.0 ppm with a standard deviation of 1.1. Does the data support the claim at the 0.05 level? Assume the population distribution is approximately normal.

Null Hypothesis: The level of ozone is 5.3 parts/million (ppm)

Alternative Hypothesis: The level of ozone is below 5.3 parts/million (ppm)

One tailed test: the mean has to be lower than 5.3 to reject the null, so we test the left side

Sample less than 30 and we don't have the population SD, so we have to use Student's t

_______________________________

Given

Population mean: 5.3

Sample size: 5

Sample mean: 5.0

Sample Standard deviation: 1.1

Level of significance: 0.05

Degrees of Freedom: 4

p_mean2 <- 5.3       # Population mean
s_mean2 <- 5         # Sample mean, X bar
s_size2 <- 5         # Sample size
s_sd2 <- 1.1         # Standard deviation
alpha2 <- 0.05       # Level of significance (Alpha)
dof2 <- s_size2 - 1  # Degrees of Freedom

# Calculate critical t-value for a one-tailed test on the left of the curve
critical_t2 <- qt(alpha2, dof2)

print(critical_t2)
## [1] -2.131847
# Calculate t-Value
t_value2 <- (s_mean2 - p_mean2) / (s_sd2 / sqrt(s_size2))
print(t_value2)
## [1] -0.6098367
# Use the T value to get the P Value
p_value2 <- pt(t_value2, dof2)

print(p_value2)
## [1] 0.2874568
# Reject or fail to reject the null hypothesis based on the critical value
if (t_value2 < critical_t2) {
  cat("Reject the null hypothesis")
} else {
  cat("Fail to reject the null hypothesis")}
## Fail to reject the null hypothesis
# Same decision using the p-value
if (p_value2 < alpha2) {
  cat("Reject the null hypothesis")
} else {
  cat("Fail to reject the null hypothesis")}
## Fail to reject the null hypothesis

Here we can see we fail to reject the Null Hypothesis at the 0.05 significance level:

  • Our T value (-0.61) is not below the critical T value (-2.13), so it is outside the left-tail rejection region

  • Our P Value (0.29) is greater than the Alpha (0.05)
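
The t-test steps repeat across Parts 2 and 3, so they can be wrapped in a small helper. This is only a sketch: `t_test_summary` is a hypothetical function name, and it assumes the inputs are summary statistics (sample mean, hypothesized mean, sample SD, sample size) rather than raw data.

```r
# Hypothetical helper: one-sample t test from summary statistics
t_test_summary <- function(x_bar, mu0, s, n,
                           alternative = c("two.sided", "less", "greater")) {
  alternative <- match.arg(alternative)
  df <- n - 1                                   # degrees of freedom
  t_stat <- (x_bar - mu0) / (s / sqrt(n))       # test statistic
  p_value <- switch(alternative,
    two.sided = 2 * pt(-abs(t_stat), df),       # both tails
    less      = pt(t_stat, df),                 # left tail
    greater   = pt(t_stat, df, lower.tail = FALSE))  # right tail
  list(t = t_stat, df = df, p_value = p_value)
}

# Left-tailed ozone test with the figures given above
res <- t_test_summary(5.0, 5.3, 1.1, 5, alternative = "less")
print(res$t)
print(res$p_value)
```

Calling the helper with `alternative = "two.sided"` gives the two-tailed version of the same test.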

Part 2: Section 2

Extension: Our environment is very sensitive to the amount of ozone in the upper atmosphere. The level of ozone normally found is 5.3 parts/million (ppm). A researcher believes that the current ozone level is not 5.3 parts/million (ppm). The mean of 5 samples is 5.0 parts per million (ppm) with a standard deviation of 1.1. Does the data support the claim at the 0.05 level? Assume the population distribution is approximately normal.

Null Hypothesis: The level of ozone is 5.3 parts/million (ppm)

Alternative Hypothesis: The level of ozone is not 5.3 parts/million (ppm)

Two tailed test as it can be lower or higher than 5.3

# Calculate critical t-value for a two-tailed test
critical_t_2 <- c(qt(alpha2 / 2, dof2),
                  qt(1-alpha2 / 2, dof2))

print(critical_t_2)
## [1] -2.776445  2.776445
# Calculate t-Value
t_value_2 <- (s_mean2 - p_mean2) / (s_sd2 / sqrt(s_size2))
print(t_value_2)
## [1] -0.6098367
# Use the T value to get the P Value
p_value_2 <- 2 * pt(-abs(t_value_2), dof2)

print(p_value_2)
## [1] 0.5749137
# Reject or fail to reject the null hypothesis based on the results
if (t_value_2 < critical_t_2[1] | 
    t_value_2 > critical_t_2[2]) {
  print("Reject the null hypothesis")
} else {
  print("Fail to reject the null hypothesis")}
## [1] "Fail to reject the null hypothesis"
if (p_value_2 < alpha2) {
    cat("Reject the null hypothesis")
  } else {
    cat("Fail to reject the null hypothesis")}
## Fail to reject the null hypothesis

Here we can see we fail to reject the Null Hypothesis at the 0.05 significance level:

  • Our T value (-0.61) is between the two critical T values (-2.78 and 2.78)

  • Our P Value (0.57) was greater than the Alpha (0.05)

Part 3:

Our environment is very sensitive to the amount of ozone in the upper atmosphere. The level of ozone normally found is 7.3 parts/million (ppm). A researcher believes that the current ozone level is not at a normal level. The mean of 51 samples is 7.1 ppm with a variance of 0.49. Assume the population is normally distributed. A level of significance of 0.01 will be used. Show all work and hypothesis testing steps

Null Hypothesis: The level of ozone is 7.3 parts/million (ppm)

Alternative Hypothesis: The level of ozone is not 7.3 parts/million (ppm)

Two tailed test as it can be lower or higher than 7.3

Sample greater than 30 but we don’t have population SD so we have to use student T

_______________________________

Given

Population mean: 7.3

Sample size: 51

Sample mean: 7.1

Sample Standard Deviation: Sqrt(0.49)

Level of significance: 0.01

Degrees of Freedom: 50

p_mean3 <- 7.3        # Population mean
s_mean3 <- 7.1        # Sample mean, X bar
s_size3 <- 51         # Sample size
s_sd3 <- sqrt(0.49)   # Standard deviation
alpha3 <- 0.01        # Level of significance (Alpha)
dof3 <- s_size3 - 1   # Degrees of Freedom

# Calculate critical t-value for a two-tailed test
critical_t3 <- c(qt(alpha3 / 2, dof3),
                  qt(1-alpha3 / 2, dof3))

print(critical_t3)
## [1] -2.677793  2.677793
# Calculate t-Value
t_value3 <- (s_mean3 - p_mean3) / (s_sd3 / sqrt(s_size3))
print(t_value3)
## [1] -2.040408
# Use the T value to get the P Value
p_value3 <- 2 * pt(-abs(t_value3), dof3)

print(p_value3)
## [1] 0.04660827
# Reject or fail to reject the null hypothesis based on the critical values
if (t_value3 < critical_t3[1] | 
    t_value3 > critical_t3[2]) {
  print("Reject the null hypothesis")
} else {
  print("Fail to reject the null hypothesis")}
## [1] "Fail to reject the null hypothesis"
# Same decision using the p-value
if (p_value3 < alpha3) {
  cat("Reject the null hypothesis")
} else {
  cat("Fail to reject the null hypothesis")}
## Fail to reject the null hypothesis

Here we can see we fail to reject the Null Hypothesis at the 0.01 significance level:

  • Our T value (-2.04) is between the two critical T values (-2.68 and 2.68)

  • Our P Value (0.047) is greater than the Alpha (0.01)

Part 4:

A publisher reports that 36% of their readers own a laptop. A marketing executive wants to test the claim that the percentage is actually less than the reported percentage. A random sample of 100 found that 29% of the readers owned a laptop. Is there sufficient evidence at the 0.02 level to support the executive’s claim? Show all work and hypothesis testing steps.

Null Hypothesis: 36% of readers own a laptop

Alternative Hypothesis: Less than 36% of readers own a laptop

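One tailed test as they are testing whether less than 36% of readers own a laptop, testing left side

The sample is large enough (n·p and n·(1 - p) are both above 10) to use a Z test for a proportion; no standard deviation is given, so the standard error comes from the hypothesized proportion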

_______________________________

Given

Population proportion: 0.36

Sample size: 100

Sample proportion: 0.29

Level of significance: 0.02

p_mean4 <- 0.36        # Population proportion (hypothesized)
s_mean4 <- 0.29        # Sample proportion, p hat
s_size4 <- 100         # Sample size
alpha4 <- 0.02         # Level of significance (Alpha)

# Calculate critical Z-value for a one-tailed test
critical_z4 <- qnorm(alpha4)

print(critical_z4)
## [1] -2.053749
# Standard Error
Stand_error4 <- sqrt((p_mean4 * (1 - p_mean4)) / s_size4)

# Calculate Z-Value, use standard error above
z_value4 <- (s_mean4 - p_mean4) / Stand_error4
print(z_value4)
## [1] -1.458333
# Reject or fail to reject the null hypothesis based on the results
if (z_value4 < critical_z4) {
  cat("Reject the null hypothesis")
} else {
  cat("Fail to reject the null hypothesis")}
## Fail to reject the null hypothesis

Here we can see we fail to reject the Null Hypothesis at the 0.02 significance level:

  • Our Z value (-1.46) is greater than the critical Z value (-2.05), so it is outside the left-tail rejection region
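
The one-proportion Z test relies on the normal approximation, which can be checked before computing the statistic, and the decision can be confirmed with a left-tail p-value. The sketch below uses the figures given above; the threshold of 10 is a common rule of thumb rather than something stated in the problem, and the variable names are illustrative.

```r
# Part 4 check: normal approximation and left-tail p-value (illustrative names)
p0 <- 0.36      # hypothesized proportion
p_hat <- 0.29   # sample proportion
n <- 100        # sample size
alpha <- 0.02   # level of significance

# Rule-of-thumb condition for the normal approximation (assumed threshold: 10)
approx_ok <- n * p0 >= 10 && n * (1 - p0) >= 10

z_stat <- (p_hat - p0) / sqrt(p0 * (1 - p0) / n)  # standard error uses p0
p_left <- pnorm(z_stat)                           # left-tail p-value

print(approx_ok)
print(p_left)
```

A left-tail p-value above alpha agrees with the critical-value decision.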

Part 5:

A hospital director is told that 31% of the treated patients are uninsured. The director wants to test the claim that the percentage of uninsured patients is less than the expected percentage. A sample of 380 patients found that 95 were uninsured. Make the decision to reject or fail to reject the null hypothesis at the 0.05 level. Show all work and hypothesis testing steps.

Null Hypothesis: 31% of the treated patients are uninsured

Alternative Hypothesis: Less than 31% of the treated patients are uninsured

One tailed test as they are testing whether less than 31% of the treated patients are uninsured, testing left side

The sample is large enough to use a Z test for a proportion; no standard deviation is given, and the sample proportion is calculated from the counts

_______________________________

Given

Population proportion: 0.31

Sample size: 380

Sample proportion: 95/380 = 0.25

Level of significance: 0.05

p_mean5 <- 0.31        # Population proportion (hypothesized)
s_mean5 <- 95/380      # Sample proportion, p hat
s_size5 <- 380         # Sample size
alpha5 <- 0.05         # Level of significance (Alpha)

# Calculate critical Z-value for a one-tailed test
critical_z5 <- qnorm(alpha5)

print(critical_z5)
## [1] -1.644854
# Standard Error
Stand_error5 <- sqrt((p_mean5 * (1 - p_mean5)) / s_size5)

# Calculate Z-Value, use standard error above
z_value5 <- (s_mean5 - p_mean5) / Stand_error5
print(z_value5)
## [1] -2.528935
# Reject or fail to reject the null hypothesis based on the results
if (z_value5 < critical_z5) {
  cat("Reject the null hypothesis")
} else {
  cat("Fail to reject the null hypothesis")}
## Reject the null hypothesis

Here we can see we reject the Null Hypothesis at the 0.05 significance level:

  • Our Z value (-2.53) is less than the critical Z value (-1.64)

Part 6:

Null Hypothesis: The Standard deviation has remained the same at 24

Alternative Hypothesis: The Standard deviation has decreased below 24

One tailed test as they are testing if the standard deviation has decreased below 24

We are testing a standard deviation (variance), so we use a chi-square test

_______________________________

Given

Population mean: 112

Population Standard Deviation: 24

Sample size: 22

Sample mean: 102

Sample Standard Deviation: 15.4387

Level of significance: 0.1

p_mean6 <- 112        # Population mean
p_sd6 <- 24           # Population Standard deviation
s_mean6 <- 102        # Sample mean, X bar
s_size6 <- 22         # Sample size
s_sd6 <- 15.4387      # Sample Standard deviation
alpha6 <- 0.1        # Level of significance (Alpha)
dof6 <- s_size6 - 1   # Degrees of Freedom

# Critical value using chi-square (left tail)
critical_X6 <- qchisq(alpha6, dof6)

print(critical_X6)
## [1] 13.2396
# Calculate the test statistic X, convert SD to Variances

X6 <- (dof6*s_sd6^2)/p_sd6^2

print(X6)
## [1] 8.68997
# Check for significance: left-tailed test, so reject when X is below the critical value
if (X6 < critical_X6) {
  cat("Reject the null hypothesis")
} else {
  cat("Fail to reject the null hypothesis")}
## Reject the null hypothesis

Here we can see we reject the Null Hypothesis at the 0.1 significance level:

  • Our X value (8.69) is less than the critical X value (13.24), so it falls in the left-tail rejection region
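
For a left-tailed chi-square test of a standard deviation, the null is rejected when the statistic falls below the lower-tail critical value, or equivalently when the lower-tail probability is below alpha. The following is a minimal sketch using the given figures; alpha = 0.1 is taken from the code above, and the variable names are illustrative.

```r
# Part 6 check: left-tailed chi-square test for a standard deviation
n <- 22           # sample size
s <- 15.4387      # sample standard deviation
sigma0 <- 24      # hypothesized population standard deviation
alpha <- 0.1      # level of significance (assumed from the code above)

chi_stat <- (n - 1) * s^2 / sigma0^2     # test statistic
crit <- qchisq(alpha, df = n - 1)        # lower-tail critical value
p_left <- pchisq(chi_stat, df = n - 1)   # lower-tail p-value

reject <- chi_stat < crit                # left tail: reject when below
print(p_left)
print(reject)
```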

Part 7:

A medical researcher wants to compare the pulse rates of smokers and non-smokers. He believes that the pulse rate for smokers and non-smokers is different and wants to test this claim at the 0.1 level of significance. The researcher checks 32 smokers and finds that they have a mean pulse rate of 87, and 31 non-smokers have a mean pulse rate of 84. The standard deviation of the pulse rates is found to be 9 for smokers and 10 for non-smokers. Let μ1 be the true mean pulse rate for smokers and μ2 be the true mean pulse rate for non-smokers. Show all work and hypothesis testing steps.

Null Hypothesis: Smokers and Non Smokers have the same average pulse

Alternative Hypothesis: Smokers and Non Smokers have a different average pulse

Two tailed test as they are testing whether the mean pulse rates differ in either direction

We can use a 2 sample independent T test

_______________________________

Given

Smokers size: 32

Smokers mean: 87

Smoker Standard Deviation: 9

Non Smokers size: 31

Non Smoker mean: 84

Non Smoker Standard Deviation: 10

Level of significance: 0.1

sm_size1 <- 32                    # Smoker sample size   
ns_size2 <- 31                    # Non Smoker sample size 
sm_mean1 <- 87                    # Smoker mean
ns_mean2 <- 84                    # Non Smoker mean
sm_sd1 <- 9                       # Smoker Standard Deviation
ns_sd2 <- 10                      # Non Smoker Standard Deviation
alpha7 <- 0.1                     # Alpha
dof7 <- sm_size1 + ns_size2 - 2   # Degrees of Freedom

# Calculate the pooled standard deviation
pooled_SD <- sqrt(((sm_size1 - 1) * sm_sd1^2 + (ns_size2 - 1) * ns_sd2^2) / (sm_size1 + ns_size2 - 2))

# Calculate the standard error (unpooled form; the pooled SD above is not used in it)
SE7 <- sqrt(sm_sd1^2/sm_size1 
            + ns_sd2^2/ns_size2)

# Calculate critical t-values for a two-tailed test
critical_t7 <- c(qt(alpha7 / 2, dof7),
                  qt(1-alpha7 / 2, dof7))

print(critical_t7)
## [1] -1.670219  1.670219
# Calculate the t-value
t_value7 <- (sm_mean1 - ns_mean2) / SE7

print(t_value7)
## [1] 1.25032
# Reject or fail to reject the null hypothesis based on the results
if (t_value7 < critical_t7[1] | 
    t_value7 > critical_t7[2]) {
  print("Reject the null hypothesis")
} else {
  print("Fail to reject the null hypothesis")}
## [1] "Fail to reject the null hypothesis"

Here we can see we fail to reject the Null Hypothesis at the 0.1 significance level:

  • Our T value (1.25) is between the two critical T values (-1.67 and 1.67)
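
The standard error above uses the unpooled form, which is usually paired with Welch-Satterthwaite degrees of freedom rather than n1 + n2 - 2. The sketch below recomputes the test under that assumption to show the decision does not depend on the choice; the variable names are illustrative.

```r
# Part 7 check: Welch-Satterthwaite degrees of freedom (illustrative names)
n1 <- 32; n2 <- 31      # sample sizes
s1 <- 9;  s2 <- 10      # sample standard deviations
v1 <- s1^2 / n1         # variance of the first sample mean
v2 <- s2^2 / n2         # variance of the second sample mean

df_welch <- (v1 + v2)^2 / (v1^2 / (n1 - 1) + v2^2 / (n2 - 1))
t_stat <- (87 - 84) / sqrt(v1 + v2)     # same statistic as t_value7
crit <- qt(1 - 0.1 / 2, df_welch)       # upper critical value at alpha = 0.1

print(df_welch)
print(abs(t_stat) > crit)               # TRUE would mean reject the null
```

The Welch degrees of freedom come out just under 61, so the critical value barely moves.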

Part 8:

Given two independent random samples with the following results:

Use this data to find the 95% confidence interval for the true difference between the
population means. Assume that the population variances are not equal and that the two
populations are normally distributed

Use this formula for Degrees of Freedom: df = min(n1 - 1, n2 - 1)

sizeA <- 11                     # A sample size   
sizeB <- 18                     # B sample size 
meanA <- 127                    # A mean
meanB <- 157                    # B mean
sdA <- 33                       # A Standard Deviation
sdB <- 27                       # B Standard Deviation
alpha8 <- 0.05                  # Alpha
dof8 <- min(sizeA-1, sizeB-1)   # Degrees of Freedom

# Calculate the standard error
SE8 <- sqrt((sdA^2 / sizeA) + (sdB^2 / sizeB))

print(SE8)
## [1] 11.81101
# Calculate the critical t-value for a 95% confidence interval (upper tail, so the margin is positive)
critical_t8 <- qt(1 - alpha8 / 2, dof8)

print(critical_t8)
## [1] 2.228139
# Calculate the margin of error
margin_error8 <- critical_t8 * SE8

print(margin_error8)
## [1] 26.31657
# Calculate the confidence interval
conf_lower8 <- (meanA - meanB) - margin_error8
conf_upper8 <- (meanA - meanB) + margin_error8

print(conf_lower8)
## [1] -56.31657
print(conf_upper8)
## [1] -3.683426
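
The interval calculation can be wrapped in a function so the lower bound is always the smaller value. This is a sketch only: `ci_diff_means` is a hypothetical name, and it uses the conservative df = min(n1 - 1, n2 - 1) rule required by this problem.

```r
# Hypothetical helper: CI for a difference in means with unequal variances
ci_diff_means <- function(m1, m2, s1, s2, n1, n2, conf = 0.95) {
  se <- sqrt(s1^2 / n1 + s2^2 / n2)       # unpooled standard error
  df <- min(n1 - 1, n2 - 1)               # conservative degrees of freedom
  t_crit <- qt(1 - (1 - conf) / 2, df)    # positive critical value
  (m1 - m2) + c(lower = -1, upper = 1) * t_crit * se
}

ci8 <- ci_diff_means(127, 157, 33, 27, 11, 18)
print(ci8)
```

Taking the upper-tail quantile keeps the margin of error positive, so subtracting it always gives the lower bound.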

Part 9:

Two men, A and B, who usually commute to work together decide to conduct an
experiment to see whether one route is faster than the other. The men feel that their
driving habits are approximately the same, so each morning for two weeks one driver is
assigned to route I and the other to route II. The times, recorded to the nearest minute, are
shown in the following table.


- Using this data, find the 98% confidence interval for the true mean difference between the average travel time for route I and the average travel time for route II.
- Let d = (route I travel time) - (route II travel time).
- Assume that the populations of travel times are normally distributed for both routes. Show all work and hypothesis testing steps.

Null Hypothesis: both routes have the same mean travel time

Alternative Hypothesis: the routes do not have the same mean travel time

route1 <- c(32, 27, 34, 24, 31, 25, 30, 23, 27, 35)
route2 <- c(28, 28, 33, 25, 26, 29, 33, 27, 25, 33)
difference9 <- route1 - route2

size_rA <- 10
size_rB <- 10
alpha9 <- 0.02                                # 98% confidence level
VarA9 <- sd(route1)^2
VarB9 <- sd(route2)^2

numdf9 <- (VarA9/size_rA + VarB9/size_rB)^2
dendf9 <- (VarA9/size_rA)^2 / (size_rA - 1) + (VarB9/size_rB)^2 / (size_rB - 1)
dof9 <- numdf9 / dendf9                       # Welch-Satterthwaite Degrees of Freedom

# Calculate the mean and standard deviation of the differences
mean_diff <- mean(difference9)
sd_diff <- sd(difference9)

# Calculate the standard error of the difference in means (independent-samples form)
SE9 <- sqrt((var(route1)/10)+(var(route2)/10))

print(SE9)
## [1] 1.678955
# Calculate the critical t-value for a 98% confidence interval (upper tail, so the margin is positive)
critical_t9 <- qt(1 - alpha9 / 2, dof9)

print(critical_t9)
## [1] 2.568883
# Calculate the margin of error
margin_error9 <- critical_t9 * SE9

# Calculate the confidence interval for the mean difference
conf_lower9 <- mean_diff - margin_error9
conf_upper9 <- mean_diff + margin_error9

print(conf_lower9)
## [1] -4.213039
print(conf_upper9)
## [1] 4.413039
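
Because d is defined as a per-morning difference, the data can also be treated as paired. The sketch below assumes each position in `route1` and `route2` refers to the same morning; under that assumption the interval is built from the standard deviation of the differences with n - 1 degrees of freedom, and the built-in `t.test` serves as a cross-check. The names `d`, `se_d` and `ci_paired` are illustrative.

```r
# Part 9 alternative: paired-difference confidence interval (assumes matched days)
route1 <- c(32, 27, 34, 24, 31, 25, 30, 23, 27, 35)
route2 <- c(28, 28, 33, 25, 26, 29, 33, 27, 25, 33)

d <- route1 - route2                       # per-morning differences
n <- length(d)
se_d <- sd(d) / sqrt(n)                    # standard error of the mean difference
t_crit <- qt(1 - 0.02 / 2, df = n - 1)     # 98% confidence, 9 degrees of freedom
ci_paired <- mean(d) + c(-1, 1) * t_crit * se_d

# Cross-check against the built-in paired t test
ci_builtin <- t.test(route1, route2, paired = TRUE, conf.level = 0.98)$conf.int

print(ci_paired)
print(as.numeric(ci_builtin))
```

The paired interval is narrower than the independent-samples one but still contains 0, so neither approach shows a difference between the routes at this confidence level.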

Part 10:

Null Hypothesis: The percentage of registered voters among employed workers is equal to or less than the percentage among unemployed workers

Alternative Hypothesis: The percentage of registered voters among employed workers exceeds the percentage among unemployed workers

sizeE <- 391
voteE <- 195
sizeUE <- 510
voteUE <- 193
alpha10 <- 0.05

# Calculate sample proportions
p1_Emp <- voteE / sizeE
p2_unEmp <- voteUE / sizeUE

# Calculate pooled proportion
p_pooled <- (voteE + voteUE) / (sizeE + sizeUE)

# Calculate standard error
SE10 <- sqrt(p_pooled * (1 - p_pooled) * (1 / sizeE + 1 / sizeUE))

print(SE10)
## [1] 0.03328424
# Find the critical z-value for a one-tailed test, right tailed
critical_z10 <- qnorm(1-alpha10)

print(critical_z10)
## [1] 1.644854
# Calculate the test statistic (z-score)
z_value10 <- (p1_Emp - p2_unEmp) / SE10

print(z_value10)
## [1] 3.614018
# Make the decision
if (z_value10 > critical_z10) {
  cat("Reject the null hypothesis")
} else {
  cat("Fail to reject the null hypothesis")}
## Reject the null hypothesis

Here we can reject the Null Hypothesis at the 0.05 significance level:

  • Our Z value (3.61) is greater than our critical Z value (1.64) in our right tailed test
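
The manual two-proportion calculation can be cross-checked with base R's `prop.test`. This sketch turns off the continuity correction so that the reported chi-square statistic equals the square of the pooled Z value; the names `res` and `z_from_chisq` are illustrative.

```r
# Part 10 cross-check with prop.test (no continuity correction)
res <- prop.test(x = c(195, 193), n = c(391, 510),
                 alternative = "greater", correct = FALSE)

z_from_chisq <- sqrt(unname(res$statistic))  # X-squared is z^2 for two groups
print(z_from_chisq)
print(res$p.value)                           # one-sided p-value
```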