Haiding luo
2023 11 21
1. Using traditional methods, it takes 109 hours to receive a basic driving license. A new license training method using Computer Aided Instruction (CAI) has been proposed. A researcher used the technique with 190 students and observed that they had a mean of 110 hours. Assume the standard deviation is known to be 6. A level of significance of 0.05 will be used to determine if the technique performs differently than the traditional method. Make a decision to reject or fail to reject the null hypothesis. Show all work in R.
mean_trad <- 109
mean <- 110
sdv <- 6
sample_size <- 190
alpha <- 0.05
z <- (mean - mean_trad) / (sdv / sqrt(sample_size))
p_value <- 2 * (1 - pnorm(abs(z)))
decision <- ifelse(p_value < alpha, "reject", "fail to reject")
decision
## [1] "reject"
z
## [1] 2.297341
p_value
## [1] 0.0215993
The p-value is about 0.022, which is less than 0.05, so we reject the null hypothesis H0: μ = 109 in favor of H1: μ ≠ 109.
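As a cross-check, the same decision can be reached with the critical-value approach. The sketch below wraps the calculation in a helper, `z_test_two_sided` (a hypothetical name that is not part of the original work), and compares |z| with the two-sided critical value from `qnorm`.

```r
# Hypothetical helper: two-sided one-sample z-test from summary statistics
z_test_two_sided <- function(xbar, mu0, sigma, n, alpha = 0.05) {
  z <- (xbar - mu0) / (sigma / sqrt(n))   # test statistic
  z_crit <- qnorm(1 - alpha / 2)          # two-sided critical value
  list(z = z,
       z_crit = z_crit,
       p_value = 2 * pnorm(-abs(z)),
       reject = abs(z) > z_crit)
}

res <- z_test_two_sided(xbar = 110, mu0 = 109, sigma = 6, n = 190)
res$z_crit   # roughly 1.96
res$reject   # TRUE when |z| exceeds the critical value
```

Comparing |z| with the critical value and comparing the p-value with alpha always lead to the same decision.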
2. Our environment is very sensitive to the amount of ozone in the upper atmosphere. The level of ozone normally found is 5.3 parts/million (ppm). A researcher believes that the current ozone level is at an insufficient level. The mean of 5 samples is 5.0 ppm with a standard deviation of 1.1. Does the data support the claim at the 0.05 level? Assume the population distribution is approximately normal.
mean_pop <- 5.3
mean <- 5.0
sdv <- 1.1
size <- 5
alpha <- 0.05
t <- (mean - mean_pop) / (sdv / sqrt(size))
t
## [1] -0.6098367
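# One-tailed (lower) t-test: sigma is estimated from n = 5 samples, so use pt() with df = n - 1
p_value <- pt(q = t, df = size - 1)
decision <- ifelse(p_value < alpha, "reject", "fail to reject")
decision
## [1] "fail to reject"
p_value
## [1] 0.2874568
The p-value is about 0.29, which is greater than 0.05, so we fail to reject the null hypothesis H0: μ = 5.3 against H1: μ < 5.3. The data do not support the claim that the ozone level is insufficient.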
Extension: Our environment is very sensitive to the amount of ozone in the upper atmosphere. The level of ozone normally found is 5.3 parts/million (ppm). A researcher believes that the current ozone level is not 5.3 parts/million (ppm). The mean of 5 samples is 5.0 parts per million (ppm) with a standard deviation of 1.1. Does the data support the claim at the 0.05 level? Assume the population distribution is approximately normal.
mean_pop <- 5.3
mean <- 5.0
sdv <- 1.1
size <- 5
alpha <- 0.05
t <- (mean - mean_pop) / (sdv / sqrt(size))
t
## [1] -0.6098367
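# Two-tailed t-test with df = n - 1, because the claim is "not equal to 5.3"
p_value <- 2 * pt(q = -abs(t), df = size - 1)
p_value
## [1] 0.5749137
decision <- ifelse(p_value < alpha, "reject", "fail to reject")
decision
## [1] "fail to reject"
The p-value is about 0.57, which is greater than 0.05, so we fail to reject the null hypothesis H0: μ = 5.3 against H1: μ ≠ 5.3.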
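With only 5 samples and a sample standard deviation, the usual reference distribution is Student's t with 4 degrees of freedom. The sketch below compares the test statistic with the t critical values for both versions of the ozone question; the variable names are illustrative.

```r
# Critical-value check for the ozone problem (n = 5, sample sd = 1.1)
t_stat <- (5.0 - 5.3) / (1.1 / sqrt(5))
df_t <- 5 - 1
alpha <- 0.05

crit_lower <- qt(alpha, df_t)          # one-tailed (lower) critical value
crit_two <- qt(1 - alpha / 2, df_t)    # two-tailed critical value

reject_lower <- t_stat < crit_lower    # claim: mean is below 5.3
reject_two <- abs(t_stat) > crit_two   # claim: mean differs from 5.3
c(reject_lower = reject_lower, reject_two = reject_two)
```

Neither comparison rejects the null hypothesis, since the statistic of about -0.61 is far from both critical values.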
3. Our environment is very sensitive to the amount of ozone in the upper atmosphere. The level of ozone normally found is 7.3 parts/million (ppm). A researcher believes that the current ozone level is not at a normal level. The mean of 51 samples is 7.1 ppm with a variance of 0.49. Assume the population is normally distributed. A level of significance of 0.01 will be used. Show all work and hypothesis testing steps.
mean <- 7.3
mean2 <- 7.1
variance <- 0.49
size <- 51
alpha <- 0.01
sdv <- sqrt(variance)
t_statistic <- (mean2 - mean) / (sdv / sqrt(size))
p_value <- 2 * (1 - pt(abs(t_statistic), df = size - 1))
decision <- ifelse(p_value < alpha, "reject", "fail to reject")
decision
## [1] "fail to reject"
t_statistic
## [1] -2.040408
p_value
## [1] 0.04660827
The p-value is about 0.047, which is greater than 0.01, so we fail to reject the null hypothesis H0: μ = 7.3 against H1: μ ≠ 7.3.
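A two-sided test at level 0.01 is equivalent to checking whether the 99% confidence interval covers the hypothesized mean. The following sketch builds that interval from the summary statistics in the problem; the variable names are illustrative.

```r
# 99% t confidence interval for the mean ozone level
xbar <- 7.1
mu0 <- 7.3
s <- sqrt(0.49)
n <- 51
alpha <- 0.01

margin <- qt(1 - alpha / 2, df = n - 1) * s / sqrt(n)
ci <- c(xbar - margin, xbar + margin)
ci

# TRUE means 7.3 lies inside the interval, which matches "fail to reject"
contains_mu0 <- ci[1] <= mu0 && mu0 <= ci[2]
contains_mu0
```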
4. A publisher reports that 36% of their readers own a laptop. A marketing executive wants to test the claim that the percentage is actually less than the reported percentage. A random sample of 100 found that 29% of the readers owned a laptop. Is there sufficient evidence at the 0.02 level to support the executive's claim? Show all work and hypothesis testing steps.
phat <- 0.29
p <- 0.36
n <- 100
alpha <- 0.02
z_score <- (phat - p) / sqrt(p * (1 - p) / n)
p_value <- pnorm(z_score)
p_value
## [1] 0.07237434
z_score
## [1] -1.458333
critical_value <- qnorm(alpha)
critical_value
## [1] -2.053749
decision <- ifelse(p_value < alpha, "reject", "fail to reject")
decision
## [1] "fail to reject"
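The p-value is about 0.072, which is greater than 0.02, so we fail to reject the null hypothesis H0: p = 0.36 against H1: p < 0.36. There is not sufficient evidence to support the executive's claim.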
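The z-test above relies on the normal approximation to the binomial. As a hedged cross-check, the exact binomial test in base R can be run on the same data, assuming that 29% of 100 readers means exactly 29 laptop owners.

```r
# Exact one-sided binomial test (assumes x = 29 successes out of n = 100)
exact <- binom.test(x = 29, n = 100, p = 0.36, alternative = "less")
exact$p.value           # exact lower-tail probability under H0
exact$p.value < 0.02    # FALSE means "fail to reject" at the 0.02 level
```

The exact p-value differs somewhat from the normal approximation, but the decision at the 0.02 level is the same.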
5.A hospital director is told that 31% of the treated patients are uninsured. The director wants to test the claim that the percentage of uninsured patients is less than the expected percentage. A sample of 380 patients found that 95 were uninsured. Make the decision to reject or fail to reject the null hypothesis at the 0.05 level. Show all work and hypothesis testing steps.
n <- 380
x <- 95
phat <- x / n
p <- 0.31
alpha <- 0.05
z_score <- (phat - p) / sqrt(p * (1 - p) / n)
p_value <- pnorm(z_score)
critical_value <- qnorm(alpha)
decision <- ifelse(p_value < alpha, "reject", "fail to reject")
z_score
## [1] -2.528935
p_value
## [1] 0.005720462
critical_value
## [1] -1.644854
decision
## [1] "reject"
The p-value is about 0.0057, which is less than 0.05, so we reject the null hypothesis H0: p = 0.31 in favor of H1: p < 0.31.
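The hand calculation can be checked against `prop.test` from base R. With the continuity correction turned off, its chi-square statistic is the square of the z-score computed above, so the one-sided p-value should agree.

```r
# One-sample proportion test without continuity correction
res <- prop.test(x = 95, n = 380, p = 0.31,
                 alternative = "less", correct = FALSE)

# The sample proportion is below 0.31, so the z-score is the negative root
z_from_chisq <- -sqrt(unname(res$statistic))
z_from_chisq
res$p.value
```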
6. A standardized test is given to a sixth-grade class. Historically the mean score has been 112 with a standard deviation of 24. The superintendent believes that the standard deviation of performance may have recently decreased. She randomly sampled 22 students and found a mean of 102 with a standard deviation of 15.4387. Is there evidence that the standard deviation has decreased at the α = 0.1 level? Show all work and hypothesis testing steps.
n <- 22
s <- 15.4387
sigma <- 24
alpha = 0.1
chi <- (n - 1) * s^2 / sigma^2
p_value <- pchisq(chi, df = n - 1, lower.tail = TRUE)
chi
## [1] 8.68997
p_value
## [1] 0.008549436
decision <- ifelse(p_value < alpha, "reject", "fail to reject")
decision
## [1] "reject"
The p-value is about 0.0085, which is less than 0.1, so we reject the null hypothesis H0: σ = 24 in favor of H1: σ < 24.
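The same conclusion follows from the critical-value approach for a lower-tailed chi-square test: reject when the statistic falls below the alpha quantile of the chi-square distribution with n - 1 degrees of freedom. A minimal sketch, with illustrative variable names:

```r
# Lower-tailed chi-square test for a standard deviation
n <- 22
s <- 15.4387
sigma0 <- 24
alpha <- 0.1

chi_stat <- (n - 1) * s^2 / sigma0^2
chi_crit <- qchisq(alpha, df = n - 1)   # lower-tail critical value

reject <- chi_stat < chi_crit
c(chi_stat = chi_stat, chi_crit = chi_crit)
reject
```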
7. A medical researcher wants to compare the pulse rates of smokers and non-smokers. He believes that the pulse rate for smokers and non-smokers is different and wants to test this claim at the 0.1 level of significance. The researcher checks 32 smokers and finds that they have a mean pulse rate of 87, and 31 non-smokers have a mean pulse rate of 84. The standard deviation of the pulse rates is found to be 9 for smokers and 10 for non-smokers. Let μ1 be the true mean pulse rate for smokers and μ2 be the true mean pulse rate for non-smokers. Show all work and hypothesis testing steps.
mu1 <- 87
mu2 <- 84
sdv1 <- 9
sdv2 <- 10
n1 <- 32
n2 <- 31
alpha = 0.1
var1 = sdv1^2
var2 = sdv2^2
t_statistic <- (mu1 - mu2) / sqrt((sdv1^2 / n1) + (sdv2^2 / n2))
df <- ((var1 / n1 + var2 / n2)^2) /
((var1 / n1)^2 / (n1 - 1) + (sdv2^2 / n2)^2 / (n2 - 1))
p_value <- 2 * pt(-abs(t_statistic), df)
t_statistic
## [1] 1.25032
p_value
## [1] 0.2160473
decision <- ifelse(p_value < alpha, "reject", "fail to reject")
decision
## [1] "fail to reject"
The p-value is about 0.216, which is greater than 0.1, so we fail to reject the null hypothesis H0: μ1 = μ2 against H1: μ1 ≠ μ2.
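Since the Welch calculation is needed whenever only summary statistics are available, it can be convenient to wrap it in a small function. The helper below, `welch_from_summary`, is a hypothetical name introduced for illustration; it reproduces the statistic, the Welch-Satterthwaite degrees of freedom, and the two-sided p-value.

```r
# Hypothetical helper: Welch two-sample t-test from summary statistics
welch_from_summary <- function(m1, m2, s1, s2, n1, n2) {
  v1 <- s1^2 / n1
  v2 <- s2^2 / n2
  se <- sqrt(v1 + v2)
  # Welch-Satterthwaite approximation to the degrees of freedom
  df <- (v1 + v2)^2 / (v1^2 / (n1 - 1) + v2^2 / (n2 - 1))
  t_stat <- (m1 - m2) / se
  list(t = t_stat, df = df, se = se,
       p_value = 2 * pt(-abs(t_stat), df))
}

res <- welch_from_summary(m1 = 87, m2 = 84, s1 = 9, s2 = 10, n1 = 32, n2 = 31)
res$t
res$p_value
```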
8. Given two independent random samples with the following results:
n1 = 11, x̄1 = 127, s1 = 33
n2 = 18, x̄2 = 157, s2 = 27
Use this data to find the 95% confidence interval for the true difference between the population means. Assume that the population variances are not equal and that the two populations are normally distributed.
alpha = 0.05
xbar1 = 127
xbar2 = 157
n1 = 11
n2 = 18
df1 = n1-1
df2 = n2-1
s1 = 33
s2 = 27
Se <- sqrt(s1^2 / n1 + s2^2 / n2)
df <- (s1^2 / n1 + s2^2 / n2)^2 / ((s1^2 / n1)^2 / df1 + (s2^2 / n2)^2 / df2)
df
## [1] 18.0759
delta <- xbar1 - xbar2
t_critical <- qt(p = 0.975, df = df)
interval <- c(delta - t_critical * Se, delta + t_critical * Se)
interval
## [1] -54.806548 -5.193452
t_critical
## [1] 2.10029
test_statistic <- delta / Se
# Two-sided p-value; -abs() keeps the formula valid whatever the sign of the statistic
p_value = 2 * pt(q = -abs(test_statistic), df = df)
p_value
## [1] 0.02048034
decision <- ifelse(p_value < alpha, "reject", "fail to reject")
decision
## [1] "reject"
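The 95% confidence interval for μ1 - μ2 is about (-54.81, -5.19). The interval does not contain 0, which agrees with the test: the p-value is about 0.0205, which is less than 0.05, so we reject the null hypothesis H0: μ1 = μ2.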
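For reference, the two formulas the code implements can be written out as follows. The notation ν for the Welch-Satterthwaite degrees of freedom is an assumption introduced here; the code calls it `df`.

```latex
\[
\nu = \frac{\left(\dfrac{s_1^2}{n_1} + \dfrac{s_2^2}{n_2}\right)^2}
           {\dfrac{(s_1^2/n_1)^2}{n_1 - 1} + \dfrac{(s_2^2/n_2)^2}{n_2 - 1}},
\qquad
(\bar{x}_1 - \bar{x}_2) \pm t_{0.975,\,\nu}\,\sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}}
\]
```

With the values from the problem, the standard error is sqrt(1089/11 + 729/18) = sqrt(139.5), about 11.81, so the interval is -30 ± 2.100 × 11.81.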
9. Two men, A and B, who usually commute to work together decide to conduct an experiment to see whether one route is faster than the other. The men feel that their driving habits are approximately the same, so each morning for two weeks one driver is assigned to route I and the other to route II. The times, recorded to the nearest minute, are shown in the following table
route1 <- c(32, 27, 34, 24, 31, 25, 30, 23, 27, 35)
route2 <- c(28, 28, 33, 25, 26, 29, 33, 27, 25, 33)
s1 = sd(route1)
s2 = sd(route2)
n1 = 10
n2 = 10
df1 = n1-1
df2 = n2-1
alpha <- 0.02
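d <- route1 - route2
d_bar <- mean(d)
d_bar
## [1] 0.1
# Unpaired standard error, used for the Welch interval further down
Se = sqrt( s1^2/n1 + s2^2 /n2 )
Se
## [1] 1.678955
# Paired standard error: both routes are timed on the same mornings,
# so df = n - 1 goes with the standard deviation of the differences
se_d <- sd(d) / sqrt(length(d))
df <- length(d) - 1
critical_value <- qt(1 - alpha / 2, df)
critical_value
## [1] 2.821438
interval = c( d_bar - critical_value * se_d , d_bar + critical_value * se_d )
interval
## [1] -2.766534 2.966534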
# Welch interval, treating the two routes as independent samples with unequal variances
df <- (s1^2 / n1 + s2^2 / n2)^2 / ((s1^2 / n1)^2 / df1 + (s2^2 / n2)^2 / df2)
t_critical <- qt(p = (1-alpha/2), df = df)
t_critical
## [1] 2.568883
interval = c( d_bar - t_critical* Se ,
d_bar + t_critical* Se )
interval
## [1] -4.213039 4.413039
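The built-in `t.test` offers a quick check of a paired analysis, which fits a design where both routes are timed on the same mornings. This is a sketch under the assumption that a 98% confidence level is intended, matching `alpha <- 0.02` in the code; the problem statement as reproduced here does not say which interval is requested.

```r
# Paired t interval for the mean difference in commute time (route I - route II)
route1 <- c(32, 27, 34, 24, 31, 25, 30, 23, 27, 35)
route2 <- c(28, 28, 33, 25, 26, 29, 33, 27, 25, 33)

paired <- t.test(route1, route2, paired = TRUE, conf.level = 0.98)
paired$estimate   # mean of the daily differences
paired$conf.int   # 98% confidence interval for the mean difference
paired$p.value
```

Whichever analysis is used, the interval contains 0, so the data give no evidence that one route is faster.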
10. The U.S. Census Bureau conducts annual surveys to obtain information on the percentage of the voting-age population that is registered to vote. Suppose that 391 employed persons and 510 unemployed persons are independently and randomly selected, and that 195 of the employed persons and 193 of the unemployed persons have registered to vote. Can we conclude that the percentage of employed workers (p1), who have registered to vote, exceeds the percentage of unemployed workers (p2), who have registered to vote? Use a significance level of α = 0.05 for the test. Show all work and hypothesis testing steps.
n1 <- 391
n2 <- 510
x1 <- 195
x2 <- 193
alpha=0.05
p1 <- x1 / n1
p2 <- x2 / n2
p <- (x1 + x2) / (n1 + n2)
se <- sqrt(p1 * (1-p1) / n1 + p2 * (1-p2) / n2)
z <- (p1 - p2) / se
p_value <- pnorm(z, lower.tail = FALSE)
p_value
## [1] 0.0001439855
decision <- ifelse(p_value < alpha, "reject", "fail to reject")
decision
## [1] "reject"
The p-value is about 0.00014, which is less than 0.05, so we reject the null hypothesis H0: p1 = p2 in favor of H1: p1 > p2.
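The code above computes the pooled proportion `p` but uses the unpooled standard error. Many textbooks use the pooled standard error when testing H0: p1 = p2, so a sketch of that variant is given here for comparison; the `_pool` names are introduced for illustration.

```r
# Two-proportion z-test with the pooled standard error
n1 <- 391
n2 <- 510
x1 <- 195
x2 <- 193
alpha <- 0.05

p1 <- x1 / n1
p2 <- x2 / n2
p_pool <- (x1 + x2) / (n1 + n2)   # common proportion under H0
se_pool <- sqrt(p_pool * (1 - p_pool) * (1 / n1 + 1 / n2))

z_pool <- (p1 - p2) / se_pool
p_value_pool <- pnorm(z_pool, lower.tail = FALSE)   # upper-tailed test
z_pool
p_value_pool < alpha
```

The pooled statistic is slightly smaller than the unpooled one, but the decision to reject is unchanged.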