Setting up Functions
# Helper: print the decision for a p-value at significance level alpha
myp <- function(p, alpha) {
  if (p < alpha) {
    print('REJECT Ho')
  } else {
    print('FAIL 2 REJECT')
  }
}
1) Using traditional methods, it takes 109 hours to receive a basic
driving license. A new license training method using Computer Aided
Instruction (CAI) has been proposed. A researcher used the technique
with 190 students and observed that they had a mean of 110 hours. Assume
the standard deviation is known to be 6. A level of significance of 0.05
will be used to determine if the technique performs differently than the
traditional method. Make a decision to reject or fail to reject the null
hypothesis. Show all work in R.
Method 1 Test statistic vs critical value. If test statistic is
larger than critical value in absolute terms, reject the null
# Choose level
alp <- 0.05
# Take sample
#Compute test statistic
z <- (110-109)/(6/sqrt(190))
test_sta <- z
test_sta
## [1] 2.297341
critical_value <- qnorm(p = 0.975,
mean = 0,
sd = 1
)
critical_value
## [1] 1.959964
Since the test statistic is more extreme than the critical value, we
will reject the null.
Method 2 Compare the P value with alpha. If the p value is smaller
than alpha, reject the null
p_value <- 2 * (1-pnorm(q = test_sta, mean = 0,
sd = 1 )
)
p_value
## [1] 0.0215993
myp(p = p_value, alpha = alp)
## [1] "REJECT Ho"
Since the p-value is less than alpha, we reject the null
hypothesis.
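Method 3 Check whether the confidence interval constructed from the
sample point estimate contains the hypothesized value. If not, reject
the null.
xbar <- 110
Se <- 6/sqrt(190)
# The critical value comes from the quantile function qnorm(), not pnorm()
z <- qnorm(p = 0.975)
interval = c(xbar - z * Se, xbar + z * Se)
interval
## [1] 109.1469 110.8531
Since the hypothesized mean of 109 lies outside the interval, we reject
the null.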
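The three methods for problem 1 can be wrapped in one small helper, which makes it easy to check that they agree. This is a minimal sketch: the function name `z_test_summary` and its arguments are hypothetical (they are not part of the assignment), and it assumes a known population standard deviation and a two-sided alternative.

```r
# Hypothetical helper: two-sided one-sample z-test from summary statistics.
# Assumes the population standard deviation (sigma) is known.
z_test_summary <- function(xbar, mu0, sigma, n, alpha = 0.05) {
  se   <- sigma / sqrt(n)                        # standard error of the mean
  z    <- (xbar - mu0) / se                      # Method 1: test statistic
  crit <- qnorm(1 - alpha / 2)                   # two-sided critical value
  p    <- 2 * pnorm(-abs(z))                     # Method 2: two-sided p-value
  ci   <- c(xbar - crit * se, xbar + crit * se)  # Method 3: confidence interval
  list(statistic = z, critical = crit, p_value = p, conf_int = ci,
       reject = abs(z) > crit)
}

res1 <- z_test_summary(xbar = 110, mu0 = 109, sigma = 6, n = 190)
res1$reject                                        # decision from Method 1
res1$p_value < 0.05                                # decision from Method 2
109 < res1$conf_int[1] || 109 > res1$conf_int[2]   # decision from Method 3
```

For a two-sided test the three decisions always agree, because they are algebraically equivalent ways of comparing the same standardized distance with the same cutoff.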
2) Our environment is very sensitive to the amount of ozone in the
upper atmosphere. The level of ozone normally found is 5.3 parts/million
(ppm). A researcher believes that the current ozone level is at an
insufficient level. The mean of 5 samples is 5.0 ppm with a standard
deviation of 1.1. Does the data support the claim at the 0.05 level?
Assume the population distribution is approximately normal.
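Ho: the mean ozone level is 5.3 ppm Ha: the mean ozone level is below
5.3 ppm
The population standard deviation is unknown and n = 5, so we use a
lower-tailed t-test with n - 1 = 4 degrees of freedom.
#take sample
xbar2 <- 5
Se2 <- 1.1/sqrt(5)
#compute test statistic
t2 <- (xbar2 - 5.3)/Se2
t2
## [1] -0.6098367
#critical value for a lower-tailed test at alpha = 0.05
critical2 <- qt(p = 0.05, df = 5 - 1)
critical2
## [1] -2.131847
Since the test statistic is not below the critical value, we cannot
reject the null. The data do not support the claim.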
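A p-value gives the same decision for the small-sample ozone test in problem 2. The sketch below assumes a lower-tailed one-sample t-test with n - 1 degrees of freedom; the helper name `t_test_lower` is hypothetical and only serves as an illustration.

```r
# Hypothetical helper: lower-tailed one-sample t-test from summary statistics.
# Assumes an approximately normal population and an unknown population sd.
t_test_lower <- function(xbar, mu0, s, n) {
  se <- s / sqrt(n)          # estimated standard error of the mean
  t  <- (xbar - mu0) / se    # test statistic
  p  <- pt(t, df = n - 1)    # lower-tail area under the t distribution
  list(statistic = t, df = n - 1, p_value = p)
}

res2 <- t_test_lower(xbar = 5.0, mu0 = 5.3, s = 1.1, n = 5)
res2$statistic
res2$p_value
res2$p_value < 0.05   # FALSE means we fail to reject the null
```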
3) Our environment is very sensitive to the amount of ozone in the
upper atmosphere. The level of ozone normally found is 7.3 parts/million
(ppm). A researcher believes that the current ozone level is not at a
normal level. The mean of 51 samples is 7.1 ppm with a variance of 0.49.
Assume the population is normally distributed. A level of significance
of 0.01 will be used. Show all work and hypothesis testing steps.
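Ho: the mean ozone level is 7.3 ppm Ha: the mean ozone level is not
7.3 ppm
xbar3 <- 7.1
Se3 <- sqrt(0.49)/sqrt(51)
# Two-sided critical value at alpha = 0.01. The variance is a sample
# estimate, so use the t quantile with n - 1 = 50 degrees of freedom
t3 <- qt(p = 0.995, df = 51 - 1)
interval3 = c(xbar3 - t3 * Se3, xbar3 + t3 * Se3)
interval3
## [1] 6.837524 7.362476
Since 7.3 is inside the 99% confidence interval, we cannot reject the
null.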
4) A publisher reports that 36% of their readers own a laptop. A
marketing executive wants to test the claim that the percentage is
actually less than the reported percentage. A random sample of 100 found
that 29% of the readers owned a laptop. Is there sufficient evidence at
the 0.02 level to support the executive's claim? Show all work and
hypothesis testing steps.
Ho: the % of readers who own a laptop is 36% Ha: the % of readers who
own a laptop is actually less than 36%
One-proportion z-test (lower-tailed)
n <- 100
actual <- 0.36
test <- 0.29
#Standard error under the null
Se4 <- sqrt((actual*(1-actual))/n)
Se4
## [1] 0.048
#calculate test statistic
z4 <- (test - actual)/Se4
z4
## [1] -1.458333
#critical value for a lower-tailed test with alpha
alpha4 <- 0.02
critical4 <- qnorm(alpha4)
critical4
## [1] -2.053749
Since the test statistic -1.46 is not below the critical value -2.05,
we cannot reject the null. There is not sufficient evidence at the 0.02
level to support the executive's claim.
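The decision for problem 4 can be cross-checked with a p-value. The sketch below computes the lower-tail p-value by hand from the normal approximation and compares it with base R's `prop.test()`; the variable names are illustrative, and the count of 29 laptop owners is inferred from 29% of 100 readers.

```r
# Problem 4 by hand: lower-tailed p-value from the normal approximation
n4   <- 100
p0   <- 0.36                        # proportion claimed by the publisher
phat <- 0.29                        # sample proportion
z    <- (phat - p0) / sqrt(p0 * (1 - p0) / n4)
p_manual <- pnorm(z)                # area to the left of z

# Same test with base R's prop.test(); correct = FALSE turns off the
# continuity correction so that it matches the hand calculation
fit <- prop.test(x = 29, n = 100, p = 0.36,
                 alternative = "less", correct = FALSE)
p_builtin <- fit$p.value

c(manual = p_manual, builtin = p_builtin)
p_manual < 0.02   # FALSE means we fail to reject the null
```

Leaving the continuity correction on (the default) would give a slightly larger p-value, so the two numbers would no longer match exactly.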
5)A hospital director is told that 31% of the treated patients are
uninsured. The director wants to test the claim that the percentage of
uninsured patients is less than the expected percentage. A sample of 380
patients found that 95 were uninsured. Make the decision to reject or
fail to reject the null hypothesis at the 0.05 level. Show all work and
hypothesis testing steps.
Ho: % of uninsured patients is 31% Ha: % of uninsured patients is
actually less than 31%
actual5 <- 0.31
n5 <- 380
# We need a proportion, not a count: 95 out of 380 is 0.25
test5 <- 95/n5
Se5 <- sqrt((actual5*(1-actual5))/n5)
Se5
## [1] 0.0237254
#calculate test statistic
z5 <- (test5 - actual5)/Se5
z5
## [1] -2.528935
#critical value for a lower-tailed test with alpha
alpha5 <- 0.05
critical5 <- qnorm(alpha5)
critical5
## [1] -1.644854
Since the test statistic -2.53 is below the critical value -1.64, we
reject the null. There is sufficient evidence at the 0.05 level that
fewer than 31% of patients are uninsured.
6)
We cannot reject the null since 15.4387 is within the interval.
8) A medical researcher wants to compare the pulse rates of smokers
and non-smokers. He believes that the pulse rate for smokers and
non-smokers is different and wants to test this claim at the 0.1 level
of significance. The researcher checks 32 smokers and finds that they
have a mean pulse rate of 87, and 31 non-smokers have a mean pulse rate
of 84. The standard deviation of the pulse rates is found to be 9 for
smokers and 10 for non-smokers. Let μ1 be the true mean pulse rate for
smokers and μ2 be the true mean pulse rate for non-smokers. Show all
work and hypothesis testing steps.
Ho: mean pulse rates are the same for smokers and non-smokers Ha:
mean pulse rates are different for smokers and non-smokers
nsmoker <- 32
meansmoker <- 87
nnosmoker <- 31
meannosmoker <- 84
sdsmoker <- 9
sdnosmoker <- 10
df8nosmoker <- nnosmoker -1
df8smoker <- nsmoker -1
varsmoker <- sdsmoker^2
varnosmoker <- sdnosmoker^2
alpha8 <- 0.1
Calculate the difference between the means
meandiff <- meansmoker - meannosmoker
meandiff
## [1] 3
Calculate standard error using sample standard deviations
Se8 <- sqrt( varsmoker/nsmoker + varnosmoker/nnosmoker)
Se8
## [1] 2.399387
## Calculate T test statistic
t8 <- (meandiff - 0 )/Se8
test_stat8 <- t8
test_stat8
## [1] 1.25032
Calculate df for both samples
numdf <- (varsmoker/nsmoker + varnosmoker/nnosmoker)^2
dendf <- (varsmoker/nsmoker)^2 / df8smoker + (varnosmoker/nnosmoker)^2 / df8nosmoker
df8 <- numdf / dendf
df8
## [1] 59.87528
Calculate critical value
critical8 <- qt(p = (1-alpha8/2), df = numdf / dendf )
critical8
## [1] 1.670703
Since the test statistic 1.25 is less extreme than the critical value
1.67, we fail to reject the null.
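The Welch calculation for problem 8 can also be packaged as a function that returns a two-sided p-value. This is a sketch under the same assumptions as the steps above (independent samples, unequal variances); the helper name `welch_summary` and its argument names are hypothetical.

```r
# Hypothetical helper: Welch two-sample t-test from summary statistics
welch_summary <- function(m1, s1, n1, m2, s2, n2) {
  v1 <- s1^2 / n1                       # squared standard error, sample 1
  v2 <- s2^2 / n2                       # squared standard error, sample 2
  se <- sqrt(v1 + v2)                   # standard error of the difference
  t  <- (m1 - m2) / se                  # test statistic
  # Welch-Satterthwaite approximation of the degrees of freedom
  df <- (v1 + v2)^2 / (v1^2 / (n1 - 1) + v2^2 / (n2 - 1))
  p  <- 2 * pt(-abs(t), df = df)        # two-sided p-value
  list(statistic = t, df = df, se = se, p_value = p)
}

res8 <- welch_summary(m1 = 87, s1 = 9, n1 = 32, m2 = 84, s2 = 10, n2 = 31)
res8$statistic
res8$df
res8$p_value < 0.1   # FALSE means we fail to reject the null
```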
9) Given two independent random samples with the following
results: n1 = 11, x̄1 = 127, s1 = 33, n2 = 18, x̄2 = 157, s2 = 27
Use this data to find the 95% confidence interval for the true
difference between the population means. Assume that the population
variances are not equal and that the two populations are normally
distributed.
n91 <- 11
x91 <- 127
s91 <- 33
n92 <- 18
x92 <- 157
s92 <- 27
# Variances are the squared standard deviations (not the squared means)
var91 <- s91^2
var92 <- s92^2
alpha9 <- 0.05
# Calculate mean diff
meandiff9 <- x91 - x92
meandiff9
## [1] -30
#Calculate standard error using sample standard deviation
Se9 <- sqrt( var91/n91 + var92/n92)
Se9
## [1] 11.81101
## Calculate t stat
tstat9 <- (meandiff9-0)/Se9
round(tstat9, 2)
## [1] -2.54
## Calculate Df for both samples (divide by n - 1, not by the means)
numdf9 <- (var91/n91 + var92/n92)^2
dendf9 <- (var91/n91)^2 / (n91 - 1) + (var92/n92)^2 / (n92 - 1)
df9 <- numdf9 / dendf9
df9
## [1] 18.0759
## Calculate T score
tdf9 <- qt(p = alpha9/2, df9, lower.tail = FALSE)
round(tdf9, 4)
## [1] 2.1003
## Confidence interval
interval9 = c( meandiff9 - tdf9 * Se9, meandiff9 + tdf9 * Se9)
round(interval9, 2)
## [1] -54.81  -5.19
10) Two men, A and B, who usually commute to work together decide to
conduct an experiment to see whether one route is faster than the other.
The men feel that their driving habits are approximately the same, so
each morning for two weeks one driver is assigned to route I and the
other to route II. The times, recorded to the nearest minute, are shown
in the following table.
route1 <- c(32,27,34,24,31,25,30,23,27,35)
route2 <- c(28,28,33,25,26,29,33,27,25,33)
n101 <- 10
x101 <- mean(route1)
sd101 <- sd(route1)
# Variance is the squared standard deviation (not the squared mean)
var101 <- sd101^2
n102 <- 10
x102 <- mean(route2)
sd102 <- sd(route2)
var102 <- sd102^2
#calculate the mean diff
meandiff10 <- x101 - x102
meandiff10
## [1] 0.1
# Calculate standard error using sample standard deviation
Se10 <- sqrt( var101/n101 + var102/n102)
Se10
## [1] 1.678955
## Calculate t stat
tstat10 <- (meandiff10 - 0) / Se10
round(tstat10, 4)
## [1] 0.0596
## Calculate Df for both samples (divide by n - 1, not by the means)
numdf10 <- (var101/n101 + var102/n102)^2
dendf10 <- (var101/n101)^2 / (n101 - 1) + (var102/n102)^2 / (n102 - 1)
df10 <- numdf10 / dendf10
round(df10, 3)
## [1] 16.875
## Calculate T score
alpha10 <- 0.02
tdf10 <- qt(p = alpha10/2, df10, lower.tail = FALSE)
round(tdf10, 3)
## [1] 2.569
## Confidence interval
interval10 = c( meandiff10 - tdf10 * Se10, meandiff10 + tdf10 * Se10)
round(interval10, 2)
## [1] -4.21  4.41
Since the 98% confidence interval contains 0, we fail to reject the
null: there is no evidence that one route is faster than the other.
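Because both routes are driven on the same mornings, the two samples in problem 10 are naturally paired by day. One possible alternative analysis, sketched here as an assumption rather than as the method the assignment requires, works with the daily differences and uses base R's `t.test()` with `paired = TRUE`. It assumes the two vectors list the days in the same order, and it takes the 98% confidence level from the `alpha10 <- 0.02` used in the calculation.

```r
route1 <- c(32,27,34,24,31,25,30,23,27,35)
route2 <- c(28,28,33,25,26,29,33,27,25,33)

# Daily differences (route I minus route II); pairing by day is an assumption
d <- route1 - route2
mean(d)

# Paired t-test on the differences with a 98% confidence interval
paired_fit <- t.test(route1, route2, paired = TRUE, conf.level = 0.98)
paired_fit$conf.int
paired_fit$p.value > 0.02   # TRUE means we fail to reject the null
```

A paired analysis removes the day-to-day variation that both routes share, which is why its interval can be narrower than the one from two independent samples.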