Task 3 ~ Hypothesis Testing
Lab4 ~ Hypothesis Testing
| Contact | \(\downarrow\) |
| naftaligunawan@gmail.com | |
| https://www.instagram.com/nbrigittag/ | |
| RPubs | https://rpubs.com/naftalibrigitta/ |
| Name | Naftali Brigitta Gunawan |
| NIM (student ID) | 20214920002 |
Information About Hypothesis (IAH)
The three types of hypotheses are:
A. \(H_0\) : \(μ=μ_0\) vs \(H_1\) : \(μ≠μ_0\). # Two tailed
B. \(H_0\) : \(μ≤μ_0\) vs \(H_1\) : \(μ>μ_0\). # Right tailed
C. \(H_0\) : \(μ≥μ_0\) vs \(H_1\) : \(μ<μ_0\). # Left tailed
Test decision:
Reject \(H_0\) if the observed value \(z\) of \(Z\) satisfies:
A. \(z<z_{α/2}\) or \(z>z_{1−α/2}\)
B. \(z>z_{1−α}\)
C. \(z<z_α\)
P-value (where \(Φ\) is the standard normal cumulative distribution function):
p value A: \(p=2Φ(−|z|)\)
p value B: \(p=1−Φ(z)\)
p value C: \(p=Φ(z)\)
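The three p-value formulas above can be sketched as one small R function. This is only an illustration: the helper name `z_test_pvalues` and the numbers passed to it are made up for this example and are not part of the lab.

```r
# Hypothetical helper (not from the lab): p-values of a one-sample z-test
# for the three hypothesis types A, B, and C.
z_test_pvalues <- function(xbar, mu0, sigma, n) {
  z <- (xbar - mu0) / (sigma / sqrt(n))            # observed test statistic
  c(z            = z,
    two_tailed   = 2 * pnorm(-abs(z)),             # case A: 2 * Phi(-|z|)
    right_tailed = pnorm(z, lower.tail = FALSE),   # case B: 1 - Phi(z)
    left_tailed  = pnorm(z))                       # case C: Phi(z)
}

# Illustrative numbers only (assumed, chosen so that z = 1)
res <- z_test_pvalues(xbar = 10.5, mu0 = 10, sigma = 2, n = 16)
print(res)
```

Each p-value is then compared with the significance level \(α\): reject \(H_0\) when the p-value of the matching case is smaller than \(α\).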
Exercise 1
Right Tail: A food company claims that each cookie in a bag of its products contains at most 2 grams of saturated fat. In a sample of 40 cookies, the mean amount of saturated fat per cookie is found to be 2.1 grams. Assume that the population standard deviation is 0.25 grams. At the .05 significance level, can we reject the claim?
Answer:
\(H_0\) : \(μ ≤ 2\) vs \(H_1\) : \(μ > 2\). Since the population standard deviation sigma is known, we use the z statistic. After that, we find the critical value at the 0.05 significance level.
# Known
xbar = 2.1 # sample mean
sigma = 0.25 # population standard deviation
m0 = 2 # hypothesized value
n = 40 # sample size
# Answer
z = (xbar - m0)/(sigma/sqrt(n)); z # z-test statistic
## [1] 2.529822
a = 0.05 # alpha
z.alpha = qnorm(1-a); z.alpha # critical value
## [1] 1.644854
So, the results are:
The z-test statistic 2.529822 is greater than the critical value \(z_{1−α}\) = 1.644854.
We used a right-tailed test: if \(z > z_{1−α}\), then reject \(H_0\) at the 5% level of significance.
We reject the claim that "There is at most 2 grams of saturated fat in a cookie".
Alternative Solution: Instead of using the critical value, we apply the pnorm function to compute the upper tail p-value of the test statistic. As it turns out to be less than the 5% significance level, we reject the null hypothesis that \(μ ≤ 2\).
pval = 1 - pnorm(z); pval # upper tail p-value
## [1] 0.005706018
Exercise 2
Test the hypothesis that the mean systolic blood pressure in a certain population equals 140 mmHg. The population standard deviation has a known value of 20, and a data set of 55 patients is available.
Answer:
# Blood_pressure dataset
no <- seq_len(55) # patient number 1..55
status <- c(rep(0, 25), rep(1, 30))
mmhg <- c(120,115,94,118,111,102,102,131,104,107,115,139,115,113,114,105,
115,134,109,109,93,118,109,106,125,150,142,119,127,141,149,144,
142,149,161,143,140,148,149,141,146,159,152,135,134,161,130,125,
141,148,153,145,137,147,169)
blood_pressure <- data.frame(no, status, mmhg); blood_pressure
First, we calculate the sample mean and the sample size. Then we set the hypothesized mean \(μ_0\) and sigma to find z. Finally, we compute p values A, B, and C.
xbar = mean(blood_pressure$mmhg) # sample mean
n = length(blood_pressure$mmhg) # total sample size
m0 = 140 # m0
sigma = 20 # sigma or standard deviation
z = (xbar-m0)/(sigma/sqrt(n)); z # z-statistic
## [1] -3.708099
# p value A, B, and C
pvalueA = 2*pnorm(-abs(z)); pvalueA # two tail
## [1] 0.0002088208
pvalueB = 1 - pnorm(z); pvalueB # right tail
## [1] 0.9998956
pvalueC = pnorm(z); pvalueC # left tail
## [1] 0.0001044104
So, the results are:
\(z\) = -3.708099
pvalueA = 0.0002088208
pvalueB = 0.9998956
pvalueC = 0.0001044104
p value A and p value C are less than the 0.05 significance level, so we reject the null hypotheses \(H_0\) : \(μ=140\) (two tailed) and \(H_0\) : \(μ≥140\) (left tailed).
p value B is greater than the 0.05 significance level, so we fail to reject the null hypothesis \(H_0\) : \(μ≤μ_0\), that is, \(μ≤140\).
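The same conclusions can be reached with the critical-value rules from the Test decision list instead of p-values. The sketch below assumes only the summary values used in this exercise (sample mean 130, n = 55, sigma = 20); the vector name `decisions` is chosen for illustration.

```r
# Critical-value check for Exercise 2 (sketch, using the summary values above)
xbar  <- 130    # sample mean of mmhg
n     <- 55     # sample size
m0    <- 140    # hypothesized mean
sigma <- 20     # known population standard deviation
alpha <- 0.05

z <- (xbar - m0) / (sigma / sqrt(n))

# TRUE means "reject H0" for the corresponding hypothesis type
decisions <- c(
  A_two_tailed   = z < qnorm(alpha / 2) || z > qnorm(1 - alpha / 2),
  B_right_tailed = z > qnorm(1 - alpha),
  C_left_tailed  = z < qnorm(alpha)
)
print(decisions)
```

A value of TRUE corresponds to a p-value below 0.05 for the same case, so both approaches should always give the same decision.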
Exercise 3
Right tail: Garuda-food Indonesia claims that each cookie in a bag of its product contains at most 2 grams of saturated fat. In a sample of 40 cookies, the mean amount of saturated fat per cookie is found to be 2.1 grams. Assume that the sample standard deviation is 0.3 grams. At the .05 significance level, can we reject the claim?
Answer:
xbar = 2.1 # sample mean
m0 = 2 # hypothesized value
sigma = 0.3 # sample standard deviation
n = 40 # sample size
t = (xbar-m0)/(sigma/sqrt(n)); t # t-test statistic
## [1] 2.108185
# Find the critical value
alpha = 0.05
t.alpha = qt(1-alpha, df = n-1); t.alpha
## [1] 1.684875
The t-test statistic 2.108185 is greater than the critical value 1.684875. So, at the 0.05 significance level, we reject the claim that there is at most 2 grams of saturated fat in a cookie.
Alternative Solution:
pval = pt(t, df = n-1, lower.tail = FALSE); pval
## [1] 0.020746
So, the results are:
The p value 0.020746 is less than the 0.05 significance level for \(H_0\) : \(μ≤μ_0\), that is, \(μ≤2\).
We reject the null hypothesis.
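Because only the sample standard deviation is available here, the critical value comes from the t distribution with n-1 degrees of freedom rather than from the standard normal. A small sketch comparing the two is shown below; the degrees of freedom other than 39 are arbitrary values picked for illustration.

```r
# Compare right-tailed critical values of the t and standard normal distributions
alpha <- 0.05
df    <- c(5, 39, 1000)   # 39 = n - 1 for this exercise; 5 and 1000 are illustrative

crit <- data.frame(
  df     = df,
  t_crit = qt(1 - alpha, df = df),   # t critical value for each df
  z_crit = qnorm(1 - alpha)          # z critical value (same for every row)
)
print(crit)
```

The t critical value is always larger than the z critical value and approaches it as the degrees of freedom grow, so the t-test demands slightly stronger evidence in small samples.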
Exercise 4
Test the hypothesis that the mean systolic blood pressure in a certain population equals 140 mmHg. The data set at hand has measurements on 55 patients, and the population standard deviation is unknown.
Answer:
#Blood_pressure dataset
no <- seq_len(55) # patient number 1..55
status <- c(rep(0, 25), rep(1, 30))
mmhg <- c(120,115,94,118,111,102,102,131,104,107,115,139,115,113,114,105,
115,134,109,109,93,118,109,106,125,150,142,119,127,141,149,144,
142,149,161,143,140,148,149,141,146,159,152,135,134,161,130,125,
141,148,153,145,137,147,169)
blood_pressure <- data.frame(no, status, mmhg); blood_pressure
xbar = mean(blood_pressure$mmhg); xbar # sample mean, xbar = 130
## [1] 130
# Find the sample standard deviation
library(dplyr)
new_blood_pressure <- blood_pressure %>% mutate(xi_minus_xbar = mmhg - 130, xi_minus_xbar_sq = (mmhg - 130)^2); new_blood_pressure
First, we add two new columns to the blood_pressure data set and store the result as the new_blood_pressure data set. The two new columns are xi_minus_xbar and xi_minus_xbar_sq. Their purpose is to compute the sample standard deviation.
xbar = mean(new_blood_pressure$mmhg) # sample mean
n = length(new_blood_pressure$mmhg) # total sample size
m0 = 140 # Hypothesized value
q = sum(new_blood_pressure$xi_minus_xbar_sq)
sigma = sqrt((1/(n-1))*q) # sample standard deviation
t = (xbar-m0)/(sigma/sqrt(n)); t # t-test statistic
## [1] -3.869272
# p value A, B, and C
pvalueA = 2*pt(abs(t), df = n-1, lower.tail = FALSE); pvalueA # two tail
## [1] 0.0002961114
pvalueB = pt(t, df = n-1, lower.tail = FALSE); pvalueB # right tail
## [1] 0.9998519
pvalueC = pt(t, df = n-1, lower.tail = TRUE); pvalueC # left tail
## [1] 0.0001480557
So, the results are:
\(t\) = -3.869272
pvalueA = 0.0002961114
pvalueB = 0.9998519
pvalueC = 0.0001480557
p value A and p value C are less than the 0.05 significance level, so we reject the null hypotheses \(H_0\) : \(μ=140\) (two tailed) and \(H_0\) : \(μ≥140\) (left tailed).
p value B is greater than the 0.05 significance level, so we fail to reject the null hypothesis \(H_0\) : \(μ≤μ_0\), that is, \(μ≤140\).
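As a cross-check, the manual computation can be compared with base R's `sd()` and `t.test()`. This is a sketch rather than part of the original solution; it repeats the `mmhg` vector so that it runs on its own, and the object names (`t_manual`, `two_sided`, `right`, `left`) are chosen for illustration.

```r
# Cross-check of Exercise 4 with built-in functions (sketch)
mmhg <- c(120,115,94,118,111,102,102,131,104,107,115,139,115,113,114,105,
          115,134,109,109,93,118,109,106,125,150,142,119,127,141,149,144,
          142,149,161,143,140,148,149,141,146,159,152,135,134,161,130,125,
          141,148,153,145,137,147,169)

# sd() uses the same n - 1 denominator as the manual formula
t_manual <- (mean(mmhg) - 140) / (sd(mmhg) / sqrt(length(mmhg)))

two_sided <- t.test(mmhg, mu = 140)                           # case A
right     <- t.test(mmhg, mu = 140, alternative = "greater")  # case B
left      <- t.test(mmhg, mu = 140, alternative = "less")     # case C

c(t  = unname(two_sided$statistic),
  pA = two_sided$p.value,
  pB = right$p.value,
  pC = left$p.value)
```

If the manual steps are correct, the statistic and the three p-values returned here should agree with the values reported above.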
Exercise 5
Right tail: Garuda-food Indonesia claims that each cookie in a bag of its product contains at most 2 grams of saturated fat. Assume the actual mean amount of saturated fat per cookie is 2.075 grams and the population standard deviation is 0.25 grams. At the .05 significance level, what is the probability of a type II error for a sample size of 35 cookies?
Answer:
We begin by defining the sample size, population standard deviation, and standard error.
n = 35 # sample size
sigma = 0.25 # sigma or standard deviation
SE = sigma/sqrt(n); SE # standard error
## [1] 0.04225771
Next, we compute the upper bound of the sample means for which the null hypothesis \(μ≤2\) would not be rejected.
alpha = 0.05 # Significance level
# Hypothesis
m000 = 2
q = qnorm(alpha, mean = m000, sd = SE, lower.tail = FALSE); q
## [1] 2.069508
If the sample mean is less than 2.069508 in a hypothesis test, the null hypothesis will not be rejected. Since we assume that the actual population mean is 2.075, we can compute the probability that the sample mean falls below 2.069508, which is the probability of a type II error.
# Actual mean
m0000 = 2.075
pnorm(q, mean = m0000, sd = SE)
## [1] 0.448295
If the cookies sample size is 35, the actual mean amount of saturated fat per cookie is 2.075 grams and the population standard deviation is 0.25 grams, then the probability of type II error for testing the null hypothesis \(μ ≤ 2\) at 0.05 significance level is 44.8%, and the power of the hypothesis test is 55.2%.
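The steps of this exercise can be wrapped in a small function to see how the type II error changes with the sample size. This is a sketch only: the function name `type2_right_tailed` and the extra sample sizes 70 and 140 are assumptions made for illustration and do not come from the exercise.

```r
# Hypothetical helper: type II error of a right-tailed z-test (H0: mu <= mu0)
type2_right_tailed <- function(mu_actual, mu0, sigma, n, alpha = 0.05) {
  se     <- sigma / sqrt(n)                                        # standard error
  cutoff <- qnorm(alpha, mean = mu0, sd = se, lower.tail = FALSE)  # rejection bound
  pnorm(cutoff, mean = mu_actual, sd = se)   # P(sample mean below the bound)
}

# Values from this exercise
beta <- type2_right_tailed(mu_actual = 2.075, mu0 = 2, sigma = 0.25, n = 35)
c(type2_error = beta, power = 1 - beta)

# Illustrative sample sizes (assumed): a larger n gives a smaller type II error
betas <- sapply(c(35, 70, 140), function(n)
  type2_right_tailed(mu_actual = 2.075, mu0 = 2, sigma = 0.25, n = n))
print(betas)
```

With the other inputs fixed, increasing the sample size shrinks the standard error, which lowers the type II error and raises the power.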
Exercise 6
Under the same assumptions as case 27, if the actual mean population weight is 14.9 kg, what is the probability of a type II error? What is the power of the hypothesis test?
Answer:
We begin by defining the sample size, population standard deviation, and standard error.
n = 35 # sample size
sigma = 2.5 # sigma or standard deviation
SE = sigma/sqrt(n); SE # standard error
## [1] 0.4225771
Next, we compute the lower and upper bounds of the sample means for which the null hypothesis \(μ = 15.4\) would not be rejected.
alpha= 0.05 # significance level
m0 = 15.4 # hypothetical mean
I = c(alpha/2, 1- alpha/2)
q = qnorm(I, mean = m0, sd = SE); q
## [1] 14.57176 16.22824
So, as long as the sample mean is between 14.57176 and 16.22824, the null hypothesis will not be rejected. Since we assume that the actual population mean is 14.9, we compute the lower tail probabilities of both end points.
# Actual mean (stored as mu so that the hypothesized mean m0 is not overwritten)
mu = 14.9
p = pnorm(q, mean = mu, sd = SE); p
## [1] 0.2186537 0.9991644
Finally, the probability of type II error is the probability between the two end points
# p[2]- p[1]
diff(p)
## [1] 0.7805107
If the penguin sample size is 35, the actual mean population weight is 14.9 kg and the population standard deviation is 2.5 kg, then the probability of type II error for testing the null hypothesis \(μ = 15.4\) at 0.05 significance level is 78.1%, and the power of the hypothesis test is 21.9%.
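The two-tailed calculation can also be written as a function, which makes it easy to try other actual means. This is a sketch under the same assumptions as the answer above (hypothesized mean 15.4, sigma 2.5, n = 35); the function name `type2_two_tailed` is hypothetical.

```r
# Hypothetical helper: type II error of a two-tailed z-test (H0: mu = mu0)
type2_two_tailed <- function(mu_actual, mu0, sigma, n, alpha = 0.05) {
  se     <- sigma / sqrt(n)                                          # standard error
  bounds <- qnorm(c(alpha / 2, 1 - alpha / 2), mean = mu0, sd = se)  # non-rejection range
  diff(pnorm(bounds, mean = mu_actual, sd = se))  # P(sample mean inside the range)
}

# Values from this exercise
beta <- type2_two_tailed(mu_actual = 14.9, mu0 = 15.4, sigma = 2.5, n = 35)
c(type2_error = beta, power = 1 - beta)

# Sanity check: if the actual mean equals the hypothesized mean,
# the test fails to reject with probability 1 - alpha
beta_at_null <- type2_two_tailed(mu_actual = 15.4, mu0 = 15.4, sigma = 2.5, n = 35)
print(beta_at_null)
```

The further the actual mean lies from 15.4, the smaller the probability of staying inside the non-rejection range, so the power increases.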