Task Chapter 4 ~ A/B Testing
Lab5 ~ Hypothesis Testing
| Contact | \(\downarrow\) |
| Email | naftaligunawan@gmail.com |
| Instagram | https://www.instagram.com/nbrigittag/ |
| RPubs | https://rpubs.com/naftalibrigitta/ |
| Name | Naftali Brigitta Gunawan |
| Student ID (NIM) | 20214920002 |
Exercise 1
In the built-in data set named immer, the barley yield
in years 1931 and 1932 of the same field are recorded. The yield data
are presented in the data frame columns Y1 and Y2. Assuming that the data in
immer follows the normal distribution, find the 95%
confidence interval estimate of the difference between the mean barley
yields between years 1931 and 1932.
Estimate the difference between the means of matched samples using your textbook formula.
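The source does not reproduce the textbook formula. A standard form for the matched-samples interval, assumed here, treats the per-field differences \(d_i = Y_{1,i} - Y_{2,i}\) as a single sample of size \(n\):

\[
\bar{d} \;\pm\; t_{\alpha/2,\,n-1}\,\frac{s_d}{\sqrt{n}},
\qquad
s_d = \sqrt{\frac{1}{n-1}\sum_{i=1}^{n}\left(d_i - \bar{d}\right)^2}
\]

Note that \(s_d\) is computed from deviations around \(\bar{d}\), not from the raw squared differences.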
Answer :
library(MASS) # load the MASS package, which provides immer
DT::datatable(immer) # show the data set as an interactive table
Y_1 = immer$Y1 # yield in 1931
Y_2 = immer$Y2 # yield in 1932
beda = Y_1 - Y_2 # paired differences ("beda" = difference)
n = length(beda) # number of pairs
d = mean(beda) # mean of the differences (d bar)
s_d = sqrt((1/(n-1)) * sum((beda - d)^2)) # standard deviation of the differences, same as sd(beda)
a = 1 - 0.95; a # significance level alpha
## [1] 0.05
t_crit = qt(1 - (a/2), df = n - 1) # critical t value with n - 1 degrees of freedom
E = c(-t_crit, t_crit) * (s_d / sqrt(n)); round(E, 2) # margin of error
## [1] -9.79  9.79
CI = round(d + E, digits = 2); CI # confidence interval
## [1]  6.12 25.70
So, the 95% confidence interval for the mean difference in yield (1931 minus 1932) is \(6.12 ≤ μ_d ≤ 25.70\).
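As a cross-check on the hand calculation, the same interval can be requested from R's built-in `t.test` with `paired = TRUE`. The sketch below is illustrative only; the names `res`, `ci_builtin` and `ci_manual` are not part of the original answer.

```r
library(MASS)  # provides the immer data set

# Paired t-test on the two yield columns; conf.int holds the 95% interval
res <- t.test(immer$Y1, immer$Y2, paired = TRUE, conf.level = 0.95)
ci_builtin <- as.numeric(res$conf.int)

# Same interval from the matched-samples formula, for comparison
d <- immer$Y1 - immer$Y2
n <- length(d)
ci_manual <- mean(d) + c(-1, 1) * qt(0.975, df = n - 1) * sd(d) / sqrt(n)

round(ci_builtin, 2)
round(ci_manual, 2)
```

Both vectors should agree, since `t.test` applies the same formula to the differences.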
Exercise 2
In the data frame column mpg of the data set mtcars,
there are gas mileage data of various 1974 U.S. automobiles. Meanwhile,
another data column in mtcars, named am, indicates the
transmission type of the automobile model (0 = automatic, 1 = manual).
In particular, the gas mileage for manual and automatic transmissions
are two independent data populations. Assuming that the data in
mtcars follows the normal distribution, find the 95%
confidence interval estimate of the difference between the mean gas
mileage of manual and automatic transmissions.
Estimate the difference between two population means using your textbook formula.
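The source does not say which textbook formula is meant. One common choice for two independent samples, assumed here, is the pooled-variance interval, which requires the two populations to have equal variances:

\[
(\bar{x}_1 - \bar{x}_2) \;\pm\; t_{\alpha/2,\,n_1+n_2-2}\; s_p\sqrt{\frac{1}{n_1}+\frac{1}{n_2}},
\qquad
s_p^2 = \frac{(n_1-1)s_1^2 + (n_2-1)s_2^2}{n_1+n_2-2}
\]

If equal variances cannot be assumed, the standard error becomes \(\sqrt{s_1^2/n_1 + s_2^2/n_2}\) and the degrees of freedom come from the Welch approximation rather than \(n_1+n_2-2\). The standard error and the degrees of freedom should always come from the same variant.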
Answer :
library(MASS) # load the MASS package (mtcars itself comes from the base datasets package)
DT::datatable(mtcars) # show the data set as an interactive table
L = mtcars$am == 0 # TRUE for automatic transmissions
Xauto = mean(mtcars[L,]$mpg) # sample means
Xmanual = mean(mtcars[!L,]$mpg)
Sauto = sd(mtcars[L,]$mpg) # sample standard deviations
Smanual = sd(mtcars[!L,]$mpg)
n.auto = length(mtcars[L,]$mpg) # sample sizes
n.manual = length(mtcars[!L,]$mpg)
alpha = 1 - 0.95
diff = Xauto - Xmanual; diff # point estimate (automatic minus manual)
## [1] -7.244939
t. = qt(1-(alpha/2), df = n.auto+n.manual-2) # critical t value, pooled degrees of freedom
se = sqrt(((((n.auto-1)*Sauto^2)+((n.manual-1)*Smanual^2)) / (n.manual+n.auto - 2))*((1/n.manual)+(1/n.auto))) # pooled standard error (assumes equal variances)
t = diff / se # t statistic
t
## [1] -4.106127
lower = diff - t. * se
round(lower, 2)
## [1] -10.85
upper = diff + t. * se
round(upper, 2)
## [1] -3.64
So, the estimated difference in mean mileage (automatic minus manual) is \(-7.24\), and under the equal-variance (pooled) assumption the 95% confidence interval is \(-10.85 ≤ μ_{\text{auto}} - μ_{\text{manual}} ≤ -3.64\).
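As a cross-check, R's built-in `t.test` can produce the interval in two ways: with pooled variance (`var.equal = TRUE`) or with the Welch approximation, which is R's default. The exercise does not state which one the textbook intends, so the sketch below shows both. The variable names are illustrative.

```r
# mtcars ships with base R (datasets package)
auto   <- mtcars$mpg[mtcars$am == 0]
manual <- mtcars$mpg[mtcars$am == 1]

# Pooled-variance interval (assumes equal population variances, df = n1 + n2 - 2)
pooled <- t.test(auto, manual, var.equal = TRUE, conf.level = 0.95)

# Welch interval (R's default; does not assume equal variances)
welch <- t.test(auto, manual, conf.level = 0.95)

round(as.numeric(pooled$conf.int), 2)
round(as.numeric(welch$conf.int), 2)
```

The Welch interval is slightly wider because it uses fewer effective degrees of freedom.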
Exercise 3
In the built-in data set named quine, children from an
Australian town are classified by ethnic background, gender, age,
learning status and the number of days absent from school. In effect,
the data frame column Eth indicates whether the student is
Aboriginal or Not (“A” or “N”), and the column Sex indicates Male or
Female (“M” or “F”). Assuming that the data in quine follows the normal
distribution, find the 95% confidence interval estimate of the
difference between the female proportion of Aboriginal students and the
female proportion of Non-Aboriginal students, each within their own
ethnic group.
Estimate the difference between two population proportions using your textbook formula.
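The source does not reproduce the textbook formula. A standard large-sample form, assumed here, uses the sample proportions \(\hat{p}_1\) and \(\hat{p}_2\) with their own group sizes \(n_1\) and \(n_2\):

\[
(\hat{p}_1 - \hat{p}_2) \;\pm\; z_{\alpha/2}\sqrt{\frac{\hat{p}_1(1-\hat{p}_1)}{n_1} + \frac{\hat{p}_2(1-\hat{p}_2)}{n_2}}
\]

Each term under the square root pairs a proportion with the size of the group it was computed from.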
Answer :
library(MASS) # load the MASS package
DT::datatable(quine) # load your data set in a table
table(quine$Eth, quine$Sex)
##
##      F  M
##   A 38 31
##   N 42 35
library(dplyr)
aus = quine %>%
  count(Eth, Sex)
aus = data.frame(
  "Note" = c("A","N","Total"),
  "F" = c(38, 42, 38+42),
  "M" = c(31, 35, 31+35),
  "Total" = c(38+31, 42+35, 38+31+42+35)); aus
# 1 for A, and 2 for N
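phat1 = 38/69 # female proportion among Aboriginal students
n1 = 69
phat2 = 42/77 # female proportion among Non-Aboriginal students
n2 = 77
a = 0.05
pe = phat1-phat2 # point estimate
x = qnorm(1 - a/2) * sqrt(((phat1*(1-phat1))/n1)+((phat2*(1-phat2))/n2)) # margin of error
CI = round(pe + c(-x,x), digits = 3); CI
## [1] -0.156  0.167
So, the 95% confidence interval is \(-0.156 ≤ p_1 - p_2 ≤ 0.167\), where \(p_1\) and \(p_2\) are the female proportions among Aboriginal and Non-Aboriginal students.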
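As a cross-check, the interval can also be obtained from R's built-in `prop.test` applied to the 2x2 table, with `correct = FALSE` so that no continuity correction is applied. The sketch below reads the counts from the table instead of typing them by hand. The variable names are illustrative.

```r
library(MASS)  # provides the quine data set

# 2x2 table of counts: rows are Eth (A, N), columns are Sex (F, M)
tab <- table(quine$Eth, quine$Sex)

# Interval for p_A - p_N without continuity correction
res <- prop.test(tab, correct = FALSE, conf.level = 0.95)
ci_builtin <- as.numeric(res$conf.int)

# Manual version with counts taken from the table
x <- tab[, "F"]        # number of females per ethnic group
n <- rowSums(tab)      # group sizes
p <- x / n             # female proportion per group
se <- sqrt(sum(p * (1 - p) / n))
ci_manual <- unname(p["A"] - p["N"]) + c(-1, 1) * qnorm(0.975) * se

round(ci_builtin, 3)
round(ci_manual, 3)
```

Reading `n` from `rowSums(tab)` avoids transcription slips in the group sizes.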