Email             :
RPubs            : https://rpubs.com/aliciaarifin/
Major            : Statistics
Address         : ARA Center, Matana University Tower
                         Jl. CBD Barat Kav, RT.1, Curug Sangereng, Kelapa Dua, Tangerang, Banten 15810.


1 Library

library(MASS)
library(DT)
library(dplyr)

2 Data

DT::datatable(immer) # exercise 18
DT::datatable(quine) # exercise 19

3 Matched Samples [paired t-test]

3.1 Exercise 18

In the built-in data set named immer, the barley yields of the same fields in 1931 and 1932 are recorded. The yield data are stored in the data frame columns \(Y_{1}\) and \(Y_{2}\). Assuming that the data in immer follow the normal distribution, find the 95% confidence interval estimate of the difference between the mean barley yields in 1931 and 1932.
Estimate the difference between the means of matched samples using your textbook formula.
The confidence interval for the mean of the differences is \(\bar{d}-E \le \mu_{d} \le \bar{d}+E\), where \(E = t_{\frac{\alpha}{2}, n-1} \frac{S_{d}}{\sqrt{n}}\).
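
For reference, the two sample quantities in this formula are the mean of the paired differences \(d_i = Y_{1,i} - Y_{2,i}\) (stored as d in the code) and their sample standard deviation, which is measured around that mean rather than around zero:

\[
\bar{d} = \frac{1}{n}\sum_{i=1}^{n} d_i, \qquad S_{d} = \sqrt{\frac{1}{n-1}\sum_{i=1}^{n}\left(d_i - \bar{d}\right)^2}
\]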

Y1 = immer$Y1
Y2 = immer$Y2
diff = Y1-Y2

n = length(diff)
d = mean(diff)
sdiff = sqrt((1/(n-1))*sum((diff-d)^2)) # sample standard deviation of the differences, same as sd(diff)

alpha = 1-0.95 ; alpha
## [1] 0.05
t = qt(1-(alpha/2), df = n-1)
E =  c(-t,t) * (sdiff/sqrt(n)); round(E, digits = 3) # margin of error
## [1] -9.791  9.791
Interval = round(d+E, digits = 3) ; Interval
## [1]  6.122 25.705

The 95% confidence interval for the mean difference in barley yields between 1931 and 1932 is 6.122 to 25.705: \(6.122 \le \mu_{d} \le 25.705\).
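
As a cross-check, the same kind of interval can be obtained from R's built-in t.test with paired = TRUE. The sketch below is not part of the original exercise; the names res, d_i, half_width and manual_ci are introduced here for illustration only.

```r
library(MASS)  # provides the immer data set

# Paired t-test on the 1931 (Y1) and 1932 (Y2) yields
res <- t.test(immer$Y1, immer$Y2, paired = TRUE, conf.level = 0.95)

# Manual interval using the sample standard deviation of the differences
d_i <- immer$Y1 - immer$Y2
half_width <- qt(0.975, df = length(d_i) - 1) * sd(d_i) / sqrt(length(d_i))
manual_ci <- mean(d_i) + c(-1, 1) * half_width

round(as.numeric(res$conf.int), 3)  # interval reported by t.test
round(manual_ci, 3)                 # interval from the textbook formula
```

Both lines should print the same interval, roughly 6.122 to 25.705 for the immer data.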

4 Comparison Proportions

4.1 Exercise 19

In the built-in data set named quine, children from an Australian town are classified by ethnic background, gender, age, learning status and the number of days absent from school. The data frame column Eth indicates whether the student is Aboriginal or Not (“A” or “N”), and the column Sex indicates Male or Female (“M” or “F”). Assuming that the data in quine follow the normal distribution, find the 95% confidence interval estimate of the difference between the female proportion of Aboriginal students and the female proportion of Non-Aboriginal students, each within their own ethnic group.
Estimate the difference between two population proportions using your textbook formula.
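
One common textbook form, assumed here, is the unpooled (Wald) interval, in which each group contributes its own variance term:

\[
(\hat{p}_1 - \hat{p}_2) \pm z_{\alpha/2}\sqrt{\frac{\hat{p}_1(1-\hat{p}_1)}{n_1} + \frac{\hat{p}_2(1-\hat{p}_2)}{n_2}}
\]

Here \(\hat{p}_1\) and \(\hat{p}_2\) are the female proportions among the \(n_1\) Aboriginal and \(n_2\) Non-Aboriginal students.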

table(quine$Eth, quine$Sex) 
##    
##      F  M
##   A 38 31
##   N 42 35
data = quine%>%
  count(Eth, Sex)
data = data.frame(
  "-" = c("A","N","Total"),
  "F" = c(38,42, 38+42),
  "M" = c(31,35, 31+35),
  "Total" = c(38+31, 42+35, 38+31+42+35)
)
data
##      X.  F  M Total
## 1     A 38 31    69
## 2     N 42 35    77
## 3 Total 80 66   146
# 1 for A, and 2 for N
phat1 = 38/69
n1 = 69 # number of Aboriginal students
phat2 = 42/77
n2 = 77 # number of Non-Aboriginal students
alpha = .05
pe = phat1-phat2 # point estimate 
# margin of error: each group contributes its own phat*(1-phat)/n term
x = qnorm(1-(alpha/2)) * sqrt(((phat1*(1-phat1))/n1)+((phat2*(1-phat2))/n2))

CI = round(pe + c(-x,x), digits=4);CI 
## [1] -0.1564  0.1670

The 95% confidence interval is [-15.64%, 16.70%], computed without Yates' continuity correction.

prop.test(table(quine$Eth, quine$Sex), correct = FALSE) # without Yates' continuity correction. CI [-15.64%, 16.70%]
## 
##  2-sample test for equality of proportions without continuity correction
## 
## data:  table(quine$Eth, quine$Sex)
## X-squared = 0.0040803, df = 1, p-value = 0.9491
## alternative hypothesis: two.sided
## 95 percent confidence interval:
##  -0.1564218  0.1669620
## sample estimates:
##    prop 1    prop 2 
## 0.5507246 0.5454545