Hypothesis Test for \((\mu_1-\mu_2)\): Independent Samples

A hypothesis about the difference between two population means, when the samples are independent, can be tested with the independent two-sample t test.
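
For reference, the equal-variance (pooled) form of the statistic, which corresponds to var.equal = TRUE in the example below, can be written as follows. The symbols \(\overline{x}_i\), \(s_i^2\), \(n_i\) (sample mean, sample variance, and sample size of group i) and \(s_p\) (pooled standard deviation) are notation assumed here, not taken from the source.

\[
t = \frac{\overline{x}_1 - \overline{x}_2}{s_p\sqrt{\frac{1}{n_1}+\frac{1}{n_2}}}, \qquad
s_p^2 = \frac{(n_1-1)s_1^2 + (n_2-1)s_2^2}{n_1+n_2-2}, \qquad
df = n_1 + n_2 - 2
\]

With 5 macaques and 9 gibbons this gives df = 5 + 9 - 2 = 12, the value reported in the output below.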

Worked Example and Solution

(Source: Comparing two means in R)

I have some data comparing the length of the pelvis in 5 male macaques and 9 male gibbons. Do these species have the same pelvis length?

\(H_{0}:\) Macaque = Gibbon \((\mu_1=\mu_2)\).
\(H_1:\) Macaque ≠ Gibbon \((\mu_1 \neq \mu_2)\).

# Read the data from the web. To use a local copy instead, insert its file path inside the quotes.

pelvis <- read.csv(url("https://raw.githubusercontent.com/nmccurtin/CSVfilesbiostats/master/pelvislength%20(2).csv"))

# Use the package called "lattice" to do a stacked histogram. 
# Click it to activate it in the "Packages" pane or use this function.

library(lattice)

histogram( ~ pelvis | species, data = pelvis, layout = c(1,2), col = "orange", breaks = seq(7, 15, by = 1), xlab = "Pelvis length (mm)")

# Do the test

t.test(pelvis ~ species, data = pelvis, var.equal = TRUE)
## 
##  Two Sample t-test
## 
## data:  pelvis by species
## t = 8.0414, df = 12, p-value = 3.566e-06
## alternative hypothesis: true difference in means between group Hylobates lar and group Macaca fascicularis   is not equal to 0
## 95 percent confidence interval:
##  3.091495 5.389393
## sample estimates:
##         mean in group Hylobates lar mean in group Macaca fascicularis   
##                            13.25444                             9.01400

In the result above:

  • t is the value of the t-test statistic (t = 8.0414),
  • df is the degrees of freedom (df = 12),
  • p-value is the probability, under the null hypothesis, of a test statistic at least this extreme (p-value = 3.566e-06 = 0.000003566),
  • conf.int is the 95% confidence interval for the difference between the two means (conf.int = [3.091495, 5.389393]),
  • sample estimates are the two sample means (13.25444 for Hylobates lar, 9.01400 for Macaca fascicularis).

So we can clearly reject the null hypothesis that the two species have the same mean pelvis length (t = 8.04, df = 12, P < 0.01).
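
To see how the reported quantities relate to one another, the p-value and confidence interval can be rebuilt from the printed t, df, and sample means alone. This is only a sketch: the variable names are made up for illustration, and because the inputs are the rounded values from the output, the results agree with it only to about five decimal places.

```r
# Values copied from the t.test() output above
mean_gibbon  <- 13.25444
mean_macaque <- 9.01400
t_stat <- 8.0414
dof    <- 12

diff_means <- mean_gibbon - mean_macaque   # estimated difference in means
se_diff    <- diff_means / t_stat          # standard error implied by t = difference / SE

# Two-sided p-value from the t distribution with 12 degrees of freedom
p_value <- 2 * pt(-abs(t_stat), dof)

# 95% confidence interval: estimate +/- critical value * standard error
ci <- diff_means + c(-1, 1) * qt(0.975, dof) * se_diff

p_value
ci
```

The interval excludes 0, which is consistent with rejecting the null hypothesis at the 5% level.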

Hypothesis Test for \((\mu_1-\mu_2)\): Paired Samples

A hypothesis about the difference between two population means, when the samples are paired, can be tested with the paired two-sample t test.
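
For reference, the paired test reduces to a one-sample t test on the within-pair differences. In the formula below, \(\overline{d}\) is the mean of the differences (as in the hypotheses of the example that follows), while \(s_d\) (their sample standard deviation) and \(n\) (the number of pairs) are notation assumed here.

\[
t = \frac{\overline{d}}{s_d/\sqrt{n}}, \qquad df = n - 1
\]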

Worked Example and Solution

(Source: Comparing two means in R)

Imagine that I am testing the effects of a Very Low Calorie Diet (VLCD) on a sample of young women in high school. My data are:

Before: 117.3 111.4 98.6 104.3 105.4 100.4 81.7 89.5 78.2
After: 83.3 85.9 75.8 82.9 82.3 77.7 62.7 69.0 63.9

Did the VLCD cause these subjects to lose weight (\(\alpha\) = 0.05)?

\(H_0:\) The VLCD caused these subjects to gain weight or stay the same \((A \ge B\) or \(\overline{d} \ge 0)\), where A is the weight after and B the weight before the diet.
\(H_1:\) The VLCD caused these subjects to lose weight \((A<B\) or \(\overline{d}<0)\).

There are nine pairs of observations, so there are 9 – 1 = 8 degrees of freedom. The one-tailed critical value is \(t_{0.05(1),8}=-1.86\). Why negative? Because we set up our differences as A – B (after minus before), so weight loss gives negative differences. If the t calculated from the data is less than –1.86, we can reject the null hypothesis with P < 0.05.
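
The critical value does not have to come from a table. As a small sketch (the variable names are illustrative), the lower-tail quantile function qt() returns it directly:

```r
# Degrees of freedom: number of pairs minus one
dof <- 9 - 1

# Lower-tail critical value for a one-sided test at alpha = 0.05
t_crit <- qt(0.05, df = dof)

round(t_crit, 2)
```

Because the alternative hypothesis is in the lower tail, the quantile is taken at 0.05 rather than 0.95, which is why the value is negative.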

# Make vectors of the observations
before <- c(117.3, 111.4, 98.6, 104.3, 105.4, 100.4, 81.7, 89.5, 78.2)
after <- c(83.3, 85.9, 75.8, 82.9, 82.3, 77.7, 62.7, 69.0, 63.9)

# Combine those vectors into a data frame
vlcd <- data.frame(before, after)

# Calculate the differences between each pair and insert a new column
vlcd$difference <- vlcd$after - vlcd$before   ## This makes lost weight negative numbers

# Inspect the differences to see if they appear to be normally-distributed
hist(vlcd$difference, right = FALSE, col = "skyblue", main ="", xlab = "After - Before Difference")

# Run the paired test (a one-sample t test on vlcd$difference would give the same result):

t.test(vlcd$after, vlcd$before, paired = TRUE, alternative = "less")   ## if you didn't calculate a difference
## 
##  Paired t-test
## 
## data:  vlcd$after and vlcd$before
## t = -12.74, df = 8, p-value = 6.787e-07
## alternative hypothesis: true mean difference is less than 0
## 95 percent confidence interval:
##       -Inf -19.29166
## sample estimates:
## mean difference 
##       -22.58889

Note that if we didn't include the argument alternative = "less" then we'd get the two-tailed result.

So we have a calculated t of -12.74, which lies far deeper in the rejection region than our critical value \(t_{0.05(1),8}=-1.86\). We can conclude that the VLCD treatment caused these subjects to lose weight (P < 0.05).
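
As a check on the output above, the same numbers can be reproduced by hand from the after-minus-before differences. This is a minimal sketch, not the source's method; names such as d_bar and se_d are chosen here for illustration.

```r
before <- c(117.3, 111.4, 98.6, 104.3, 105.4, 100.4, 81.7, 89.5, 78.2)
after  <- c(83.3, 85.9, 75.8, 82.9, 82.3, 77.7, 62.7, 69.0, 63.9)

d <- after - before          # within-pair differences (weight loss is negative)
n <- length(d)               # number of pairs

d_bar  <- mean(d)            # mean difference
se_d   <- sd(d) / sqrt(n)    # standard error of the mean difference
t_calc <- d_bar / se_d       # paired t statistic

# One-sided (lower-tail) p-value and upper bound of the one-sided 95% interval
p_value  <- pt(t_calc, df = n - 1)
ci_upper <- d_bar + qt(0.95, df = n - 1) * se_d

c(mean_difference = d_bar, t = t_calc, p_value = p_value, ci_upper = ci_upper)
```

The mean difference, t, p-value, and upper confidence bound should agree with the t.test() output to the precision printed there.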

References