Hypothesis tests about the difference between the means of two populations, when the samples are independent, can use the independent two-sample t test.
(Source: Comparing two means in R)
I have some data comparing the length of the pelvis in 5 male macaques and 9 male gibbons. Do these species have the same pelvis length?
\(H_{0}:\) Macaque = Gibbon \((\mu_1=\mu_2)\).
\(H_1:\) Macaque ≠ Gibbon \((\mu_1 \neq \mu_2)\).
# Read the data file from GitHub. For a local copy, pass its file path to read.csv() instead.
pelvis <- read.csv(url("https://raw.githubusercontent.com/nmccurtin/CSVfilesbiostats/master/pelvislength%20(2).csv"))
# Use the package called "lattice" to do a stacked histogram.
# Click it to activate it in the "Packages" pane or use this function.
library(lattice)
histogram( ~ pelvis | species, data = pelvis, layout = c(1,2), col = "orange", breaks = seq(7, 15, by = 1), xlab = "Pelvis length (mm)")
# Do the test
t.test(pelvis ~ species, data = pelvis, var.equal = TRUE)
##
## Two Sample t-test
##
## data: pelvis by species
## t = 8.0414, df = 12, p-value = 3.566e-06
## alternative hypothesis: true difference in means between group Hylobates lar and group Macaca fascicularis is not equal to 0
## 95 percent confidence interval:
## 3.091495 5.389393
## sample estimates:
## mean in group Hylobates lar mean in group Macaca fascicularis
## 13.25444 9.01400
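In the result above, the sample means are 13.25 for Hylobates lar and 9.01 for Macaca fascicularis, and the 95% confidence interval for their difference (3.09 to 5.39) excludes 0.
So we can reject the null hypothesis that the two species have the same mean pelvis length (t = 8.04, df = 12, P < 0.01).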
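As a check on how the pieces of that output fit together, the p-value and confidence interval can be rebuilt from the reported t, df, and group means. This is only a sketch: the numbers are copied from the printed output (so they carry its rounding), and the variable names are illustrative, not part of the original analysis.

```r
# Values copied from the t.test() output above (rounded as printed)
t_stat   <- 8.0414
df       <- 12
mean_gib <- 13.25444   # Hylobates lar
mean_mac <- 9.01400    # Macaca fascicularis

diff_means <- mean_gib - mean_mac   # observed difference in means
se_diff    <- diff_means / t_stat   # standard error implied by t = difference / SE

# Two-sided p-value: area in both tails beyond |t|
p_value <- 2 * pt(-abs(t_stat), df)

# 95% confidence interval: difference +/- critical t * SE
t_crit <- qt(0.975, df)
ci     <- diff_means + c(-1, 1) * t_crit * se_diff

p_value
ci
```

Up to rounding, both results should agree with the p-value and the interval printed by `t.test()`.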
Hypothesis tests about the difference between the means of two populations, when the samples are paired, can use the paired two-sample t test.
(Source: Comparing two means in R)
Imagine that I am testing the effects of a Very Low Calorie Diet (VLCD) on a sample of young women in high school. My data are:
Before: | 117.3 | 111.4 | 98.6 | 104.3 | 105.4 | 100.4 | 81.7 | 89.5 | 78.2 |
After: | 83.3 | 85.9 | 75.8 | 82.9 | 82.3 | 77.7 | 62.7 | 69.0 | 63.9 |
Did the VLCD cause these subjects to lose weight (\(\alpha\) = 0.05)?
\(H_0:\) The VLCD caused these subjects to gain weight or stay the same \((A \ge B\) or \(\overline{d} \ge 0)\).
\(H_1:\) The VLCD caused these subjects to lose weight \((A<B\) or \(\overline{d}<0)\).
There are nine pairs of observations, so there are 9 – 1 = 8 degrees of freedom. The one-tailed critical value for rejection is \(t_{0.05(1),8}=-1.86\). Why negative? Because we set up our differences as After – Before (A – B), so weight loss gives negative differences. This means that if the t calculated from the data is less than –1.86, we can reject the null hypothesis with P < 0.05.
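The critical value quoted above does not have to come from a printed table. A minimal sketch using base R's `qt()` (the variable names are illustrative):

```r
n_pairs <- 9
df      <- n_pairs - 1     # 8 degrees of freedom

# Lower-tail critical value for a one-tailed test at alpha = 0.05
t_crit <- qt(0.05, df)
round(t_crit, 2)           # -1.86
```

Because the alternative is that the mean difference is less than 0, the rejection region is the lower tail, which is why `qt()` is called with 0.05 rather than 0.95.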
# Make vectors of the observations
before <- c(117.3, 111.4, 98.6, 104.3, 105.4, 100.4, 81.7, 89.5, 78.2)
after <- c(83.3, 85.9, 75.8, 82.9, 82.3, 77.7, 62.7, 69.0, 63.9)
# Combine those vectors into a data frame
vlcd <- data.frame(before, after)
# Calculate the difference for each pair and store it in a new column
vlcd$difference <- vlcd$after - vlcd$before  # weight lost shows up as a negative number
# Inspect the differences to see if they appear to be normally distributed
hist(vlcd$difference, right = FALSE, col = "skyblue", main = "", xlab = "After - Before Difference")
# Run the paired test directly on the two columns (no difference column needed).
# "l" is short for alternative = "less".
t.test(vlcd$after, vlcd$before, paired = TRUE, alternative = "l")
##
## Paired t-test
##
## data: vlcd$after and vlcd$before
## t = -12.74, df = 8, p-value = 6.787e-07
## alternative hypothesis: true mean difference is less than 0
## 95 percent confidence interval:
## -Inf -19.29166
## sample estimates:
## mean difference
## -22.58889
Note that if we didn't include the argument alternative = "less" (abbreviated to "l" in the call above), we'd get the two-tailed result.
So, we have a calculated t of -12.74, which falls much farther into the rejection region than the critical value \(t_{0.05(1),8}=-1.86\). We can conclude that the VLCD treatment caused these subjects to lose weight (P < 0.05).
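To see where that t comes from, the paired test can be reproduced as a one-sample t test on the After – Before differences, or computed by hand as the mean difference divided by its standard error. The sketch below uses the same data as above; names such as `t_manual` and `one_sample` are illustrative only.

```r
before <- c(117.3, 111.4, 98.6, 104.3, 105.4, 100.4, 81.7, 89.5, 78.2)
after  <- c(83.3, 85.9, 75.8, 82.9, 82.3, 77.7, 62.7, 69.0, 63.9)

d <- after - before   # weight lost shows up as a negative number
n <- length(d)

# By hand: t = mean difference / (sd of differences / sqrt(n))
t_manual <- mean(d) / (sd(d) / sqrt(n))
p_manual <- pt(t_manual, df = n - 1)   # lower-tail (one-tailed) p-value

# The same test two other ways
one_sample <- t.test(d, mu = 0, alternative = "less")
paired     <- t.test(after, before, paired = TRUE, alternative = "less")

# Leaving out `alternative` gives the two-tailed test
two_sided <- t.test(after, before, paired = TRUE)

c(manual     = t_manual,
  one_sample = unname(one_sample$statistic),
  paired     = unname(paired$statistic))
c(one_tailed = paired$p.value, two_tailed = two_sided$p.value)
```

All three t values should match, and because t is negative here, the two-tailed p-value is twice the one-tailed one.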