Sameer Mathur
The t-test assesses whether the means of two groups are statistically different from each other. This analysis is appropriate whenever you want to compare the means of two groups.
In other words, a t-test compares the mean of one sample against the mean of another sample (or against a specific value such as 0). The important point is that it compares means for exactly two sets of data.
We use the UScrime dataset distributed with the MASS package. It contains information about the effect of punishment regimes on crime rates in 47 US states in 1960.
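The "specific value" case mentioned above can be sketched with a one-sample test. This is a minimal illustration on simulated data; the vector x, the seed, and the reference value 0 are assumptions made for the example and are not part of the UScrime analysis.

```r
# Hypothetical data: 30 simulated measurements (illustration only)
set.seed(42)
x <- rnorm(30, mean = 0.5, sd = 1)

# One-sample t-test: does the mean of x differ from the reference value 0?
res <- t.test(x, mu = 0)

res$statistic  # t value: (mean(x) - 0) / (sd(x) / sqrt(n))
res$parameter  # degrees of freedom: n - 1
res$p.value    # two-sided p-value
```

The mu argument of t.test() sets the value the sample mean is compared against; it defaults to 0.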
library(MASS)
head(UScrime)
M So Ed Po1 Po2 LF M.F Pop NW U1 U2 GDP Ineq Prob Time
1 151 1 91 58 56 510 950 33 301 108 41 394 261 0.084602 26.2011
2 143 0 113 103 95 583 1012 13 102 96 36 557 194 0.029599 25.2999
3 142 1 89 45 44 533 969 18 219 94 33 318 250 0.083401 24.3006
4 136 0 121 149 141 577 994 157 80 102 39 673 167 0.015801 29.9012
5 141 0 121 109 101 591 985 18 30 91 20 578 174 0.041399 21.2998
6 121 0 110 118 115 547 964 25 44 84 29 689 126 0.034201 20.9995
y
1 791
2 1635
3 578
4 1969
5 1234
6 682
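For a description of each column, see the Data Description.
A two-group independent t-test can be used to test the hypothesis that the two population means are equal. The default test assumes unequal variances and applies the Welch degrees-of-freedom modification. By default, a two-tailed alternative is assumed; that is, the means differ but the direction isn't specified. You can add var.equal=TRUE to assume equal variances and use a pooled variance estimate, or alternative="less" or alternative="greater" to specify a directional test. In the example below, Prob is the probability of imprisonment and So is an indicator for Southern states (0 = no, 1 = yes).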
library(MASS)
t.test(Prob ~ So, data=UScrime)
Welch Two Sample t-test
data: Prob by So
t = -3.8954, df = 24.925, p-value = 0.0006506
alternative hypothesis: true difference in means is not equal to 0
95 percent confidence interval:
-0.03852569 -0.01187439
sample estimates:
mean in group 0 mean in group 1
0.03851265 0.06371269
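You can reject the hypothesis that Southern and non-Southern states have equal probabilities of imprisonment (p < 0.001).
A dependent (paired) t-test is used when the observations in the two groups are related. It assumes that the differences between the paired observations are normally distributed. Here we compare the unemployment rate of younger urban males (U1, ages 14 to 24) with that of older urban males (U2, ages 35 to 39).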
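To make the Welch modification concrete, the t statistic and degrees of freedom reported above can be reproduced by hand. This is a rough sketch, not how t.test() is implemented internally; the variable names (g0, g1, se0, se1, t_manual, df_manual, p_manual) are introduced here for illustration.

```r
library(MASS)

# Split Prob by the So indicator (0 = non-Southern, 1 = Southern)
g0 <- UScrime$Prob[UScrime$So == 0]
g1 <- UScrime$Prob[UScrime$So == 1]

# Squared standard error of each group mean
se0 <- var(g0) / length(g0)
se1 <- var(g1) / length(g1)

# Welch t statistic: difference in means over the combined standard error
t_manual <- (mean(g0) - mean(g1)) / sqrt(se0 + se1)

# Welch-Satterthwaite approximation to the degrees of freedom
df_manual <- (se0 + se1)^2 /
  (se0^2 / (length(g0) - 1) + se1^2 / (length(g1) - 1))

# Two-sided p-value from the t distribution
p_manual <- 2 * pt(-abs(t_manual), df = df_manual)

# Compare with the built-in test
res <- t.test(Prob ~ So, data = UScrime)
all.equal(unname(res$statistic), t_manual)
all.equal(unname(res$parameter), df_manual)
all.equal(res$p.value, p_manual)
```

Because the degrees of freedom come from an approximation, they are generally not a whole number, which is why the output above reports df = 24.925.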
sapply(UScrime[c("U1","U2")], function(x)(c(mean=mean(x),sd=sd(x))))
U1 U2
mean 95.46809 33.97872
sd 18.02878 8.44545
In this case, the two groups are not independent, because both unemployment rates are measured in the same states.
with(UScrime, t.test(U1, U2, paired=TRUE))
Paired t-test
data: U1 and U2
t = 32.407, df = 46, p-value < 2.2e-16
alternative hypothesis: true difference in means is not equal to 0
95 percent confidence interval:
57.67003 65.30870
sample estimates:
mean of the differences
61.48936
The mean difference (61.5) is large enough to warrant rejection of the hypothesis that the mean unemployment rate for older and younger males is the same. In fact, the probability of obtaining a sample difference this large if the population means are equal is less than 2.2e-16.
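A paired t-test is equivalent to a one-sample t-test on the per-state differences, which also makes it easy to look at the normality assumption directly. The sketch below assumes the same UScrime data; the names d, paired, and one_sample are introduced here for illustration, and shapiro.test() is only one of several possible informal checks.

```r
library(MASS)

# Per-state difference between the two unemployment rates
d <- UScrime$U1 - UScrime$U2

# Paired test and the equivalent one-sample test on the differences
paired <- with(UScrime, t.test(U1, U2, paired = TRUE))
one_sample <- t.test(d, mu = 0)

# Both approaches give the same statistic and p-value
all.equal(unname(paired$statistic), unname(one_sample$statistic))
all.equal(paired$p.value, one_sample$p.value)

# Informal check of the normality assumption on the differences
shapiro.test(d)
```

A small p-value from shapiro.test() would suggest the differences depart from normality, in which case a nonparametric alternative may be worth considering.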