T-tests

Sameer Mathur

T-test

The t-test assesses whether the means of two groups are statistically different from each other. This analysis is appropriate whenever you want to compare the means of two groups.

In other words, a t-test compares the mean of one sample against the mean of another sample (or against a specific value such as 0).
The important point is that it compares means for exactly two sets of data.
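
The comparison against a specific value can be sketched as a one-sample test. The data here are simulated purely for illustration (the vector `x`, the seed, and the chosen mean are assumptions, not part of the original material); `t.test()` and its `mu` argument come from base R's stats package.

```r
# Hypothetical data: 30 simulated measurements (not from the source)
set.seed(42)
x <- rnorm(30, mean = 0.5, sd = 1)

# One-sample t-test of the null hypothesis that the population mean is 0
res <- t.test(x, mu = 0)

# The statistic is the sample mean (minus mu) divided by its standard error
t_by_hand <- (mean(x) - 0) / (sd(x) / sqrt(length(x)))

res$statistic   # t reported by t.test()
t_by_hand       # the same value computed by hand
res$parameter   # degrees of freedom: n - 1
```

Passing a second sample instead of `mu` turns the same call into a two-sample comparison.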

UScrime dataset

We use the UScrime dataset distributed with the MASS package. It contains information about the effect of punishment regimes on crime rates in 47 US states in 1960.

library(MASS)
head(UScrime)
    M So  Ed Po1 Po2  LF  M.F Pop  NW  U1 U2 GDP Ineq     Prob    Time
1 151  1  91  58  56 510  950  33 301 108 41 394  261 0.084602 26.2011
2 143  0 113 103  95 583 1012  13 102  96 36 557  194 0.029599 25.2999
3 142  1  89  45  44 533  969  18 219  94 33 318  250 0.083401 24.3006
4 136  0 121 149 141 577  994 157  80 102 39 673  167 0.015801 29.9012
5 141  0 121 109 101 591  985  18  30  91 20 578  174 0.041399 21.2998
6 121  0 110 118 115 547  964  25  44  84 29 689  126 0.034201 20.9995
     y
1  791
2 1635
3  578
4 1969
5 1234
6  682

For a description of each column, see the Data Description (or run ?UScrime in R after loading MASS).

Independent t-test

A two-group independent t-test can be used to test the hypothesis that the two population means are equal.
The default test assumes unequal variances and applies the Welch degrees-of-freedom modification. By default, a two-tailed alternative is assumed; that is, the means differ but the direction isn't specified.
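
Both defaults can be overridden through arguments of `t.test()`. The following is a minimal sketch, assuming the same `Prob ~ So` comparison used in this section; the object names `welch`, `pooled`, and `one_sided` are illustrative only.

```r
library(MASS)  # provides the UScrime data frame

# Default: Welch test (unequal variances), two-sided alternative
welch <- t.test(Prob ~ So, data = UScrime)

# Classic Student test: assume equal variances and pool them
pooled <- t.test(Prob ~ So, data = UScrime, var.equal = TRUE)

# One-sided alternative: mean of group 0 is less than mean of group 1
one_sided <- t.test(Prob ~ So, data = UScrime, alternative = "less")

# Compare degrees of freedom and p-values across the variants
c(welch = unname(welch$parameter), pooled = unname(pooled$parameter))
c(two_sided = welch$p.value, one_sided = one_sided$p.value)
```

With the formula interface, the difference is taken as the first factor level minus the second, so `alternative = "less"` here asks whether non-Southern states (So = 0) have a lower mean than Southern states (So = 1).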

Independent t-test

library(MASS)
t.test(Prob ~ So, data=UScrime)

    Welch Two Sample t-test

data:  Prob by So
t = -3.8954, df = 24.925, p-value = 0.0006506
alternative hypothesis: true difference in means is not equal to 0
95 percent confidence interval:
 -0.03852569 -0.01187439
sample estimates:
mean in group 0 mean in group 1 
     0.03851265      0.06371269 

You can reject the hypothesis that Southern and non-Southern states have equal probabilities of imprisonment (p < 0.001).
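
To see where the t statistic and the fractional degrees of freedom in the Welch output come from, they can be recomputed by hand. This is a sketch using the standard Welch statistic and the Welch-Satterthwaite approximation; the variable names (`g0`, `g1`, `v0`, `v1`, and so on) are illustrative.

```r
library(MASS)  # provides the UScrime data frame

# Imprisonment probability split by the Southern-state indicator
g0 <- UScrime$Prob[UScrime$So == 0]  # non-Southern states
g1 <- UScrime$Prob[UScrime$So == 1]  # Southern states

# Squared standard error of each group mean
v0 <- var(g0) / length(g0)
v1 <- var(g1) / length(g1)

# Welch t statistic: difference in means over the combined standard error
t_manual <- (mean(g0) - mean(g1)) / sqrt(v0 + v1)

# Welch-Satterthwaite approximation to the degrees of freedom
df_manual <- (v0 + v1)^2 /
  (v0^2 / (length(g0) - 1) + v1^2 / (length(g1) - 1))

# Two-sided p-value from the t distribution
p_manual <- 2 * pt(-abs(t_manual), df = df_manual)

c(t = t_manual, df = df_manual, p = p_manual)
```

These values should match the t, df, and p-value reported by `t.test(Prob ~ So, data=UScrime)` above.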

Dependent t-test

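A dependent (paired) t-test assumes that the differences between the paired observations are normally distributed. Here each state contributes one pair: U1, the unemployment rate of urban males aged 14-24, and U2, the unemployment rate of urban males aged 35-39.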
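
One way to examine the normality assumption is to look at the per-state differences directly. The sketch below uses the U1 and U2 columns from this section; choosing the Shapiro-Wilk test and a normal Q-Q plot is an assumption of this example rather than something the original analysis performs.

```r
library(MASS)  # provides the UScrime data frame

# Per-state difference between the two unemployment rates
d <- UScrime$U1 - UScrime$U2

# Shapiro-Wilk test of the null hypothesis that d is normally distributed
sw <- shapiro.test(d)
sw$p.value

# Visual check: points close to the line suggest approximate normality
qqnorm(d)
qqline(d)
```

A small p-value from `shapiro.test()` would cast doubt on the assumption; the Q-Q plot helps judge how serious any departure is.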

sapply(UScrime[c("U1", "U2")], function(x) c(mean = mean(x), sd = sd(x)))
           U1       U2
mean 95.46809 33.97872
sd   18.02878  8.44545

Dependent t-test

In this case, the two groups are not independent: both unemployment rates are measured in the same states.

with(UScrime, t.test(U1, U2, paired=TRUE))

    Paired t-test

data:  U1 and U2
t = 32.407, df = 46, p-value < 2.2e-16
alternative hypothesis: true difference in means is not equal to 0
95 percent confidence interval:
 57.67003 65.30870
sample estimates:
mean of the differences 
               61.48936 

The mean difference (61.48936) is large enough to warrant rejection of the hypothesis that the mean unemployment rate for older and younger males is the same. In fact, the probability of obtaining a sample difference this large if the population means are equal is less than 2.2e-16.
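
The paired test above is equivalent to a one-sample t-test on the per-state differences, which makes the role of the "mean of the differences" explicit. A minimal sketch follows; the names `d`, `paired`, `one_sample`, and `t_by_hand` are illustrative.

```r
library(MASS)  # provides the UScrime data frame

# Per-state difference between the two unemployment rates
d <- with(UScrime, U1 - U2)

# Paired t-test, as in the call above
paired <- with(UScrime, t.test(U1, U2, paired = TRUE))

# One-sample t-test of the differences against 0
one_sample <- t.test(d, mu = 0)

# The same statistic by hand: mean difference over its standard error
t_by_hand <- mean(d) / (sd(d) / sqrt(length(d)))

c(paired     = unname(paired$statistic),
  one_sample = unname(one_sample$statistic),
  by_hand    = t_by_hand)
```

All three values should agree with the t statistic in the paired output, with df equal to the number of pairs minus one.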