T-tests -- Definition and Types of tests

Sameer Mathur

T-test

Student's t-test is a widely used statistical test for comparing means. It evaluates whether the means of two sets of data, or the mean of one sample and a reference value, differ significantly from each other.

Types of t-test

The one-sample t-test, used to compare the mean of a single sample with a known or theoretical value.
The unpaired two sample t-test, used to compare the mean of two independent samples.
The paired t-test, used to compare the means between two related groups of samples.

One-sample t-test

The one-sample t-test is used to compare the mean of a sample to a specified theoretical mean \( \mu \), in order to judge whether the underlying population mean differs from \( \mu \).

One-sample t-test -- Formula

Let \( X \) represent a sample of \( n \) values with mean \( m \) and standard deviation \( s \). The observed sample mean \( m \) is compared with the theoretical value \( \mu \) using the formula below:

\[ t = \frac{m - \mu}{\frac{s}{\sqrt{n}}} \]

The degrees of freedom \( (df) \) for this test are:

\[ df = n-1 \]
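
The formula above can be sketched in a few lines of Python. This is a minimal illustration using only the standard library; the function name `one_sample_t` and the sample values are hypothetical and not part of the original slides.

```python
import math
import statistics


def one_sample_t(values, mu):
    """Return (t, df) for a one-sample t-test of `values` against `mu`."""
    n = len(values)
    if n < 2:
        raise ValueError("need at least two observations")
    m = statistics.mean(values)    # observed sample mean m
    s = statistics.stdev(values)   # sample standard deviation s (n - 1 denominator)
    t = (m - mu) / (s / math.sqrt(n))
    return t, n - 1


# Hypothetical weights, compared with a theoretical mean of 18
weights = [19.0, 21.0, 20.0, 22.0, 18.0]
t, df = one_sample_t(weights, mu=18.0)
print(f"t = {t:.3f}, df = {df}")  # t = 2.828, df = 4
```

The returned \( t \) would then be compared against the t distribution with \( df \) degrees of freedom to obtain a p-value.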

One-sample t-test -- Assumptions

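The one-sample t-test assumes that the observations are independent and approximately normally distributed. The normality requirement matters most when the sample is small.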

Independent (Unpaired) two sample t-test

Independent (or unpaired two sample) t-test is used to compare the means of two unrelated groups of samples.

Independent (Unpaired) two sample t-test -- Example

Suppose we have a cohort of 100 individuals (50 women and 50 men).

Suppose the question is whether the average weight of the women is significantly different from that of the men.

In this case, we have two independent groups of samples, and the unpaired t-test can be used to test whether their means are different.

Independent t-test -- Formula

Let \( A \) and \( B \) represent the two groups to compare.
Let \( m_A \) and \( m_B \) represent the means of groups \( A \) and \( B \), respectively.
Let \( n_A \) and \( n_B \) represent the sizes of groups \( A \) and \( B \), respectively.

The t statistic for testing whether the two means differ can be calculated as follows:

\[ t = \frac{m_A - m_B}{\sqrt{\frac{S^2}{n_A} + \frac{S^2}{n_B}}} \]

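\( S^2 \) is an estimator of the common (pooled) variance of the two samples. It can be calculated as follows, where the first sum runs over the values in group \( A \) and the second over the values in group \( B \):

\[ S^2 = \frac{\sum_{x \in A}(x-m_A)^2 + \sum_{x \in B}(x-m_B)^2}{n_A + n_B - 2} \]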
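
As a rough sketch, the pooled variance and the t statistic can be computed as follows. The function name `pooled_t` and the two small groups are hypothetical. The degrees of freedom \( n_A + n_B - 2 \) are not stated on the slide but are the standard value for the equal-variance test.

```python
import math
import statistics


def pooled_t(a, b):
    """Return (t, df) for the equal-variance two-sample t-test."""
    n_a, n_b = len(a), len(b)
    if n_a + n_b < 3:
        raise ValueError("need at least three observations in total")
    m_a, m_b = statistics.mean(a), statistics.mean(b)
    # Sums of squared deviations from each group's own mean
    ss_a = sum((x - m_a) ** 2 for x in a)
    ss_b = sum((x - m_b) ** 2 for x in b)
    s2 = (ss_a + ss_b) / (n_a + n_b - 2)   # pooled variance S^2
    t = (m_a - m_b) / math.sqrt(s2 / n_a + s2 / n_b)
    return t, n_a + n_b - 2


# Hypothetical weights for two independent groups
group_a = [60.0, 62.0, 64.0]
group_b = [70.0, 72.0, 74.0]
t, df = pooled_t(group_a, group_b)
print(f"t = {t:.3f}, df = {df}")  # t = -6.124, df = 4
```

The sign of \( t \) only reflects which group is subtracted from which; swapping the arguments flips it.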

Independent t-test -- Assumptions

The test can be used only when the two groups of samples (A and B) being compared are independent, each approximately normally distributed, and have equal variances.

Paired sample t-test

The paired Student's t-test is used to compare the means of two related samples, that is, when you have two values (a pair of values) for each subject.

Paired sample t-test -- Example

For example, \( 20 \) mice received a treatment \( X \) for \( 3 \) months.

The question is to test whether the treatment \( X \) has an impact on the weight of the mice at the end of the \( 3 \) months of treatment.

The weight of the \( 20 \) mice has been measured before and after the treatment. This gives us \( 20 \) values before treatment and \( 20 \) values after treatment, obtained by weighing the same mice twice.

In this case, the paired t-test can be used as the two sets of values being compared are related.

We have a pair of values for each mouse (one before and the other after treatment).

Paired t-test -- Formula

To compare the means of the two paired sets of data, the differences between all pairs must first be calculated.

The paired t-test is based on the differences between the values of each pair, that is, one value subtracted from the other. In the formula, this difference is denoted \( d \). The test statistic is the sum of the differences divided by the square root of the following quantity: \( n \) times the sum of the squared differences, minus the square of the sum of the differences, all over \( n-1 \).

\[ t = \frac{\sum d}{\sqrt{\frac{n(\sum d^2) - (\sum d)^2}{n-1}}} \]

where \( \sum d \) is the sum of the differences, \( \sum d^2 \) is the sum of the squared differences, and \( n \) is the number of pairs.
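
A minimal Python sketch of the paired statistic follows. It takes the subtracted term to be the square of the sum of the differences, \( (\sum d)^2 \), so the result equals the mean difference divided by its standard error. The function name `paired_t` and the before/after weights are hypothetical. The degrees of freedom \( n - 1 \) are the standard value for this test rather than something given on the slide.

```python
import math


def paired_t(before, after):
    """Return (t, df) for a paired t-test on matched measurements."""
    if len(before) != len(after):
        raise ValueError("before and after must have the same length")
    d = [a - b for b, a in zip(before, after)]   # per-pair differences
    n = len(d)
    if n < 2:
        raise ValueError("need at least two pairs")
    sum_d = sum(d)                    # sum of the differences
    sum_d2 = sum(x * x for x in d)    # sum of the squared differences
    t = sum_d / math.sqrt((n * sum_d2 - sum_d ** 2) / (n - 1))
    return t, n - 1


# Hypothetical weights of four mice before and after treatment
before = [20.0, 22.0, 19.0, 21.0]
after = [22.0, 25.0, 20.0, 25.0]
t, df = paired_t(before, after)
print(f"t = {t:.3f}, df = {df}")  # t = 3.873, df = 3
```

If every difference is identical, the quantity under the square root is zero and the statistic is undefined; this sketch does not handle that case.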

Paired t-test -- Assumptions

The test can be used only when the difference \( d \) is normally distributed.