Two-sample Student's t-test

Abstract

A t-test is any statistical hypothesis test in which the test statistic follows a Student's t-distribution under the null hypothesis. It is most commonly applied when the test statistic would follow a normal distribution if the value of a scaling term in the test statistic were known. The t-test's most common application is to test whether the means of two populations are different: a two-sample location test of the null hypothesis that the means of two populations are equal. All such tests are usually called Student's t-tests.

Use Student's t-test for two samples when you have one measurement variable and one nominal variable, and the nominal variable has only two values. It tests whether the means of the measurement variable are different in the two groups.

The samples for the two-sample t-test should come from a distribution that's close to normal. This condition is called the assumption of normality. Signs that your data do not come from a normal distribution include skewness or unusually fat tails. Use the two-sample t-test when you have one nominal variable and one measurement variable, and you want to compare the mean values of the measurement variable. The nominal variable must have only two values, such as “male” and “female” or “treated” and “untreated.”

There are several statistical tests that use the t-distribution and can be called a t-test. One of the most common is Student's t-test for two samples. Other t-tests include the one-sample t-test, which compares a sample mean to a theoretical mean, and the paired t-test.

Student's t-test for two samples is mathematically identical to a one-way ANOVA with two categories. Because comparing the means of two samples is such a common experimental design, and because the t-test is familiar to many more people than ANOVA, I treat the two-sample t-test separately. The statistical null hypothesis is that the means of the measurement variable are equal for the two categories.

How the test works

The test statistic, ts, is calculated using a formula that has the difference between the means in the numerator; this makes ts get larger as the means get further apart. The denominator is the standard error of the difference in the means, which gets smaller as the sample variances decrease or the sample sizes increase. Thus ts gets larger as the means get farther apart, the variances get smaller, or the sample sizes increase.

You calculate the probability of getting the observed ts value under the null hypothesis using the t-distribution. The shape of the t-distribution, and thus the probability of getting a particular ts value, depends on the number of degrees of freedom. The degrees of freedom for a t–test is the total number of observations in the groups minus 2, or n1+n2−2.
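As a minimal sketch of the calculation just described, the following R code computes the test statistic by hand and looks up its probability under the t-distribution. The two data vectors are hypothetical values chosen only for illustration, and the pooled-variance form of the standard error is assumed (the form that goes with n1+n2−2 degrees of freedom).

```r
# Hypothetical measurements for two groups (illustrative values only)
group1 <- c(5.1, 4.9, 6.2, 5.8, 6.0, 5.5)
group2 <- c(4.2, 4.8, 5.0, 4.4, 5.1)

n1 <- length(group1)
n2 <- length(group2)

# Numerator: difference between the two sample means
mean_diff <- mean(group1) - mean(group2)

# Denominator: standard error of the difference, from the pooled variance
pooled_var <- ((n1 - 1) * var(group1) + (n2 - 1) * var(group2)) / (n1 + n2 - 2)
se_diff <- sqrt(pooled_var * (1 / n1 + 1 / n2))

# Test statistic and degrees of freedom
t_s <- mean_diff / se_diff
deg_free <- n1 + n2 - 2

# Two-sided probability of a value at least this extreme under the null
p_value <- 2 * pt(-abs(t_s), deg_free)

# Cross-check against R's built-in equal-variance t-test
builtin <- t.test(group1, group2, var.equal = TRUE)
```

The hand calculation and `t.test(..., var.equal = TRUE)` should agree, which is a useful check that the formula has been applied correctly.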


Formulas

In the formulas in this section, X̄1 is the mean of the first sample, X̄2 is the mean of the second sample, μ1 is the mean of the first population, μ2 is the mean of the second population, s1 is the standard deviation of the first sample, s2 is the standard deviation of the second sample, n1 is the size of the first sample, and n2 is the size of the second sample. These formulas are used for two independent samples.

Degrees of freedom, df = n1 + n2 – 2

Difference in population means = difference in sample means ± t* × standard error

In the above formula, t* is the critical value of the t-distribution and the standard error is the square-root term.
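One common way to write the statistic and interval described above is sketched here. It assumes that the square-root term is the usual unpooled standard error built from the symbols defined in this section, and it uses t* for the critical value of the t-distribution at the chosen confidence level; both the name SE and the symbol t* are notation introduced for this sketch.

```latex
\[
\mathrm{SE} = \sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}},
\qquad
t = \frac{(\bar{X}_1 - \bar{X}_2) - (\mu_1 - \mu_2)}{\mathrm{SE}},
\qquad
\mu_1 - \mu_2 \;\in\; (\bar{X}_1 - \bar{X}_2) \pm t^{*}\,\mathrm{SE}
\]
```

Under the null hypothesis the term μ1 − μ2 is zero, so the statistic reduces to the difference in sample means divided by the standard error.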

If the two populations' standard deviations are equal, the pooled t-statistic is used, which is based on the pooled standard deviation of the two samples. The following is the formula for the pooled t-statistic.

![Formula for the pooled t-statistic](images/image-1895682331.png)
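For reference, the textbook form of the pooled statistic, written with the symbols defined earlier in this section, is sketched below. The symbol s_p for the pooled standard deviation is notation assumed here, and the statistic is written under the null hypothesis that the population means are equal.

```latex
\[
s_p = \sqrt{\frac{(n_1 - 1)\,s_1^2 + (n_2 - 1)\,s_2^2}{n_1 + n_2 - 2}},
\qquad
t = \frac{\bar{X}_1 - \bar{X}_2}{s_p \sqrt{\dfrac{1}{n_1} + \dfrac{1}{n_2}}},
\qquad
df = n_1 + n_2 - 2
\]
```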

Assuming unequal variances, the test statistic is calculated as:
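The standard unequal-variance (Welch) form of the statistic, written with the same symbols, is sketched here. The degrees-of-freedom approximation (Welch-Satterthwaite) is an addition that the text does not state; it is the approximation commonly used with this statistic.

```latex
\[
t = \frac{\bar{X}_1 - \bar{X}_2}{\sqrt{\dfrac{s_1^2}{n_1} + \dfrac{s_2^2}{n_2}}},
\qquad
df \approx
\frac{\left(\dfrac{s_1^2}{n_1} + \dfrac{s_2^2}{n_2}\right)^{2}}
     {\dfrac{(s_1^2/n_1)^2}{n_1 - 1} + \dfrac{(s_2^2/n_2)^2}{n_2 - 1}}
\]
```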

Real-life applications

Studying Techniques

A professor wants to know if two studying techniques lead to different mean exam scores.

To test this, he assigns 30 students to use one studying technique and 30 students to use a different studying technique in preparation for an exam. He then has each student take the same exam. He can use an independent two-sample t-test to determine if the mean exam score is different between the two groups.

Weight Loss

A dietician wants to know if two different diets lead to different mean weight loss amounts.

To test this, she assigns 20 subjects to use diet A for one month and 20 subjects to use diet B for one month. She then measures the total weight loss of each subject at the end of the month. She can use an independent two sample t-test to determine if the mean weight loss is different between the two groups.

Comparison of two groups

To compare two unmatched groups, use an unpaired t-test. The numbers in the two groups may be different, but even if they are the same a paired test cannot be done unless pairing is justified.

Return to the data shown in Table 22.1, and assume that the experiment compared two separate groups of young rats, one group fed with added raw peanuts, the other with added roasted peanuts. No pairing is done. To do the required calculations, first calculate the mean and standard deviation for each group. These values for the means are: Raw 57.9 g, Roasted 55.9 g, and for the standard deviations are: Raw 5.59 g and Roasted 4.75 g.

The t ratio then becomes the difference between the means divided by the measure of variability. The variability is, however, derived from two separate variabilities, one for each group. The variability of the difference between two groups is greater than that of either one alone, because we have to allow for one mean being unusually low and the other unusually high.
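The reason the combined variability exceeds either group's own can be written out as follows. This sketch assumes the two groups are independent and uses σ1 and σ2 for the population standard deviations of the two groups, symbols introduced here for illustration.

```latex
\[
\operatorname{Var}(\bar{X}_1 - \bar{X}_2)
= \frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}
\;\ge\; \max\!\left(\frac{\sigma_1^2}{n_1},\, \frac{\sigma_2^2}{n_2}\right)
\]
```

Because the variances of the two means add, the standard error of the difference is the square root of the sum of the two squared standard errors.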

Data Analysis Tools

In Excel, you can use the T Test and Non-parametric Equivalents data analysis tool in the Real Statistics Resource Pack to get the same result. To use this tool, press Ctrl-m and select T Tests and Non-parametric Equivalents from the menu (or from the Misc tab if using the Multipage interface). A dialog box will appear. Enter B3:C18 in the Input Range 1 field (or B3:B18 in Input Range 1 and C3:C18 in Input Range 2) and choose the Column headings included with the data, Paired Samples, and T Test options.

Missing Data

In Excel, the input data for the paired-sample t-test can have missing data, indicated by empty cells or cells with non-numeric data. Such cells will be ignored in the analysis.

In this example, there is missing data for subjects 5, 7, and 10. The analysis is rerun with the data for these people removed. Note that some of the formulas have been changed to account for the missing data. E.g. when there is no missing data, cell H27 can contain the simple formula =AVERAGE(B25:B39), but since there is missing data the following formula is used instead:

=SUMPRODUCT(ISNUMBER(B25:B39)*ISNUMBER(C25:C39),B25:B39)/G27

If the input contains missing data, you can change the data values, and even fill in the missing cells with numeric values, and the resulting analysis will still be correct. If the input does not contain any missing data, you can change any of the data values and still get a valid analysis; but if you change a numeric value to a non-numeric value, the analysis will no longer be correct and you will need to rerun the data analysis tool to get the correct results. Note that the mean differences are the same for the paired and independent-sample analyses, but the standard deviation for the paired-sample case is lower, which results in a higher t-stat and a lower p-value. This is generally true when the paired measurements are positively correlated.
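The same idea of ignoring incomplete pairs can be sketched in R. The measurements below are hypothetical, with `NA` standing in for the empty spreadsheet cells, and the variable names are illustrative only; this is not the spreadsheet example from the text.

```r
# Hypothetical paired measurements; NA marks a missing value
pre  <- c(12.1, 11.4, NA,   13.0, 12.7, 10.9, 11.8, 12.3)
post <- c(11.2, 10.9, 11.5, NA,   12.0, 10.1, 11.1, 11.6)

# Keep only the subjects with both measurements present
complete <- complete.cases(pre, post)
n_pairs <- sum(complete)

# Paired t statistic computed by hand from the within-pair differences
d <- pre[complete] - post[complete]
t_manual <- mean(d) / (sd(d) / sqrt(n_pairs))

# The built-in paired test on the same complete pairs
paired_result <- t.test(pre[complete], post[complete], paired = TRUE)
```

Filtering explicitly with `complete.cases` makes it clear which subjects enter the analysis, which mirrors the role of the `ISNUMBER` terms in the spreadsheet formula.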

Problems

Problem 1

In fall 2004, students in the 2 p.m. section of my Biological Data Analysis class had an average height of 66.6 inches, while the average height in the 5 p.m. section was 64.6 inches. Are the average heights of the two sections significantly different? Here are the data:

2 p.m. 5 p.m.
69 68
70 62
66 67
63 68
68 69
70 67
69 61
67 59
62 62
63 61
76 69
59 66
62 62
62 62
75 61
62 70
72  
63  

There is one measurement variable, height, and one nominal variable, class section. The null hypothesis is that the mean heights in the two sections are the same. The results of the t–test (t=1.29, 32 d.f., P=0.21) do not reject the null hypothesis.
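The result quoted above can be checked with a short R sketch. The two vectors hold the heights from the table, and `var.equal = TRUE` is used because 32 degrees of freedom corresponds to the pooled (Student's) form of the test; the variable names are illustrative.

```r
# Heights (inches) from the table above
section_2pm <- c(69, 70, 66, 63, 68, 70, 69, 67, 62, 63,
                 76, 59, 62, 62, 75, 62, 72, 63)
section_5pm <- c(68, 62, 67, 68, 69, 67, 61, 59, 62, 61,
                 69, 66, 62, 62, 61, 70)

# Student's two-sample t-test with pooled variance
height_test <- t.test(section_2pm, section_5pm, var.equal = TRUE)

# t statistic, degrees of freedom and two-sided P value
print(height_test)
```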

Problem 2

One way to measure a person’s fitness is to measure their body fat percentage. Average body fat percentages vary by age, but according to some guidelines, the normal range for men is 15-20% body fat, and the normal range for women is 20-25% body fat.

Our sample data is from a group of men and women who did workouts at a gym three times a week for a year. Then, their trainer measured the body fat. The table below shows the data.

Table 1: Body fat percentage data grouped by gender

Group Body Fat Percentages
Men 13.3, 19.0, 18.0, 15.0, 1.0
Women 22.0, 26.0, 12.0

You can clearly see some overlap in the body fat measurements for the men and women in our sample, but also some differences. Just by looking at the data, it’s hard to draw any solid conclusions about whether the underlying populations of men and women at the gym have the same mean body fat. That is the value of statistical tests – they provide a common, statistically valid way to make decisions, so that everyone makes the same decision on the same set of data values.

Checking the data

Let’s start by answering: Is the two-sample t-test an appropriate method to evaluate the difference in body fat between men and women?

  • The data values are independent. The body fat for any one person does not depend on the body fat for another person.

  • We assume the people measured represent a simple random sample from the population of members of the gym.

  • We assume the data are normally distributed, and we can check this assumption.

  • The data values are body fat measurements. The measurements are continuous.

  • We assume the variances for men and women are equal, and we can check this assumption.

Before jumping into analysis, we should always take a quick look at the data. The figure below shows histograms and summary statistics for the men and women.

Figure 1: Histogram and summary statistics for the body fat data

The two histograms are on the same scale. From a quick look, we can see that there are no very unusual points, or outliers. The data look roughly bell-shaped, so our initial idea of a normal distribution seems reasonable.

Examining the summary statistics, we see that the standard deviations are similar. This supports the idea of equal variances. We can also check this using a test for variances.

Based on these observations, the two-sample t-test appears to be an appropriate method to test for a difference in means.
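As a rough sketch of the variance check mentioned above, an F ratio can be computed from the group standard deviations and sample sizes reported in Table 2 (5.32 with n = 10 for women, 6.84 with n = 13 for men). Working from rounded summary statistics rather than the raw data is an assumption of this sketch, and the variable names are illustrative.

```r
# Summary statistics from Table 2
sd_women <- 5.32
n_women  <- 10
sd_men   <- 6.84
n_men    <- 13

# F ratio: larger sample variance over the smaller one
f_ratio <- sd_men^2 / sd_women^2
df_num  <- n_men - 1
df_den  <- n_women - 1

# Two-sided P value for the hypothesis of equal variances
upper_tail <- pf(f_ratio, df_num, df_den, lower.tail = FALSE)
p_value_f  <- 2 * min(upper_tail, 1 - upper_tail)
```

A P value above the chosen significance level here would be consistent with treating the variances as equal, although the F test is itself sensitive to departures from normality.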

How to perform the two-sample t-test

For each group, we need the average, standard deviation and sample size. These are shown in the table below.

Table 2: Average, standard deviation and sample size statistics grouped by gender

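| Group | Sample size (n) | Average (X-bar) | Standard deviation (s) |
|-------|-----------------|-----------------|------------------------|
| Women | 10 | 22.29 | 5.32 |
| Men | 13 | 14.95 | 6.84 |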

Without doing any testing, we can see that the averages for men and women in our samples are not the same. But how different are they? Are the averages “close enough” for us to conclude that mean body fat is the same for the larger population of men and women at the gym? Or are the averages too different for us to make this conclusion?

Let's proceed through the steps of the two-sample t-test from beginning to end. We start by calculating our test statistic. This calculation begins with finding the difference between the two averages:

$$22.29 - 14.95 = 7.34$$

This difference in our samples estimates the difference between the population means for the two groups.

Next, we calculate the pooled standard deviation. This builds a combined estimate of the overall standard deviation. The estimate adjusts for different group sizes. First, we calculate the pooled variance:

$$s_p^2 = \frac{(n_1 - 1)s_1^2 + (n_2 - 1)s_2^2}{n_1 + n_2 - 2}$$

$$s_p^2 = \frac{(10 - 1)\,5.32^2 + (13 - 1)\,6.84^2}{10 + 13 - 2} = \frac{(9 \times 28.30) + (12 \times 46.82)}{21} = \frac{254.7 + 561.85}{21} = \frac{816.55}{21} = 38.88$$

Next, we take the square root of the pooled variance to get the pooled standard deviation. This is:

$$\sqrt{38.88} = 6.24$$

We now have all the pieces for our test statistic. We have the difference of the averages, the pooled standard deviation and the sample sizes.  We calculate our test statistic as follows:

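$$t = \frac{\text{difference of group averages}}{\text{standard error of difference}} = \frac{7.34}{6.24 \times \sqrt{\frac{1}{10} + \frac{1}{13}}} = \frac{7.34}{2.62} = 2.80$$

$$df = n_1 + n_2 - 2 = 10 + 13 - 2 = 21$$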
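The steps above can be retraced in R from the summary statistics in Table 2. This is a sketch that works from the rounded table values, so intermediate figures may differ slightly in the last digit from those shown; the variable names are illustrative, and the two-sided P value is an extra step not worked out in the text.

```r
# Summary statistics from Table 2
n_women <- 10; mean_women <- 22.29; sd_women <- 5.32
n_men   <- 13; mean_men   <- 14.95; sd_men   <- 6.84

# Step 1: difference between the group averages
diff_means <- mean_women - mean_men

# Step 2: pooled variance and pooled standard deviation
pooled_var <- ((n_women - 1) * sd_women^2 + (n_men - 1) * sd_men^2) /
  (n_women + n_men - 2)
pooled_sd <- sqrt(pooled_var)

# Step 3: standard error of the difference and the test statistic
se_diff <- pooled_sd * sqrt(1 / n_women + 1 / n_men)
t_stat  <- diff_means / se_diff
df_total <- n_women + n_men - 2

# Two-sided P value from the t-distribution with df_total degrees of freedom
p_two_sided <- 2 * pt(-abs(t_stat), df_total)
```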

Problem 3

Consider the gain in weight of 19 female rats between 28 and 84 days after birth. Twelve were fed on a high protein diet and seven on a low protein diet.

High protein Low protein
134 70
146 118
104 101
119 85
124 107
161 132
107 94
83
113
129
97
123

To analyse these data in StatsDirect, first prepare them in two workbook columns and label these columns appropriately. Alternatively, open the test workbook using the file open function of the file menu. Then select the unpaired t test from the parametric methods section of the analysis menu. Select the columns marked “High protein” and “Low protein” when prompted for data.

For this example:

Unpaired t test

Mean of High Protein = 120 (n = 12)

Mean of Low Protein = 101 (n = 7)

Assuming equal variances

Combined standard error = 10.045276

df = 17

t = 1.891436

One sided P = 0.0379

Two sided P = 0.0757

95% confidence interval for difference between means = -2.193679 to 40.193679

Power (for 5% significance) = 82.25%

Assuming unequal variances

Combined standard error = 9.943999

df = 13.081702

t(d) = 1.9107

One sided P = 0.0391

Two sided P = 0.0782

95% confidence interval for difference between means = -1.980004 to 39.980004

Power (for 5% significance) = 40.39%

Comparison of variances

Two sided F test is not significant

No need to assume unequal variances

Thus we have a difference that is not quite significant at the 5% level. The most important information is, however, conveyed by the confidence interval. The 95% CI includes zero; therefore we cannot be confident (at the 95% level) that these data show any difference in weight gain. As most of the interval is toward weight gain, and as the test result is in the grey “suggestive” 5%-10% zone, we have good reason to repeat this experiment with larger numbers. Bigger samples will probably shrink the range of uncertainty so that the confidence interval contracts to a narrower band that excludes zero.

N.B. We did not consider a one sided P value here because we could not be absolutely certain that the rats would all benefit from a high protein diet in comparison with those on a low protein diet.
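The StatsDirect output for this problem can be cross-checked in R. This sketch enters the weight gains from the table and runs both forms of the test; the variable names are illustrative, and the power figures in the output are not reproduced here.

```r
# Weight gains from the table above
high_protein <- c(134, 146, 104, 119, 124, 161, 107, 83, 113, 129, 97, 123)
low_protein  <- c(70, 118, 101, 85, 107, 132, 94)

# Student's t-test, assuming equal variances (pooled standard error)
equal_var <- t.test(high_protein, low_protein, var.equal = TRUE)

# Welch's t-test, not assuming equal variances
welch <- t.test(high_protein, low_protein, var.equal = FALSE)

# Each result holds the t statistic, df, two-sided P value and 95% CI
print(equal_var)
print(welch)
```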

Code for the two-sample t-test in R

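# Load the built-in mtcars data set
data("mtcars")

# Compare fuel economy (mpg) between automatic (am = 0) and manual (am = 1) cars
boxplot(mpg ~ am, data = mtcars)

# Two-sided test of equal means; var.equal = FALSE gives Welch's t-test.
# Set var.equal = TRUE for the pooled (Student's) two-sample t-test.
t.test(mpg ~ am, data = mtcars, mu = 0, conf.level = 0.95,
       alternative = "two.sided", var.equal = FALSE)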

Conclusion

The assumptions underlying the two-sample Student's t-test should not be pre-tested. We have seen several formulas that are used to carry out the test.

Given the results of a two-sample t-test, compare the calculated t value with the critical value for the chosen significance level to draw a conclusion in context about the difference between the two means.

If the absolute value of your calculated t is greater than the critical t value from the table, you can conclude that the means of the two groups are significantly different. In that case, we reject the null hypothesis in favor of the alternative hypothesis.
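The comparison with a critical value can be sketched in R, where `qt` plays the role of the printed t table. The helper `exceeds_critical` is a hypothetical function written for this illustration, and a two-sided test at the 5% level is assumed by default; the two calls use the t values and degrees of freedom from Problems 2 and 3.

```r
# Hypothetical helper: is |t| beyond the two-sided critical value?
exceeds_critical <- function(t_value, deg_free, alpha = 0.05) {
  # Critical value that leaves alpha/2 in each tail of the t-distribution
  critical <- qt(1 - alpha / 2, deg_free)
  abs(t_value) > critical
}

# Problem 2: t = 2.80 with 21 degrees of freedom
body_fat_significant <- exceeds_critical(2.80, 21)

# Problem 3 (equal variances): t = 1.891436 with 17 degrees of freedom
rats_significant <- exceeds_critical(1.891436, 17)
```

With these inputs the body fat comparison exceeds its critical value and the rat weight comparison does not, which matches the two-sided P of 0.0757 reported for Problem 3.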
