We know that very large samples tend to be distributed normally (that is, they follow a normal distribution). The predictability of the proportions of the normal distribution makes the Z-test a powerful one. (Again, a test with statistical “power” has a good ability to discern differences or relationships between groups.) The problem with the Z-test is what it requires: in particular, we must know the parametric standard deviation (\(\sigma\)) for it to work.


The t test

In 1908, William Sealy Gosset, a statistician working for the Guinness Brewing Company, published (under the pen name “Student”) a test for small samples drawn from approximately normally distributed populations. The test and its distribution have come to be called Student’s t. Its formula is


\(t=\frac{\overline{Y}-\mu}{s_{\overline{Y}}}\), where \(s_{\overline{Y}}=\frac{s}{\sqrt{n}}\) is the standard error of the mean


It’s got the basic layout of Z: a sample mean compared to a parametric or comparison mean, divided by a measure of dispersion. Here, though, the parametric \(\sigma\) is unknown, so the standard error is estimated from the sample’s standard deviation. This gives a normal-ish looking distribution, but the proportions are a little different in the tails, and the shape of the distribution changes as the sample size increases. At sufficiently large sample sizes (n > 1000), the distribution of t very closely approximates the shape of a normal distribution.
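You can watch that convergence directly in R. This is just a quick sketch (the df values below are arbitrary choices for illustration): the two-tailed 5% critical value of t shrinks toward the normal’s 1.96 as the degrees of freedom grow.

# Two-tailed 5% critical values of t for increasing degrees of freedom
round(qt(0.975, df = c(2, 5, 10, 30, 100, 1000)), 3)
## [1] 4.303 2.571 2.228 2.042 1.984 1.962

# Compare to the critical value for the standard normal (Z)
round(qnorm(0.975), 3)
## [1] 1.96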

Another difference between the t distribution and Z is that we have to keep track of the degrees of freedom. For a one-sample t test, df = n − 1.


This test has assumptions:

The data are a random sample of independent observations.
The variable is measured on a continuous (interval or ratio) scale.
The population the sample comes from is approximately normally distributed.


Using the t test

Here is an example: Imagine I observed a sample of intertidal crabs, and I want to know if their internal temperature at low tide is different from the temperature of the air at that time (24.3 °C). Let \(\alpha\) = 0.05.


\(H_0\): The crabs’ temperature is the same as the air’s (\(\mu\) = 24.3 °C).
\(H_a\): The crabs’ temperature is different from the air’s (\(\mu\) ≠ 24.3 °C).

# Copy this to make a vector of the data
crabs <- c(25.8, 24.6, 26.1, 22.9, 25.1, 27.3, 24.0, 24.5, 23.9, 26.2, 24.3, 24.6, 23.3, 25.5, 28.1, 24.8, 23.5, 26.3, 25.4, 25.5, 23.9, 27.0, 24.8, 22.9, 25.4)
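Before running the test, it’s worth glancing at the sample statistics that feed the formula. (A sketch; the values in the comments are what you should see, to within rounding.)

# The pieces of the t formula: n, the sample mean, and the sample SD
length(crabs)   # n = 25
mean(crabs)     # about 25.028
sd(crabs)       # about 1.342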


# Do the test and specify that the population mean is 24.3
t.test(crabs, mu = 24.3)
## 
##  One Sample t-test
## 
## data:  crabs
## t = 2.7128, df = 24, p-value = 0.01215
## alternative hypothesis: true mean is not equal to 24.3
## 95 percent confidence interval:
##  24.47413 25.58187
## sample estimates:
## mean of x 
##    25.028

So, we calculated t = 2.7128 with 24 degrees of freedom. The two-tailed critical value at df = 24 (from Statistical Table C) is \(t_{0.05(2),\:24}\) = 2.064. Because our calculated t exceeds this critical value, we can reject the null hypothesis. The output agrees: P = 0.01215, which is less than our \(\alpha\) of 0.05.
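If you want to check R’s arithmetic against the formula above, here’s a sketch by hand. (The object names se and t_calc are just ones I made up for this illustration.)

# Recompute t from the formula: (sample mean - mu) / standard error
se <- sd(crabs) / sqrt(length(crabs))
t_calc <- (mean(crabs) - 24.3) / se
t_calc                                        # about 2.7128

# The two-tailed critical value and P-value at alpha = 0.05, df = 24
qt(0.975, df = 24)                            # about 2.064
2 * pt(t_calc, df = 24, lower.tail = FALSE)   # about 0.01215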

Notice that the output includes the 95% CI for the population mean by default, and that 24.3 °C is not in that range. That’s another way of seeing the same rejection of \(H_0\).
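That interval is easy to reconstruct by hand, too: it’s the sample mean plus or minus the critical t times the standard error (a sketch, reusing se from above).

# 95% CI for the population mean
mean(crabs) + c(-1, 1) * qt(0.975, df = 24) * se   # about 24.474 to 25.582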


So our sample mean = 25.028 °C. What if we wanted to know if the crabs were hotter than the air temperature, not just different? Then we’d do this:

\(H_0\): The crabs’ temperature is lower than or the same as the air’s (\(\mu \le\) 24.3 °C).
\(H_a\): The crabs’ temperature is higher than the air’s (\(\mu >\) 24.3 °C).

# Use the crab vector from before

# Do the test and specify that the population mean is 24.3
# Also specify the tailedness with the argument alternative = "greater" or "less".
# You can specify just the initial letter.

t.test(crabs, mu = 24.3, alternative = "g")
## 
##  One Sample t-test
## 
## data:  crabs
## t = 2.7128, df = 24, p-value = 0.006073
## alternative hypothesis: true mean is greater than 24.3
## 95 percent confidence interval:
##  24.56887      Inf
## sample estimates:
## mean of x 
##    25.028

Why so much more significant? We’ve put our whole area of rejection into the right tail instead of splitting it up between the two, so the one-tailed P-value is exactly half of the two-tailed one. The critical value from the table is way lower, too: \(t_{\alpha\left(1\right),\:24} = 1.711\).
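Both numbers are easy to verify (again a sketch, reusing t_calc from earlier):

# One-tailed critical value and P-value at alpha = 0.05, df = 24
qt(0.95, df = 24)                          # about 1.711
pt(t_calc, df = 24, lower.tail = FALSE)    # about 0.006073, half the two-tailed P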


That’s it!

Next time we’ll work through the process for looking for differences between two sample means.