2024-04-04

T-Testing

Hypothesis testing uses statistics to check a claim about a population against sample data. In a one-sample t-test, we compare the mean of our sample to a hypothesized mean and ask whether the difference is statistically significant.

In R this means using the t.test() function, and I will demonstrate how to do so in these slides.

Data Set “chickwts”

I will use a simple Base R data set called chickwts to demonstrate a T-Test in R.

Here is the head of the chickwts data set. In the full data set we have chicken weights based on six different “feed” options.

##   weight      feed
## 1    179 horsebean
## 2    160 horsebean
## 3    136 horsebean
## 4    227 horsebean
## 5    217 horsebean
## 6    168 horsebean
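The listing above can be reproduced with base R alone, since chickwts ships with the datasets package. A minimal sketch (the name feed_counts is only illustrative):

```r
# chickwts is a built-in data set (datasets package, attached by default)
head(chickwts)

# Number of chicks per feed type
feed_counts <- table(chickwts$feed)
print(feed_counts)

# The six feed options
levels(chickwts$feed)
```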

We can use visualizations in R to start

ggplotly() lets us toggle between feed options (try it)
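The interactive plot itself is not reproduced here. One way to build a comparable one, assuming the ggplot2 and plotly packages are installed (the box plot and the object names are assumptions for illustration, not necessarily the chart from the original slide):

```r
library(ggplot2)
library(plotly)

# Box plot of weight by feed; each feed gets its own legend entry
p <- ggplot(chickwts, aes(x = feed, y = weight, fill = feed)) +
  geom_boxplot()

# ggplotly() converts the ggplot into an interactive plotly widget;
# clicking a legend entry hides or shows that feed
interactive_plot <- ggplotly(p)

# Only open the widget when running interactively
if (interactive()) print(interactive_plot)
```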

Pretend we are using soybeans

But sunflower looks like a better option
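The visual impression can be checked numerically. A short sketch using base R's tapply() to get the mean weight per feed (feed_means is a name chosen for this example):

```r
# Mean chick weight for each feed option
feed_means <- tapply(chickwts$weight, chickwts$feed, mean)

# Sort from heaviest to lightest to compare feeds at a glance
print(round(sort(feed_means, decreasing = TRUE), 1))

# Compare the two feeds discussed here
feed_means[["sunflower"]] > feed_means[["soybean"]]
```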

Back to the T-Test, what is that?

This is the formula for a one sample t-test where we compare the mean of a sample to our hypothesized mean:

\[ t = \frac{\bar{x}-\mu}{\frac{s}{\sqrt{n}}} \] where \(\bar{x}\) is the sample mean, \(\mu\) is the mean we want to test against, \(s\) is the sample standard deviation, and \(n\) is the number of observations.

This is the relevant formula for the t test we will be performing.
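To see the formula at work, the statistic can be computed by hand. This sketch assumes the sample is the sunflower group and that the hypothesized mean is the soybean group's mean weight. sd() returns the sample standard deviation, which is the standard deviation term in the formula. All variable names are illustrative.

```r
# Sample: weights of sunflower-fed chicks
x <- chickwts$weight[chickwts$feed == "sunflower"]

# Assumed hypothesized mean: average weight of soybean-fed chicks
mu <- mean(chickwts$weight[chickwts$feed == "soybean"])

x_bar <- mean(x)    # sample mean
s     <- sd(x)      # sample standard deviation
n     <- length(x)  # number of observations

# t = (x_bar - mu) / (s / sqrt(n))
t_manual <- (x_bar - mu) / (s / sqrt(n))
print(t_manual)
```

The result should agree with the t value that t.test() reports for the same inputs.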

How to set up t.test()

We need to put our parameters in the t.test() function.

We know the average weight of our own chickens, which eat soybeans, is about 246 units, saved as soy_ave. This is our mu: the hypothesized mean that we test the other sample against.

Because we want to know whether the other sample's mean is greater than ours, we will pass “greater” to the alternative argument.

t.test(mu = soy_ave, alternative = "greater")

We are looking for a p-value under .05 (5%). That is to say, if the true mean were equal to mu, we would see a sample mean this large less than 5% of the time.
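For a “greater” alternative, the p-value is the upper-tail area of the t distribution with n - 1 degrees of freedom beyond the observed statistic. A sketch of that calculation with pt(), assuming soy_ave is the soybean group's mean weight (the other names are illustrative):

```r
x <- chickwts$weight[chickwts$feed == "sunflower"]
# Assumption: soy_ave is the soybean group's mean weight
soy_ave <- mean(chickwts$weight[chickwts$feed == "soybean"])

t_stat <- (mean(x) - soy_ave) / (sd(x) / sqrt(length(x)))
df <- length(x) - 1

# Upper-tail probability: P(T >= t_stat) when the true mean equals soy_ave
p_manual <- pt(t_stat, df = df, lower.tail = FALSE)
print(p_manual)

# Decision at the 5% level
p_manual < 0.05
```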

Now let's use our t.test function

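library(dplyr)  # provides %>%, filter(), and pull()
# Assumption: soy_ave is the soybean group's mean weight (246.4286)
soy_ave <- mean(chickwts$weight[chickwts$feed == "soybean"])

t_sunflower <- chickwts %>%
  filter(feed == "sunflower") %>%
  pull(weight) %>%  # numeric vector of sunflower weights
  t.test(mu = soy_ave, alternative = "greater")
print(t_sunflower)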

## 
##  One Sample t-test
## 
## data:  .
## t = 5.8511, df = 11, p-value = 5.536e-05
## alternative hypothesis: true mean is greater than 246.4286
## 95 percent confidence interval:
##  303.5986      Inf
## sample estimates:
## mean of x 
##  328.9167
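The printed summary is an object of class "htest", so individual numbers can be pulled out by name instead of read off the printout. A sketch in base R, assuming soy_ave is the soybean group's mean weight (the object name res is illustrative):

```r
sunflower <- chickwts$weight[chickwts$feed == "sunflower"]
# Assumption: soy_ave is the soybean group's mean weight
soy_ave <- mean(chickwts$weight[chickwts$feed == "soybean"])

res <- t.test(sunflower, mu = soy_ave, alternative = "greater")

res$statistic  # t value
res$parameter  # degrees of freedom
res$p.value    # p-value
res$conf.int   # one-sided 95% confidence interval (upper bound is Inf)
res$estimate   # sample mean
```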

CONCLUSION

Our p-value is 5.536e-05, that is to say very small. So our test “passed” in the sense that we reject the null hypothesis: the data support the alternative that the mean weight of sunflower-fed chickens IS greater than our mu. If we want to grow larger chickens, we should switch to sunflower as a feed option.

I hope this was helpful.

Please note: This presentation is a homework assignment.

If you have any thoughts or feedback please contact me at varevill@asu.edu