DAT301 HW 3

2026-03-06

P-Values

In hypothesis testing, scientists often aim to support their hypothesis by rejecting a “null hypothesis” using the data they collect.
The null hypothesis typically assumes there is no relationship between the variables being tested, while the alternative hypothesis assumes that there is a relationship.
A p-value, or probability value, is the probability of obtaining results at least as extreme as those observed in a given dataset, assuming that the null hypothesis is true: \[P = \Pr(\text{results at least as extreme as those observed} \mid \text{null hypothesis is true})\]
If that probability is below a chosen significance cutoff (e.g. 0.05 or 0.01), the result is considered statistically significant, and we reject the null hypothesis in favor of the alternative.
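
As a small illustration of this definition, the sketch below computes a p-value by hand and with a built-in test. The coin-flip scenario (60 heads in 100 flips) and all variable names are hypothetical and are not part of the iris example used later.

```r
# Hypothetical example: 60 heads in 100 flips of a coin.
# Null hypothesis: the coin is fair (probability of heads = 0.5).
n_flips <- 100
n_heads <- 60

# Two-sided p-value: the probability, under the null, of a head count at
# least as far from the expected 50 as the observed 60 (<= 40 or >= 60).
p_manual <- sum(dbinom(c(0:40, 60:100), size = n_flips, prob = 0.5))

# The built-in exact binomial test should report the same value.
p_builtin <- binom.test(n_heads, n_flips, p = 0.5)$p.value

# Compare the p-value against the chosen significance cutoff.
alpha <- 0.05
reject_null <- p_builtin < alpha
```

Here `reject_null` is TRUE only when the p-value falls below the cutoff `alpha`.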

Statistical Tests

There is no single equation for calculating a p-value. The calculation depends on the statistical test being conducted.
Some common statistical tests include:
- Chi-square for categorical data
- T-test for comparing means across 2 groups with normally distributed data
- ANOVA for comparing means across 3+ groups with normally distributed data
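
Each of the tests listed above has a counterpart in base R. The sketch below shows one possible call for each; the choice of the built-in `mtcars` and `iris` datasets and of these particular columns is an assumption made purely for illustration.

```r
# Chi-square test: two categorical variables
# (transmission type vs. engine shape in the built-in mtcars data).
chisq_result <- chisq.test(table(mtcars$am, mtcars$vs))

# t-test: mean petal length in two iris species.
t_result <- t.test(
  iris$Petal.Length[iris$Species == "setosa"],
  iris$Petal.Length[iris$Species == "versicolor"]
)

# ANOVA: mean petal length across all three iris species.
aov_result <- aov(Petal.Length ~ Species, data = iris)
anova_p <- summary(aov_result)[[1]][["Pr(>F)"]][1]

# Each test yields its own p-value.
p_values <- c(chisq = chisq_result$p.value,
              t_test = t_result$p.value,
              anova = anova_p)
```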

Usage Example

In this example, we will use a t-test, and the base R dataset ‘iris’.
For simplicity, we will compare petal length between two iris species: setosa and versicolor.
First, we can extract the petal lengths for just our two species of interest:

# Petal lengths (in cm) for each of the two species being compared
setosa <- iris$Petal.Length[iris$Species == "setosa"]
versicolor <- iris$Petal.Length[iris$Species == "versicolor"]

Iris Petal Length

A t-test asks: do these two groups have different means?
In this case: do setosa and versicolor irises have different mean petal lengths?
In the graph below, we see that they have very different mean petal lengths. But is the difference statistically significant?
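
One way to draw such a comparison is a base R boxplot. This is a stand-in sketch and not necessarily the plot used in the original slides; the object names `two_species` and `group_means` are assumptions.

```r
# Keep only the two species of interest and drop the unused factor level.
two_species <- droplevels(subset(iris, Species %in% c("setosa", "versicolor")))

# Side-by-side boxplots of petal length for each species.
boxplot(Petal.Length ~ Species, data = two_species,
        ylab = "Petal length (cm)",
        main = "Iris petal length by species")

# Mean petal length per species, for reference alongside the plot.
group_means <- tapply(two_species$Petal.Length, two_species$Species, mean)
```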

Data Distribution

We can also view the data using histograms, and check whether each group is approximately normally distributed, as the t-test assumes.
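
A minimal sketch of this check in base R follows. The histograms are a stand-in for the original figure, and the Shapiro-Wilk test is an addition that does not appear in the original slides; it is one common numerical check of normality.

```r
setosa <- iris$Petal.Length[iris$Species == "setosa"]
versicolor <- iris$Petal.Length[iris$Species == "versicolor"]

# Draw the two histograms side by side, then restore the plot layout.
old_par <- par(mfrow = c(1, 2))
hist(setosa, main = "setosa", xlab = "Petal length (cm)")
hist(versicolor, main = "versicolor", xlab = "Petal length (cm)")
par(old_par)

# Shapiro-Wilk test (an assumption, not in the original):
# the null hypothesis is that the sample comes from a normal distribution.
shapiro_setosa <- shapiro.test(setosa)
shapiro_versicolor <- shapiro.test(versicolor)
```

A small p-value from `shapiro.test` would suggest a departure from normality, so here a large p-value is the reassuring outcome.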

T-test

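This is the equation for the t-statistic in Welch's t-test, the version that R's t.test function uses by default: \[t = \frac{\bar{x}_1 - \bar{x}_2}{\sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}}}\]
Here \(\bar{x}_i\), \(s_i^2\), and \(n_i\) are the sample mean, sample variance, and sample size of group \(i\).
The statistic measures how many standard errors apart the two sample means are.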
The null hypothesis is that the mean lengths are the same, while the alternative is that the mean lengths are different.
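
To make the equation concrete, the sketch below evaluates it term by term on the iris data. The variable names (`n1`, `std_error`, `t_manual`, and so on) are chosen here for illustration.

```r
setosa <- iris$Petal.Length[iris$Species == "setosa"]
versicolor <- iris$Petal.Length[iris$Species == "versicolor"]

# Sample sizes n_1 and n_2.
n1 <- length(setosa)
n2 <- length(versicolor)

# Denominator: standard error of the difference in means,
# sqrt(s_1^2 / n_1 + s_2^2 / n_2).
std_error <- sqrt(var(setosa) / n1 + var(versicolor) / n2)

# Numerator over denominator: the t-statistic.
t_manual <- (mean(setosa) - mean(versicolor)) / std_error
```

The value of `t_manual` should agree with the t-statistic that `t.test(setosa, versicolor)` reports.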

T-test on Iris Petal Length Data

We can run a t-test in R using the following function:

t.test(setosa, versicolor)

## 
##  Welch Two Sample t-test
## 
## data:  setosa and versicolor
## t = -39.493, df = 62.14, p-value < 2.2e-16
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
##  -2.939618 -2.656382
## sample estimates:
## mean of x mean of y 
##     1.462     4.260
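
The printed summary can also be stored and queried piece by piece, which is handy when the numbers are needed later. A minimal sketch, with `result` as an assumed name for the stored test object:

```r
setosa <- iris$Petal.Length[iris$Species == "setosa"]
versicolor <- iris$Petal.Length[iris$Species == "versicolor"]

# Store the test result instead of only printing it.
result <- t.test(setosa, versicolor)

t_stat <- unname(result$statistic)   # t-statistic
deg_free <- unname(result$parameter) # Welch degrees of freedom
ci <- as.numeric(result$conf.int)    # 95% CI for the difference in means
means <- unname(result$estimate)     # the two sample means
```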

The reported p-value is less than 2.2e-16 (0.00000000000000022), which is the smallest value R prints by default, and far below our cutoff of 0.05.
So the difference in petal length is statistically significant: if the null hypothesis were true, the chance of observing a difference at least this large would be less than 0.000000000000022%.
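
The printed output only gives an upper bound on the p-value, but the underlying number is stored in the test result. The sketch below retrieves it and recomputes it from the t-distribution; the variable names are assumptions.

```r
setosa <- iris$Petal.Length[iris$Species == "setosa"]
versicolor <- iris$Petal.Length[iris$Species == "versicolor"]
result <- t.test(setosa, versicolor)

# The stored p-value, without the "< 2.2e-16" display threshold.
p_stored <- result$p.value

# Two-sided p-value recomputed from the t-distribution:
# twice the area in the tail beyond |t|.
t_stat <- unname(result$statistic)
deg_free <- unname(result$parameter)
p_manual <- 2 * pt(-abs(t_stat), df = deg_free)

# The display threshold is R's machine epsilon, about 2.2e-16.
display_threshold <- .Machine$double.eps
```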

Visualizing the T-test

We can graph the t-distribution to visualize how our test works:

In the graph, a t-statistic of 0 means that there is no difference between the two sample means, and any t-statistic falling within the red range has a p-value < 0.05.
Our t-statistic is -39.49, far past the cutoff of about -2, which is why the p-value is so low and our test shows very strong statistical significance.
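
A picture like the one described can be sketched in base R as follows. This is a stand-in and not necessarily how the original figure was produced; the degrees of freedom are taken from the t-test output, and the plotting range of -5 to 5 is an arbitrary choice.

```r
deg_free <- 62.14  # degrees of freedom reported in the t-test output
alpha <- 0.05

# Two-sided critical value: the cutoff beyond which p < 0.05 (roughly 2).
t_crit <- qt(1 - alpha / 2, df = deg_free)

# Density of the t-distribution under the null hypothesis.
curve(dt(x, df = deg_free), from = -5, to = 5,
      xlab = "t-statistic", ylab = "Density",
      main = "t-distribution under the null hypothesis")

# Shade both rejection regions in red.
left <- seq(-5, -t_crit, length.out = 100)
right <- seq(t_crit, 5, length.out = 100)
polygon(c(left, rev(left)), c(dt(left, deg_free), rep(0, 100)),
        col = "red", border = NA)
polygon(c(right, rev(right)), c(dt(right, deg_free), rep(0, 100)),
        col = "red", border = NA)

# Dashed line at t = 0, where the two sample means are equal.
abline(v = 0, lty = 2)
```

The observed t-statistic of -39.49 lies far to the left of this plotting range, well inside the left rejection region.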
