- Meaning and Uses
- Use when comparing two data sets
- Effects of sample size on P-values
If the red line on the previous slide marks the cutoff corresponding to \(\alpha=0.05\), a common threshold for rejecting or failing to reject the null hypothesis, the area under the curve to its right represents the probability, assuming the null hypothesis is true, of obtaining a result at least that extreme by chance alone.
\[ t=(\overline{x}-\mu)/(s/\sqrt{n}) \] where \(\overline{x}\)=sample mean, \(\mu\)=hypothesized mean, \(s\)=sample standard deviation and \(n\)=sample size. The t value must then be looked up in a table, using \(n-1\) degrees of freedom, to find the p-value.
Instead of using a table, the p-value can also be calculated in R with:
pt(q, df, lower.tail = TRUE)
where q is the t value, df is the degrees of freedom, and lower.tail = TRUE/FALSE selects which tail of the distribution is returned. TRUE gives the probability of a t value at or below q (testing whether the sample mean is less than the hypothesized mean), and FALSE gives the probability of a t value above q (testing whether it is greater). To find the p-value for any difference between the two means, positive or negative, use the absolute value of t, set lower.tail = FALSE, and multiply the result by 2.
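The formula and the pt() call can be combined in a few lines. The following is a minimal sketch: the helper name t_p_value, its arguments, and the example numbers are hypothetical and chosen only for illustration.

```r
# Hypothetical helper (not from the slides): one-sample t statistic and
# two-sided p-value computed from summary statistics.
t_p_value <- function(xbar, mu, s, n) {
  t_stat <- (xbar - mu) / (s / sqrt(n))  # t = (x-bar - mu) / (s / sqrt(n))
  df <- n - 1                            # degrees of freedom for a one-sample test
  # Two-sided p-value: area in the upper tail beyond |t|, doubled
  p <- 2 * pt(abs(t_stat), df, lower.tail = FALSE)
  list(t = t_stat, df = df, p = p)
}

# Made-up example values, for illustration only
res <- t_p_value(xbar = 10.5, mu = 10, s = 1, n = 12)
res$t
res$p
```

Taking abs(t_stat) before calling pt() means the same two lines work whether the sample mean is above or below the hypothesized mean.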
P-values are useful because graphs often do not show whether a difference is significant. For example, if we take the data set USArrests and arbitrarily split the states into two groups alphabetically (Alabama-Missouri and Montana-Wyoming), is there a difference in the murder arrest rate (arrests per 100,000 residents) between the two groups?
First, we can examine the data visually with box plots.
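A plot of this kind could be drawn roughly as follows. This is a sketch, not the original plotting code: the column name group and the axis labels are assumptions.

```r
# USArrests ships with base R; its rows are in alphabetical order by state.
arrests <- USArrests

# First 25 states (Alabama-Missouri) vs. last 25 (Montana-Wyoming).
# The "group" column is a hypothetical name added for this sketch.
arrests$group <- factor(rep(c("AL-MO", "MT-WY"), each = 25))

# Horizontal box plots of murder arrests per 100,000 residents.
# By default, the thick line inside each box drawn by boxplot() is the median.
boxplot(Murder ~ group, data = arrests, horizontal = TRUE,
        xlab = "Murder arrests per 100,000", ylab = "Group")
```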
The means are different (marked by a vertical black line), so it is tempting to assume there is a significant difference between the two data sets. However, it is not visually obvious from the previous graph whether the difference is significant, which is why finding the p-value is helpful and necessary.
To manually calculate the t-test value, we revisit the equation \(t=(\overline{x}-\mu)/(s/\sqrt{n})\).
\(\overline{x} =\) mean of AL-MO \(= 8.62\)
\(\mu =\) mean of MT-WY \(= 6.956\)
\(s =\) standard deviation of MT-WY \(= 4.051345\)
\(n =\) sample size \(= 25\)
\[ t = (8.62-6.956)/(4.051345/\sqrt{25}) = 2.053639 \]
## [1] 0.05106088
Using code, we see that the corresponding p-value is 0.051, meaning that if there were truly no difference, a difference at least this large would still occur about 5.1% of the time by chance. This is higher than the common cutoff of \(\alpha = 0.05\), so the null hypothesis would not be rejected. This makes sense because the two groups were chosen alphabetically, a method that has nothing to do with any variable that might actually be correlated with murder rates. If you looked only at the difference in the means, though, you might be tempted to conclude that there is a meaningful difference between the two data sets.
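The manual calculation and the p-value can be reproduced from the data set. The slides do not show the code that was used, so this is one possible version; the variable names are assumptions.

```r
# Murder arrests per 100,000 residents, states in alphabetical order
murder <- USArrests$Murder
al_mo <- murder[1:25]    # Alabama-Missouri
mt_wy <- murder[26:50]   # Montana-Wyoming

xbar <- mean(al_mo)      # 8.62
mu   <- mean(mt_wy)      # 6.956, treated here as the hypothesized mean
s    <- sd(mt_wy)        # 4.051345, the standard deviation used in the slides
n    <- 25

t_stat  <- (xbar - mu) / (s / sqrt(n))
# Two-sided p-value with n - 1 = 24 degrees of freedom
p_value <- 2 * pt(abs(t_stat), df = n - 1, lower.tail = FALSE)

t_stat    # 2.053639
p_value   # 0.05106088
```

This follows the one-sample formula from the slides, with the MT-WY mean standing in for the hypothesized mean. R's built-in t.test(al_mo, mt_wy) instead runs a two-sample (Welch) test that uses both groups' standard deviations, so its result will differ somewhat from the value above.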
Something to keep in mind is that the size of a data set affects the p-value. If two data sets with the same mean and standard deviation but different numbers of samples are each compared to a hypothesized mean of 10, they will return two different p-values. In general, for the same observed difference, a larger data set gives a smaller standard error \(s/\sqrt{n}\), a larger t value, and therefore a lower p-value, so you can be more confident in the result.
If we create and plot two data sets that have the same mean and standard deviation, visually they look like they would have the same p-value when compared with a hypothesized mean of 10.
But when you compare the calculated p-values, the p-value of the data set with n=12 is 0.111173 and the p-value of the data set with n=120 is \(2.446288 \times 10^{-7}\). The first value would rarely be considered significant, and the second would very often be considered significant. The takeaway is that, when the sample mean and standard deviation stay the same, adding more data points decreases the p-value.
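The effect of sample size can be sketched directly from the t formula. The data behind the plots are not shown, so the sample mean and standard deviation below are hypothetical values chosen for illustration; only the hypothesized mean of 10 and the two sample sizes come from the text.

```r
# Hypothetical summary statistics (not the data behind the slides)
xbar <- 10.5   # assumed sample mean
s    <- 1      # assumed sample standard deviation
mu   <- 10     # hypothesized mean, as in the text

# Two-sided p-value for a given sample size, holding xbar and s fixed
p_for_n <- function(n) {
  t_stat <- (xbar - mu) / (s / sqrt(n))
  2 * pt(abs(t_stat), df = n - 1, lower.tail = FALSE)
}

p_small <- p_for_n(12)
p_large <- p_for_n(120)
c(n12 = p_small, n120 = p_large)
```

Because \(n\) appears only in the standard error \(s/\sqrt{n}\) and in the degrees of freedom, increasing it from 12 to 120 raises the t value by a factor of about \(\sqrt{10}\) while everything else stays fixed.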