P-Values: Importance and Consideration

  • Meaning and Uses
  • Use when comparing two data sets
  • Effects of sample size on P-values

Null Hypothesis

  • The null hypothesis represents the statement that there is NO relationship between the variables being studied.
  • When using p-values the goal is to decide whether to reject or fail to reject the null hypothesis
  • P-values are never conclusive; they simply give a probability that must be interpreted

Definition

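  • The p-value is the probability of obtaining a result at least as extreme as the one observed if the null hypothesis is true, that is, if the two variables are unrelated and only chance is at work (the tail area of the null distribution beyond the observed result). It is not the probability that the null hypothesis is true.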
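A small simulation can make this definition concrete. The sketch below assumes a hypothetical observed t value of 2.0 from a sample of n = 25 (neither number comes from the slides), draws many samples for which the null hypothesis is true, and counts how often chance alone gives a t value at least as extreme.

```r
set.seed(1)              # fixed seed so the run is repeatable
n <- 25                  # hypothetical sample size
t_observed <- 2.0        # hypothetical observed t value

# t statistics from 20,000 samples drawn with the null hypothesis true (mean 0)
t_null <- replicate(20000, {
  x <- rnorm(n)
  mean(x) / (sd(x) / sqrt(n))
})

# share of null results at least as extreme as the observed one (two-sided)
p_simulated <- mean(abs(t_null) >= abs(t_observed))

# the same tail area from the t distribution with n - 1 degrees of freedom
p_exact <- 2 * pt(abs(t_observed), df = n - 1, lower.tail = FALSE)

print(c(simulated = p_simulated, exact = p_exact))
```

The two numbers should agree closely, which is the sense in which a p-value is a tail probability computed under the null hypothesis.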

If the red line on the previous slide represents \(\alpha=0.05\), a common cutoff for rejecting or not rejecting the null hypothesis, the area under the curve to the right of the line is the probability, assuming the null hypothesis is true, of getting a result at least that extreme by chance alone.
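The curve from the slide is not reproduced here. As a rough sketch, assuming it is a standard normal curve with a one-sided cutoff (both are assumptions), the position of the cutoff and the area to its right can be checked in R:

```r
alpha <- 0.05

# cutoff with 5% of a standard normal curve to its right (the assumed red line)
cutoff <- qnorm(alpha, lower.tail = FALSE)

# area under the curve to the right of the cutoff; this recovers alpha
area_right <- pnorm(cutoff, lower.tail = FALSE)

print(c(cutoff = cutoff, area_right = area_right))
```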

Calculating P-values: t-test

  • One way to calculate a p-value is a t-test, which compares a sample mean with a hypothesized mean or compares the means of two groups.
  • To calculate the one-sample version by hand, the equation is:

\[ t=(\overline{x}-\mu)/(s/\sqrt{n}) \] where \(\overline{x}\)=sample mean, \(\mu\)=hypothesized mean, \(s\)=sample standard deviation and \(n\)=sample size. The t value must then be looked up in a t-distribution table with \(n-1\) degrees of freedom to find the p-value.

  • Instead of using a table, the p-value can also be calculated in R with:

  pt(q, df, lower.tail = TRUE)

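where q is the t value, df is the degrees of freedom, and lower.tail selects which tail of the distribution is returned. lower.tail = TRUE gives the probability of a t value at or below q, which tests whether the sample mean is less than the hypothesized mean; lower.tail = FALSE gives the probability of a t value above q, which tests whether it is greater. To find the p-value for any difference between the two means, positive or negative, pass the absolute value of t, set lower.tail = FALSE, and multiply by 2.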
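The three cases above can be wrapped in one small function. This is a minimal sketch: p_from_t is a hypothetical helper name, not a built-in R function, and the data vector is made up for illustration. The result is checked against R's built-in t.test().

```r
# p_from_t is a hypothetical helper, not part of R
p_from_t <- function(x, mu, alternative = c("two.sided", "less", "greater")) {
  alternative <- match.arg(alternative)
  n <- length(x)
  t_value <- (mean(x) - mu) / (sd(x) / sqrt(n))   # one-sample t statistic
  df <- n - 1                                     # degrees of freedom
  switch(alternative,
    less      = pt(t_value, df, lower.tail = TRUE),
    greater   = pt(t_value, df, lower.tail = FALSE),
    two.sided = 2 * pt(abs(t_value), df, lower.tail = FALSE)
  )
}

x <- c(10.8, 9.5, 11.2, 12.1, 10.4, 9.9, 11.7, 10.6)  # made-up data

p_manual  <- p_from_t(x, mu = 10)            # two-sided by default
p_builtin <- t.test(x, mu = 10)$p.value      # built-in one-sample t-test

print(c(manual = p_manual, builtin = p_builtin))
```

Taking abs() of the t value before doubling keeps the two-sided result valid when the sample mean falls below the hypothesized mean.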

Example

P-values are useful because graphs often do not let us see whether a difference is significant. For example, if we look at the data set USArrests, and arbitrarily split the states into two groups alphabetically (Alabama-Missouri and Montana-Wyoming), is there a difference in the murder arrest rate (per 100,000 residents) between the two groups?

First we can examine the data visually with box plots.

The vertical lines represent the means of the two groups.
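The plot itself is not reproduced here. A sketch of how a similar figure could be drawn in base R follows; the layout choices (horizontal boxes, segment positions, axis label) are assumptions, not the original plotting code.

```r
# USArrests ships with R; its rows are already in alphabetical order by state
murder <- USArrests$Murder
group  <- factor(rep(c("AL-MO", "MT-WY"), each = 25))

# horizontal box plots, one per group
boxplot(murder ~ group, horizontal = TRUE,
        xlab = "Murder arrests per 100,000")

# mark each group's mean with a vertical segment across its box
group_means <- tapply(murder, group, mean)
segments(x0 = group_means, y0 = c(0.6, 1.6), y1 = c(1.4, 2.4), lwd = 2)

print(group_means)
```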

It is not visually obvious from the previous graph whether there is a real difference between the two data sets, which is why finding the p-value is helpful. The means (marked by the vertical black lines) are different, so it is tempting to assume there is a significant difference between the two data sets.

To manually calculate the t value, we revisit the equation \(t=(\overline{x}-\mu)/(s/\sqrt{n})\), with the AL-MO group as the sample and the MT-WY mean standing in for the hypothesized mean.

\(\overline{x} =\) mean of AL-MO \(= 8.62\)

\(\mu =\) mean of MT-WY \(= 6.956\)

\(s =\) standard deviation of MT-WY \(= 4.051345\)

\(n = 25\)

\[ t = (8.62-6.956)/(4.051345/\sqrt{25}) = 2.053639 \]

## [1] 0.05106088
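The code behind this output is not shown on the slide. A minimal sketch that should give the same value, assuming a two-sided test with \(n-1=24\) degrees of freedom (an assumption, though it is consistent with the reported 0.051):

```r
x_bar <- 8.62       # mean of AL-MO
mu    <- 6.956      # mean of MT-WY, used as the hypothesized mean
s     <- 4.051345   # standard deviation of MT-WY
n     <- 25

t_value <- (x_bar - mu) / (s / sqrt(n))

# two-sided p-value; df = n - 1 is an assumption, not stated on the slides
p_value <- 2 * pt(abs(t_value), df = n - 1, lower.tail = FALSE)

print(c(t = t_value, p = p_value))
```

To compare the two groups directly, R's built-in t.test() can take both groups as arguments and run a two-sample test that uses the spread of each group. Its p-value will generally differ from the one-sample shortcut used here.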

Using code we see that the corresponding p-value is 0.051, meaning that if the null hypothesis were true, there would be a 5.1% chance of seeing a difference at least this large from chance alone. This is higher than the common cutoff of \(\alpha = 0.05\), so the null hypothesis would not be rejected. This makes sense because the two groups were chosen by a method (alphabetical order) that has nothing to do with any variable likely to be related to murder rates. If you looked only at the difference in the means, though, you might be tempted to conclude that there is a meaningful difference between the two data sets.

How different aspects of data sets affect p-values

Something to keep in mind is that the size of a data set affects the p-value. If you compare two data sets to a hypothesized mean of 10, and they have the same mean and standard deviation but different numbers of samples, they will return two different p-values. In general, when the sample mean differs from the hypothesized mean, a larger data set gives a lower p-value, so you can be more confident in the result.
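The reason can be read off the t formula used earlier. Rearranging it (only algebra, no new assumptions):

\[ t=(\overline{x}-\mu)/(s/\sqrt{n})=\sqrt{n}\,(\overline{x}-\mu)/s \]

With \(\overline{x}\), \(\mu\) and \(s\) held fixed, \(t\) grows in proportion to \(\sqrt{n}\). Ten times as many samples multiplies \(t\) by \(\sqrt{10}\approx 3.16\), which moves it further into the tail and lowers the p-value.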

If we create and plot two data sets that have the same mean and standard deviation, visually they look as though they would have the same p-value when compared with a hypothesized mean of 10.

But when you compare the calculated p-values, the p-value of the data set with n=12 is 0.111173 and the p-value of the data set with n=120 is \(2.446288 \times 10^{-7}\). The first value would rarely be considered significant and the second would very often be considered significant. The takeaway is that, for the same mean and standard deviation, the p-value of a data set decreases as more data points are added.
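The slides do not give the mean and standard deviation behind these two data sets, so the sketch below uses hypothetical values (mean 11, standard deviation 2) and a hypothetical helper, make_data, to build samples that match them exactly. The p-values it prints illustrate the same pattern but are not expected to match the figures quoted above.

```r
# make_data is a hypothetical helper: it rescales random draws so the
# sample mean and standard deviation equal the targets exactly
make_data <- function(n, target_mean, target_sd) {
  x <- rnorm(n)
  (x - mean(x)) / sd(x) * target_sd + target_mean
}

set.seed(42)
small <- make_data(12,  target_mean = 11, target_sd = 2)   # n = 12
large <- make_data(120, target_mean = 11, target_sd = 2)   # n = 120

# one-sample t-tests against the hypothesized mean of 10
p_small <- t.test(small, mu = 10)$p.value
p_large <- t.test(large, mu = 10)$p.value

print(c(n12 = p_small, n120 = p_large))
```

Because both samples share the same mean and standard deviation, the sample size is the only thing that differs between the two tests.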