Geordie Ellis
2026-06-10
The concept of the P-value was introduced by Karl Pearson in 1900, although it was Sir Ronald A. Fisher who later made its calculation practical.
The P-value was not originally meant as a final test for significance, but rather a preliminary test used to determine whether further testing should be performed.
It is important to contextualize P-values because sample size and other factors can push a P-value below 0.05 even when the data do not show an effect of any practical importance.
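To make the sample-size effect concrete, here is a minimal sketch in R. It assumes a one-sample t-test and uses made-up summary statistics (a sample mean of 0.01, a standard deviation of 1, a hypothesized mean of 0). The helper `p_from_summary` is hypothetical and not part of any package.

```r
# Hypothetical helper: two-sided P-value of a one-sample t-test,
# computed from summary statistics instead of raw data.
p_from_summary <- function(xbar, s, n, mu0 = 0) {
  t_stat <- (xbar - mu0) / (s / sqrt(n))   # test statistic
  2 * pt(-abs(t_stat), df = n - 1)         # two-sided tail probability
}

# Same tiny difference from the hypothesized mean, two sample sizes
# (all numbers are illustrative assumptions).
p_small <- p_from_summary(xbar = 0.01, s = 1, n = 20)
p_large <- p_from_summary(xbar = 0.01, s = 1, n = 1e6)

print(c(n_20 = p_small, n_1e6 = p_large))
```

The difference from the hypothesized mean is identical in both calls. Only n changes, yet the second P-value falls well below 0.05 while the first does not.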
Confidence intervals and Bayesian methods are two approaches that can be used alongside P-values to evaluate whether, and how, the data in question might be significant.
Statistical significance, which is what the P-value tests, is not synonymous with significance in the subject matter at hand. Once statistical significance is found, it is critical to evaluate what it means in context.
By convention, a P-value below 0.05 is declared statistically significant, and a P-value above 0.05 is not.
The formula for a test statistic is as follows:
$$t = \frac{\bar{x} - \mu_0}{s/\sqrt{n}}$$
The test statistic is used to calculate the P-value.
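As a worked illustration, the sketch below computes the test statistic by hand as the sample mean minus the hypothesized mean, divided by the sample standard deviation over the square root of the sample size. It then converts the statistic to a two-sided P-value and compares both numbers with R's built-in `t.test`. The data vector `x` and the hypothesized mean `mu0` are assumptions chosen for the example, not values from this analysis.

```r
# Illustrative data (an assumption, not from the analysis in this document)
x   <- c(5.1, 4.9, 5.6, 5.8, 6.0, 5.3)
mu0 <- 5                                        # hypothesized mean

n      <- length(x)
t_stat <- (mean(x) - mu0) / (sd(x) / sqrt(n))   # test statistic
p_val  <- 2 * pt(-abs(t_stat), df = n - 1)      # two-sided P-value

# Cross-check against R's built-in one-sample t-test
check <- t.test(x, mu = mu0)
print(c(manual_t = t_stat, builtin_t = unname(check$statistic)))
print(c(manual_p = p_val,  builtin_p = check$p.value))
```

The manual and built-in values should match, which shows that `t.test` applies this same calculation with n - 1 degrees of freedom.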
##
## One Sample t-test
##
## data: x
## t = 10.133, df = 8, p-value = 7.691e-06
## alternative hypothesis: true mean is not equal to 0
## 95 percent confidence interval:
## 6.299532 10.011579
## sample estimates:
## mean of x
## 8.155556
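The numbers in this output can be cross-checked against one another. The sketch below is only a consistency check, assuming the hypothesized mean is 0 (as the alternative hypothesis line states) and using the rounded values as printed. It backs out the standard error from the mean and the t statistic, then recomputes the P-value and the 95 percent confidence interval.

```r
# Values copied from the t-test output above (rounded as printed)
xbar   <- 8.155556
t_stat <- 10.133
df     <- 8                                     # one-sample test: df = n - 1

se    <- xbar / t_stat                          # standard error, since mu0 = 0
p_val <- 2 * pt(-abs(t_stat), df)               # two-sided P-value
ci    <- xbar + c(-1, 1) * qt(0.975, df) * se   # 95% confidence interval

print(p_val)
print(ci)
```

Because the inputs are rounded, the results should agree with the reported P-value and interval only up to rounding.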
The histogram shows the x-values for which the t-statistic, and from it the p-value, is computed next.
## [1] -1.688803
With this test statistic of -1.688803 and 8 degrees of freedom, the two-sided p-value is greater than 0.05, because the absolute value 1.688803 is smaller than the critical value of about 2.306. The result is therefore not statistically significant at the 0.05 level.
The box plot above summarizes the values produced by rnorm, showing the median, the quartiles, and the range.
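The comparison against the critical value can be checked directly in R. This is a minimal sketch that assumes a two-sided test at the 0.05 level and uses the test statistic and degrees of freedom quoted above.

```r
t_stat <- -1.688803   # test statistic quoted above
df     <- 8           # degrees of freedom quoted above

t_crit <- qt(0.975, df)               # two-sided critical value at alpha = 0.05
p_val  <- 2 * pt(-abs(t_stat), df)    # two-sided P-value

print(c(critical_value = t_crit, p_value = p_val))
print(abs(t_stat) > t_crit)           # TRUE means significant at the 0.05 level
```

When the absolute value of the statistic is below the critical value, the two-sided P-value is above 0.05, and the two checks always agree.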
## [1] -61.69421
With a test statistic of -61.69421, whose magnitude is far beyond any conventional critical value, the P-value is far below 0.05, so this result is statistically significant.
Below is the code for the previous plot:
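library(ggpubr)  # also loads ggplot2
# Draw 100 values from a standard normal distribution
null <- rnorm(100, mean = 0, sd = 1)
null_df <- data.frame(null)
# Box plot of the simulated values
p <- ggplot(data = null_df, aes(y = null)) +
  geom_boxplot(fill = "#A4A4A4")
p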