HW3: P-values

Geordie Ellis

2026-06-10

Intro and lesser-known P-value facts

The concept of the P-value was introduced by Karl Pearson in 1900, although it was Sir Ronald A. Fisher who made calculating P-values practical.
The P-value was not originally meant as a final test for significance, but rather a preliminary test used to determine whether further testing should be performed.
It is important to contextualize P-values, because a very large sample size and other factors can push a P-value below 0.05 even when the underlying effect is too small to matter in practice.
Confidence intervals and Bayesian methods can be used alongside P-values to evaluate whether, and how, the data in question might be significant.
Statistical significance, which is what the P-value tests, is not synonymous with practical significance in a given subject area. Once statistical significance is found, it is critical to evaluate what it means in context.
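
To illustrate the point about sample size, the sketch below holds a hypothetical sample mean (0.1) and standard deviation (1) fixed and recomputes the two-sided one-sample p-value as n grows. The helper name `p_from_summary` and all of the numbers are assumptions chosen for illustration, not values from this assignment.

```r
# Two-sided one-sample t-test p-value computed from summary statistics.
# p_from_summary is a hypothetical helper, not part of base R.
p_from_summary <- function(xbar, mu0, s, n) {
  t_stat <- (xbar - mu0) / (s / sqrt(n))   # one-sample t statistic
  2 * pt(-abs(t_stat), df = n - 1)         # two-sided tail probability
}

# Same small effect (mean 0.1, sd 1) at increasing sample sizes.
sizes <- c(10, 100, 1000, 10000)
p_values <- sapply(sizes, function(n) {
  p_from_summary(xbar = 0.1, mu0 = 0, s = 1, n = n)
})

data.frame(n = sizes, p_value = signif(p_values, 3))
```

The effect size never changes, yet the p-value drops below 0.05 once n is large enough, which is why the size of the effect should be reported alongside the p-value.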

First slide with LaTeX

A P-value of less than 0.05 is conventionally declared statistically significant; a P-value above 0.05 is not.

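The formula for a one-sample t-test statistic is as follows:

$$t = \frac{\bar{x} - \mu_0}{s / \sqrt{n}}$$

where $\bar{x}$ is the sample mean, $\mu_0$ is the hypothesized mean, $s$ is the sample standard deviation, and $n$ is the sample size.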

The test statistic is used to calculate the P-value.
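
As a minimal sketch of how the test statistic leads to a P-value, the code below computes both by hand and compares them with R's built-in `t.test`. The vector `x` is a hypothetical sample invented for illustration, and a two-sided test against a hypothesized mean of 0 is assumed.

```r
# Hypothetical sample (illustration only, not the data used in these slides).
x <- c(4.1, 5.3, 6.8, 7.2, 8.0, 9.5, 10.1, 10.9, 11.5)
mu0 <- 0            # hypothesized mean (assumption)
n <- length(x)

# t = (sample mean - hypothesized mean) / (sample sd / sqrt(n))
t_manual <- (mean(x) - mu0) / (sd(x) / sqrt(n))

# Two-sided p-value from the t distribution with n - 1 degrees of freedom.
p_manual <- 2 * pt(-abs(t_manual), df = n - 1)

# The built-in one-sample t-test should agree with the manual calculation.
builtin <- t.test(x, mu = mu0)
c(t_manual = t_manual, t_builtin = unname(builtin$statistic))
c(p_manual = p_manual, p_builtin = builtin$p.value)
```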

Slide with Plotly

## 
##  One Sample t-test
## 
## data:  x
## t = 10.133, df = 8, p-value = 7.691e-06
## alternative hypothesis: true mean is not equal to 0
## 95 percent confidence interval:
##   6.299532 10.011579
## sample estimates:
## mean of x 
##  8.155556

This histogram shows the x values whose t-statistic is used to find a p-value on the next slide.
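
As a cross-check, the p-value and 95 percent confidence interval in the t-test output above can be approximately recovered from the reported mean, t statistic, and degrees of freedom alone. This is a minimal sketch: it assumes a hypothesized mean of 0 (as stated in the output) and uses the rounded values printed there, so the results agree only up to rounding. The variable names are illustrative.

```r
# Values copied from the printed t-test output.
xbar   <- 8.155556   # sample mean
t_stat <- 10.133     # t statistic (rounded in the output)
df     <- 8          # degrees of freedom, so n = df + 1 = 9
mu0    <- 0          # hypothesized mean

# Since t = (xbar - mu0) / se, the standard error is (xbar - mu0) / t.
se <- (xbar - mu0) / t_stat

# Two-sided p-value and 95 percent confidence interval for the mean.
p_value <- 2 * pt(-abs(t_stat), df = df)
ci <- xbar + c(-1, 1) * qt(0.975, df = df) * se

p_value
ci
```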

Second slide with LaTeX

## [1] -1.688803

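With this test statistic of -1.688803 and 8 degrees of freedom, the p-value is greater than 0.05, because the magnitude of the statistic, 1.688803, is smaller than the two-sided critical value of about 2.306. The result is therefore not statistically significant at the 0.05 level.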
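
The comparison against the critical value can be checked directly in R. The sketch below assumes a two-sided test at the 0.05 level; `qt` gives the critical value and `pt` gives the tail probability for the stated statistic and degrees of freedom.

```r
t_stat <- -1.688803   # test statistic from the output above
df     <- 8           # degrees of freedom
alpha  <- 0.05        # significance level (two-sided test assumed)

# Critical value: |t| must exceed this for significance at level alpha.
t_crit <- qt(1 - alpha / 2, df = df)

# Two-sided p-value for the observed statistic.
p_value <- 2 * pt(-abs(t_stat), df = df)

c(t_crit = t_crit, p_value = p_value)

# These two comparisons are equivalent ways of making the same decision.
abs(t_stat) > t_crit
p_value < alpha
```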

First slide with ggplot

The box plot above summarizes the values produced by rnorm: the box spans the first to third quartiles, the line inside the box marks the median, and the whiskers and any outlying points show the spread of the data.
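
The numbers behind a box plot like this one can be computed directly. The sketch below assumes the default 1.5 × IQR whisker rule used by `geom_boxplot`; the seed is a hypothetical choice added only so the example is reproducible, so the values will not match the plot above.

```r
set.seed(1)  # hypothetical seed, for reproducibility only
null <- rnorm(100, mean = 0, sd = 1)

# Box: first quartile, median, third quartile.
q <- quantile(null, probs = c(0.25, 0.5, 0.75))
iqr <- unname(q[3] - q[1])

# Whiskers reach the most extreme points within 1.5 * IQR of the box.
lower_fence <- unname(q[1]) - 1.5 * iqr
upper_fence <- unname(q[3]) + 1.5 * iqr
inside <- null[null >= lower_fence & null <= upper_fence]
whiskers <- range(inside)

# Points beyond the fences are drawn individually as outliers.
outliers <- null[null < lower_fence | null > upper_fence]

q
whiskers
outliers
range(null)  # full range of the data, including any outliers
```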

Second slide with ggplot

## [1] -61.69421

With a test statistic of -61.69421, it can be confidently concluded that the P-value is less than 0.05, because the magnitude of the statistic far exceeds the critical value. The result is therefore statistically significant.

Slide with R code

Below is the code for the previous plot:

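library(ggpubr)  # ggplot2 is attached as a dependency of ggpubr
null <- rnorm(100, mean = 0, sd = 1)  # 100 draws from a standard normal
null_df <- data.frame(null)

# Box plot of the simulated values, filled in grey
p <- ggplot(data = null_df, aes(y = null)) +
  geom_boxplot(fill = "#A4A4A4")
p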