2024-10-19

1

Slide 1: Introduction to p-value

  • In statistics, the p-value is the probability of observing results at least as extreme as ours, assuming the null hypothesis is true
  • The p-value is generally used in hypothesis testing to assess how compatible the observed data are with the null hypothesis
  • We typically treat p-values below 0.05 as statistically significant
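To see what the 0.05 threshold means in practice, here is a minimal simulation sketch. It assumes two samples of size 20 drawn from the same normal distribution, so the null hypothesis is true by construction; the sample size, seed, and number of repetitions are illustrative choices, not values from this presentation.

```r
set.seed(1)  # fixed seed so the sketch is reproducible

# Simulate 2000 experiments in which H0 is true:
# both samples come from the same standard normal distribution
p_values <- replicate(2000, t.test(rnorm(20), rnorm(20))$p.value)

# Share of experiments that fall below the 0.05 threshold purely by chance
false_positive_rate <- mean(p_values < 0.05)
print(false_positive_rate)
```

Roughly 5% of these tests come out "significant" even though there is no real difference, which is the error rate we accept when we use 0.05 as the cutoff.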

Slide 2: Why Is the P-Value Important?

  • The p-value tells us how surprising our observed result would be if the null hypothesis were true; it is not the probability that the null hypothesis is true
  • Without the p-value, statisticians would need other tools to judge whether random chance could plausibly explain an observed result

Slide 3: Mathematical Definition of P-Value Using LaTeX

  • The p-value is defined as:

\[ p = P(\text{data at least as extreme as observed} \mid H_0) \]

  • Where \(H_0\) represents the null hypothesis.
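As a concrete instance of a probability computed under \(H_0\), here is a small sketch with a made-up example: a coin is flipped 20 times and lands heads 15 times, and \(H_0\) says the coin is fair. The numbers are hypothetical and chosen only for illustration.

```r
# Hypothetical example: 20 coin flips, 15 heads; H0: the coin is fair (prob = 0.5)
n_flips <- 20
n_heads <- 15

# One-sided p-value: probability of 15 or more heads if H0 is true
p_value <- sum(dbinom(n_heads:n_flips, size = n_flips, prob = 0.5))

# Same quantity taken from the upper tail of the binomial CDF
p_value_cdf <- pbinom(n_heads - 1, size = n_flips, prob = 0.5, lower.tail = FALSE)

print(p_value)  # about 0.0207
```

Both lines compute the same tail probability; the second just reads it off the CDF.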

Slide 4: Example of P-Value Interpretation

  • If the p-value is small (typically \(p < 0.05\)):
    • We reject the null hypothesis.
  • If the p-value is large (typically \(p \geq 0.05\)):
    • We fail to reject the null hypothesis.
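The decision rule above can be sketched as a tiny helper. The function name `decide` and the example p-values are hypothetical, used only to illustrate the rule.

```r
# Hypothetical helper: apply the decision rule at significance level alpha
decide <- function(p_value, alpha = 0.05) {
  if (p_value < alpha) "reject H0" else "fail to reject H0"
}

decide(0.03)  # "reject H0"
decide(0.20)  # "fail to reject H0"
```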

Slide 5: 3D Plotly Visualization Using plotly with code included

  • Code is included on this slide to show how we’ve generated an example distribution
library(plotly)

# using 3D visual here (mentioned in HW3 requirements doc)
x <- seq(-3, 3, length.out = 50)
y <- seq(-3, 3, length.out = 50)
z <- outer(x, y, function(a, b) dnorm(a) * dnorm(b))

plot_ly(x = ~x, y = ~y, z = ~z, type = "surface") %>%
  layout(scene = list(zaxis = list(title = "Density")))

Slide 6: Get the p-value using hypothesis testing on our plot using ggplot

  • Let’s draw two random samples and plot them to compare spread, variability, and median before running a t-test or other hypothesis tests
  • The box plot lets us anticipate whether the p-value will be small before we run the test, because we can see for ourselves how far apart the groups are relative to their spread
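The plotting code is not shown on this slide, so the following is only a sketch of how such a box plot could be produced. The group sizes, means, standard deviations, and seed are assumptions, not the values used in the presentation.

```r
library(ggplot2)

set.seed(42)  # assumed seed, for reproducibility only

# Assumed data: two normal samples with different means
groups <- data.frame(
  group = rep(c("Group A", "Group B"), each = 50),
  value = c(rnorm(50, mean = 10, sd = 2), rnorm(50, mean = 12, sd = 2))
)

# Side-by-side box plots to compare median and spread
box_plot <- ggplot(groups, aes(x = group, y = value, fill = group)) +
  geom_boxplot() +
  labs(title = "Group A vs. Group B", x = "Group", y = "Value")

print(box_plot)
```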

Slide 7: Use a t-test to show where the p-value is, using ggplot

  • Because we are comparing the means of two samples, we will use a two-sample t-test

  • Other statistical tests include chi-squared tests, correlation tests, etc.
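A minimal sketch of a two-sample t-test in R, using `t.test()` from base R. The two samples here are simulated under assumed parameters, since the presentation's own data are not shown.

```r
set.seed(42)  # assumed seed, for reproducibility only

# Assumed data: two normal samples with different means
group_a <- rnorm(50, mean = 10, sd = 2)
group_b <- rnorm(50, mean = 12, sd = 2)

# t.test() runs a two-sided Welch two-sample t-test by default
result <- t.test(group_a, group_b)

print(result$statistic)  # observed t statistic
print(result$p.value)    # two-sided p-value
```

Welch's version does not assume equal variances; passing `var.equal = TRUE` would give the classic pooled-variance test instead.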

Slide 8: Mathematical Interpretation of Our Results P-Value using LaTeX

  • For a one-sided (upper-tail) test, the p-value for an observed test statistic is:

\[ p = P(T \geq t_{\text{obs}} \mid H_0) \]

  • This probability is derived from the cumulative distribution function (CDF) of the test statistic under \(H_0\)

  • In a two-tailed t-test, the p-value is instead:

\[ p = 2 \times P(T \geq |t_{\text{obs}}| \mid H_0) \]

  • which accounts for both tails of a two-sided hypothesis
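These tail probabilities can be read off the t distribution's CDF with `pt()`. In the sketch below, the observed statistic and degrees of freedom are hypothetical values chosen for illustration.

```r
# Hypothetical observed t statistic and degrees of freedom
t_obs <- 2.5
df <- 30

# Upper-tail probability P(T >= t_obs), i.e. 1 - CDF(t_obs)
p_one_sided <- pt(t_obs, df = df, lower.tail = FALSE)

# Two-tailed p-value: double the tail area beyond |t_obs|
p_two_sided <- 2 * pt(abs(t_obs), df = df, lower.tail = FALSE)

print(p_one_sided)
print(p_two_sided)
```

Because the t distribution is symmetric around 0, doubling the upper tail covers both directions of the two-sided alternative.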

Slide 9: Our Conclusion

  • After performing a t-test comparing the means of Group A and Group B, we obtained a p-value reported as 0. A true p-value is never exactly 0, so this reflects a value too small to display.

  • Interpretation:

    • Since the p-value is less than 0.05, we reject the null hypothesis \(H_0\).
    • This means the difference in means between Group A and Group B is statistically significant: the data provide strong evidence that the two group means differ.
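A p-value can show up as 0 simply because of rounding. The sketch below uses a hypothetical tiny p-value (not the one from this analysis) to show the effect, and `format.pval()` from base R as one way to report it more honestly.

```r
# Hypothetical very small p-value, for illustration only
p_small <- 1e-10

# Rounding to 3 decimals hides the value entirely
rounded <- round(p_small, 3)
print(rounded)  # 0

# format.pval() reports values below eps as "less than eps" instead of 0
reported <- format.pval(p_small, eps = 0.001)
print(reported)
```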