hw3

2024-10-19

Slide 1: Introduction to p-value

In statistics, the p-value is the probability of obtaining results at least as extreme as those observed, assuming the null hypothesis is true
The p-value is used in hypothesis testing to judge how compatible the observed data are with the null hypothesis
A result is commonly called statistically significant when the p-value falls below a chosen threshold, typically 0.05

Slide 2: Why is P-Value Important

The p-value tells us how surprising our observed result would be if the null hypothesis were true; it is not the probability that the null hypothesis is true
Without the p-value, statisticians would need other tools to judge whether random chance alone could explain an observed result
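
To illustrate what a p-value measures, here is a minimal simulation sketch. It assumes two groups drawn from the same normal distribution, so the null hypothesis is true by construction. The seed, sample size of 30, and number of simulations are illustrative choices, not values from this assignment.

```r
# Simulate many experiments in which the null hypothesis is TRUE:
# both groups come from the same standard normal distribution.
set.seed(1)      # assumed seed, for reproducibility only
n_sims <- 2000   # assumed number of simulated experiments

p_values <- replicate(n_sims, {
  a <- rnorm(30)
  b <- rnorm(30)
  t.test(a, b)$p.value
})

# Share of experiments that look "significant" even though H0 is true.
false_positive_rate <- mean(p_values < 0.05)
print(false_positive_rate)
```

Under these assumptions, roughly 5% of the simulated tests should fall below 0.05 purely by chance, which is what the 0.05 threshold is designed to allow.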

Slide 3: Mathematical Definition of P-Value Using LaTeX

The p-value is defined as:

\[ p = P(T \geq t_{\text{obs}} \mid H_0) \]

Where \(H_0\) represents the null hypothesis, \(T\) is the test statistic, and \(t_{\text{obs}}\) is its observed value.
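
As a small worked example of this definition, the sketch below computes a one-sided p-value for a hypothetical coin-flip experiment. The 60 heads in 100 flips are made-up numbers chosen for illustration. Under the null hypothesis that the coin is fair, the number of heads follows a Binomial(100, 0.5) distribution.

```r
# Hypothetical data (not from the slides): 60 heads in 100 coin flips.
n_flips <- 100
observed_heads <- 60

# H0: the coin is fair, so the number of heads X ~ Binomial(100, 0.5).
# One-sided p-value: P(X >= 60 | H0).
p_value <- pbinom(observed_heads - 1, size = n_flips, prob = 0.5,
                  lower.tail = FALSE)

# The same quantity, computed by summing the probability mass directly.
p_value_check <- sum(dbinom(observed_heads:n_flips, size = n_flips, prob = 0.5))

print(p_value)
print(p_value_check)
```

Both computations give the probability of a result at least as extreme as the one observed, assuming the null hypothesis holds.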

Slide 4: Example of P-Value Interpretation

If the p-value is small (typically \(p < 0.05\)):
- We reject the null hypothesis.
If the p-value is large (typically \(p \geq 0.05\)):
- We fail to reject the null hypothesis.
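
The decision rule above can be sketched as a small helper. The function name `decide` and the default `alpha = 0.05` are illustrative assumptions, not part of any package.

```r
# Hypothetical helper: apply the significance threshold to a p-value.
decide <- function(p_value, alpha = 0.05) {
  if (p_value < alpha) {
    "reject H0"
  } else {
    "fail to reject H0"
  }
}

print(decide(0.003))  # small p-value
print(decide(0.20))   # large p-value
```

Note that "fail to reject" is not the same as accepting the null hypothesis; it only means the data do not provide enough evidence against it.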

Slide 5: 3D Visualization Using plotly (Code Included)

Code is included on this slide to show how we’ve generated an example distribution

library(plotly)

# 3D surface of a bivariate standard normal density (3D visual mentioned in the HW3 requirements doc)
x <- seq(-3, 3, length.out = 50)
y <- seq(-3, 3, length.out = 50)
z <- outer(x, y, function(a, b) dnorm(a) * dnorm(b))

plot_ly(x = ~x, y = ~y, z = ~z, type = "surface") %>%
  layout(scene = list(zaxis = list(title = "Density")))

Slide 6: Preparing for Hypothesis Testing with a Box Plot Using ggplot

Let’s use two randomly generated samples to compare spread, variability, and medians before running a t-test
The box plot lets us anticipate whether the p-value will be small before running the test, because we can see for ourselves how much the groups differ
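
The code for this slide is not shown, so the following is only a sketch of how such a box plot might be produced. The seed, sample sizes, means, and standard deviations are assumptions made for illustration and are not the values used in the original analysis.

```r
library(ggplot2)

# Assumed simulated data: two groups of 50 draws from normal distributions.
set.seed(42)
groups <- data.frame(
  group = rep(c("Group A", "Group B"), each = 50),
  value = c(rnorm(50, mean = 10, sd = 2),   # Group A (assumed parameters)
            rnorm(50, mean = 13, sd = 2))   # Group B (assumed parameters)
)

# Side-by-side box plots to compare median and spread visually.
box_plot <- ggplot(groups, aes(x = group, y = value, fill = group)) +
  geom_boxplot() +
  labs(title = "Group A vs. Group B", x = "Group", y = "Value")

box_plot
```

If the boxes barely overlap, as they should with these assumed parameters, a small p-value from the t-test is to be expected.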

Slide 7: Use a t-Test to Show Where the p-value Is Using ggplot

Because we are comparing the means of two samples, we will use a two-sample t-test
Other statistical tests include chi-squared tests, correlation tests, etc.
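
A minimal sketch of the two-sample t-test in base R is shown below. The data are simulated with assumed parameters (seed, sample size, means, and standard deviations are illustrative), so the numbers will not match the results reported in these slides.

```r
# Assumed simulated samples standing in for Group A and Group B.
set.seed(42)
group_a <- rnorm(50, mean = 10, sd = 2)
group_b <- rnorm(50, mean = 13, sd = 2)

# t.test() runs Welch's two-sample t-test by default (var.equal = FALSE)
# with a two-sided alternative.
result <- t.test(group_a, group_b)

print(result$statistic)  # observed t statistic
print(result$parameter)  # degrees of freedom
print(result$p.value)    # two-sided p-value
```

Setting `var.equal = TRUE` would instead run the classic pooled-variance t-test, which assumes both groups share the same variance.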

Slide 8: Mathematical Interpretation of Our Results P-Value Using LaTeX

For a one-sided test, the p-value for an observed test statistic is calculated as:

\[ p = P(T \geq t_{\text{obs}}) \]

This probability is obtained from the cumulative distribution function (CDF) of the test statistic under the null hypothesis
For a two-sided t-test, the p-value is calculated using the following formula:

\[ p = 2 \times P(T \geq |t_{\text{obs}}|) \]

which counts extreme values in both tails of the t-distribution
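
The two-sided formula can be checked numerically. The sketch below uses simulated data with assumed parameters (not the data behind these slides), computes the p-value by hand from the t-distribution with `pt()`, and compares it with the value reported by `t.test()`.

```r
# Assumed simulated samples (illustrative parameters only).
set.seed(42)
group_a <- rnorm(50, mean = 10, sd = 2)
group_b <- rnorm(50, mean = 13, sd = 2)

result <- t.test(group_a, group_b)

t_obs <- unname(result$statistic)  # observed t statistic
df    <- unname(result$parameter)  # Welch degrees of freedom

# p = 2 * P(T >= |t_obs|), using the upper tail of the t-distribution.
p_manual <- 2 * pt(abs(t_obs), df = df, lower.tail = FALSE)

print(p_manual)
print(result$p.value)
print(isTRUE(all.equal(p_manual, result$p.value)))
```

The manual calculation and `t.test()` should agree, since both evaluate the same tail probability of the t-distribution.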

Slide 9: Our Conclusion

After performing a t-test comparing the means of Group A and Group B, we obtained a p-value that was reported as 0. A p-value is never exactly 0, so this reflects rounding of a very small value.
Interpretation:
- Since the p-value is less than 0.05, we reject the null hypothesis \(H_0\).
- The difference in means between Group A and Group B is statistically significant: a difference this large would be very unlikely if the two group means were truly equal.