Null Hypothesis
- The statement of “no effect”; e.g., mood does not affect performance
- Represented by H0
Alternative Hypothesis
- The statement of “effect” that the researcher seeks to support; e.g., mood affects performance
- Represented by Ha
2024-01-29
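First, calculate the test statistic using either a z-test or a t-test.
Examples:
- One-sample z-test (for a proportion)
\[
z = \frac{\hat{p} - p_0}{\sqrt{\frac{p_0(1 - p_0)}{n}}}
\]
where \(\hat{p}\) is the sample proportion, \(p_0\) is the proportion under the null hypothesis, and \(n\) is the sample size.
- Two-sample t-test
\[
t = \left| \frac{\bar{X}_1 - \bar{X}_2}{\sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}}} \right|
\]
where \(\bar{X}_i\), \(s_i^2\), and \(n_i\) are the mean, variance, and size of sample \(i\).
The test statistic measures how far the sample result lies from what the null hypothesis predicts, in standard-error units. The larger its magnitude, the more statistically significant the difference.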
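As a rough illustration of the two formulas above, the following R sketch computes both statistics by hand. All numeric inputs (`p_hat`, `p0`, `n`, and the two sets of summary statistics) are hypothetical values chosen only for the example, not data from these notes.

```r
# Hypothetical inputs for the one-sample z-test for a proportion
p_hat <- 0.56   # observed sample proportion (assumed)
p0    <- 0.50   # proportion under the null hypothesis (assumed)
n     <- 200    # sample size (assumed)

# z = (p_hat - p0) / sqrt(p0 * (1 - p0) / n)
z <- (p_hat - p0) / sqrt(p0 * (1 - p0) / n)

# Hypothetical summary statistics for the two-sample t-test
x1_bar <- 5.2; s1 <- 1.1; n1 <- 30   # sample 1: mean, sd, size (assumed)
x2_bar <- 4.6; s2 <- 1.3; n2 <- 35   # sample 2: mean, sd, size (assumed)

# t = |(x1_bar - x2_bar) / sqrt(s1^2 / n1 + s2^2 / n2)|
t_stat <- abs((x1_bar - x2_bar) / sqrt(s1^2 / n1 + s2^2 / n2))

print(z)
print(t_stat)
```

With raw data rather than summary statistics, base R's `prop.test()` and `t.test()` carry out the corresponding tests directly; the manual version is shown here only to mirror the formulas.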
- The normal distribution curve helps us interpret test results
- Assuming no effect, the test statistic is expected to fall near the middle of the curve (around 0)
- The further it deviates from the middle, the stronger the evidence against the null hypothesis
- A researcher also sets a desired significance level (commonly α = 0.05), which is used to reject or fail to reject the null hypothesis
- This bell curve illustrates the central 95% region, which corresponds to a 95% confidence interval or CI
- In order to reject the null hypothesis in favor of the alternative at this level, a test statistic would need to land in one of the tails (beyond ±1.96 on a standard normal curve)
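The decision rule described in the list above can be sketched in R as follows. The test statistic `z` is a hypothetical value, and a two-sided test against a standard normal distribution is assumed.

```r
alpha <- 0.05                     # significance level chosen by the researcher
z     <- 2.3                      # hypothetical test statistic (assumed)

# Two-sided critical value: the cutoff that leaves alpha / 2 in each tail
z_crit <- qnorm(1 - alpha / 2)    # roughly 1.96 for alpha = 0.05

# Two-sided p-value: probability of a statistic at least this extreme under H0
p_value <- 2 * pnorm(-abs(z))

# Reject H0 when the statistic lands in either tail (same as p_value < alpha)
reject_null <- abs(z) > z_crit

print(z_crit)
print(p_value)
print(reject_null)
```

Comparing `abs(z)` with the critical value and comparing `p_value` with `alpha` are two views of the same rule, so they always lead to the same decision.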
The cumulative distribution function (CDF) is defined as: \(F(x) = P(X \leq x)\)
Where:
- P(·) denotes probability,
- X is a random variable,
- and x is the value at which the cumulative probability is evaluated
## [1] 3.92507e-05
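To make the CDF definition concrete, here is a minimal R sketch that evaluates \(F(x) = P(X \leq x)\) for a standard normal variable with `pnorm()` and compares it with an empirical CDF built from simulated draws. The evaluation point `x`, the seed, and the sample size are arbitrary choices for illustration.

```r
x <- 1.96                              # evaluation point (assumed)

# Theoretical CDF of the standard normal: F(x) = P(X <= x)
F_x <- pnorm(x)

# Probability of the central region between -1.96 and 1.96
central <- pnorm(1.96) - pnorm(-1.96)

# Empirical CDF from simulated standard normal draws
set.seed(1)
draws <- rnorm(10000)
F_hat <- ecdf(draws)                   # ecdf() returns a step function
F_hat_x <- F_hat(x)                    # share of draws that are <= x

print(F_x)
print(central)
print(F_hat_x)
```

The empirical value should land close to the theoretical one, and the gap generally shrinks as the number of draws grows.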
library(ggplot2)

# Normal density with explicit mean and sd, used by stat_function() below
dnorm_label <- function(x, mean, sd) { dnorm(x, mean = mean, sd = sd) }
mean_value <- 0
sd_value <- 1
set.seed(123)

# Standard normal density on a grid from -3 to 3
data <- data.frame(x = seq(-3, 3, length.out = 1000), y = dnorm(seq(-3, 3, length.out = 1000)))

# Label each grid point as lower tail, central 95% region, or upper tail
data$center <- cut(data$x, breaks = c(-Inf, -1.96, 1.96, Inf), labels = c("Lower", "Middle", "Upper"))

# Bell curve with the central 95% region shaded
ggplot(data, aes(x = x, y = y)) +
  geom_line(color = "blue") +
  geom_ribbon(data = subset(data, center == "Middle"), aes(ymin = 0, ymax = y), fill = "gray", alpha = 0.5) +
  labs(title = "Bell Curve with 95% Confidence Interval", x = "X-axis", y = "Density") +
  theme_minimal()

# Plain normal density curve from -4 to 4 standard deviations
ggplot(data.frame(x = c(-4, 4)), aes(x = x)) +
  stat_function(fun = dnorm_label, args = list(mean = mean_value, sd = sd_value), color = "blue") +
  labs(title = "Normal Distribution with P-value Labels", x = "Standard Deviations from Mean", y = "Density") +
  theme_minimal()
library(plotly)

# Simulate 1000 draws from a standard normal distribution
set.seed(42)
data <- rnorm(1000, mean = 0, sd = 1)

# Pair the sorted draws with evenly spaced cumulative probabilities
sorted_data <- sort(data)
cumulative_prob <- seq(0, 1, length.out = length(sorted_data))
cumulative_prob <- cumulative_prob / max(cumulative_prob)  # no-op here, since the maximum is already 1

cdf_plot <- plot_ly(x = sorted_data, y = cumulative_prob, type = "scatter", mode = "lines", name = "CDF")

# Pass the layout settings as named arguments to layout()
cdf_plot <- cdf_plot %>% layout(
  title = "Cumulative Distribution Function (CDF)",
  xaxis = list(title = "Data"),
  yaxis = list(title = "Cumulative Probability")
)
cdf_plot