What is hypothesis testing?
A statistical method used to make inferences about a population parameter based on sample data.
What is the objective?
To evaluate whether the sample provides enough evidence to reject the null hypothesis.
Hypothesis: A proposed claim that we want to investigate.
Null Hypothesis: \(H_0\)
Represents a default assumption, typically stating no effect or no difference.
Alternative Hypothesis: \(H_1\)
Contradicts the null hypothesis, suggesting an effect or difference exists.
We are going to assume that \(H_0\) is true unless the evidence says that we need to reject it and accept \(H_1\).
Thus, the possible outcomes include:
- Reject the Null Hypothesis \(H_0\) and accept the Alternative Hypothesis \(H_1\).
- Fail to reject the Null Hypothesis \(H_0\).
Level of Confidence: How confident we want to be in the conclusion of the test. We denote it by \(c\); typical values are 95% or 99%.
Level of Significance: Denoted by \(\alpha\), it is the complement of the level of confidence: \(\alpha = 1 - c\).
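As a small illustration of how \(c\) and \(\alpha\) relate, the sketch below converts a confidence level into a significance level and looks up the matching two-sided critical value of the standard normal distribution. The 95% level is only an example value, and the variable names are chosen for this illustration.

```r
# Confidence level c (example value) and significance level alpha = 1 - c
conf_level <- 0.95
alpha <- 1 - conf_level

# Two-sided critical value of the standard normal distribution:
# the point that leaves alpha / 2 in each tail
z_crit <- qnorm(1 - alpha / 2)

print(alpha)
print(round(z_crit, 2))  # 1.96
```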
Type of errors:
Type I Error (α): Rejecting the null hypothesis when it is true.
Type II Error (β): Failing to reject the null hypothesis when it is false.
If p-value < α, reject \(H_0\); otherwise, do not reject \(H_0\).
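A short simulation can make the link between this decision rule and the Type I error concrete. The sketch below assumes that \(H_0\) is true (data drawn from a normal distribution with mean 0) and applies the rule many times with a one-sample t-test; the sample size, number of repetitions, and seed are arbitrary choices for illustration.

```r
set.seed(1)      # arbitrary seed, for reproducibility only
alpha  <- 0.05   # significance level
n_sims <- 2000   # number of simulated samples (illustrative)
n      <- 30     # size of each sample (illustrative)

# Each replicate draws a sample for which H0 (mean = 0) is true
# and records the p-value of a one-sample t-test
p_values <- replicate(n_sims, t.test(rnorm(n, mean = 0, sd = 1), mu = 0)$p.value)

# Share of samples where the rule "p-value < alpha" rejects a true H0
type1_rate <- mean(p_values < alpha)
print(type1_rate)
```

Under these assumptions the rejection rate should land close to \(\alpha\), since every rejection here is a Type I error.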
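Z-test statistic (population standard deviation \(\sigma\) known, \(\mu\) is the mean under \(H_0\)): \(Z = \frac{\bar{X} - \mu}{\sigma / \sqrt{n}}\)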
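The Z statistic and the p-value rule can be worked through with a few numbers. The summary values below are hypothetical and picked so the arithmetic is easy to follow; the test is assumed to be two-sided.

```r
# Hypothetical summary data (assumed for illustration)
x_bar <- 52    # sample mean
mu0   <- 50    # population mean under H0
sigma <- 6     # known population standard deviation
n     <- 36    # sample size
alpha <- 0.05  # significance level

# Z statistic: (52 - 50) / (6 / sqrt(36)) = 2
z <- (x_bar - mu0) / (sigma / sqrt(n))

# Two-sided p-value from the standard normal distribution
p_value <- 2 * pnorm(-abs(z))

# Decision rule: reject H0 when the p-value is below alpha
reject_h0 <- p_value < alpha

print(z)
print(round(p_value, 4))
print(reject_h0)
```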
The t-test is used when the population standard deviation is unknown or the sample size is less than 30. The test statistic follows a Student's t distribution with \(n - 1\) degrees of freedom.
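A one-sample t-test can be sketched as follows. The data vector and the hypothesized mean are made up for illustration; the statistic is computed by hand with the sample standard deviation and then compared with R's built-in `t.test`.

```r
# Hypothetical small sample (assumed for illustration)
x   <- c(5.1, 4.9, 5.6, 5.8, 6.0, 5.3, 4.7, 5.5)
mu0 <- 5          # mean under H0
n   <- length(x)

# t statistic by hand, using the sample standard deviation s
t_manual <- (mean(x) - mu0) / (sd(x) / sqrt(n))

# Same test with the built-in function
result <- t.test(x, mu = mu0)

print(t_manual)
print(unname(result$statistic))  # matches t_manual
print(unname(result$parameter))  # degrees of freedom, n - 1
```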
Null Hypothesis: Means are equal
Alternative Hypothesis: Means are unequal
We test the null hypothesis that two population means are equal (\(H_0: \mu_1 = \mu_2\)) against the alternative hypothesis that they differ (\(H_1: \mu_1 \neq \mu_2\)). We can use the two-sample t-test:
\(t = \frac{\bar{x}_1 - \bar{x}_2}{\sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}}}\)
We reject the null hypothesis if the absolute value \(|t|\) is greater than the critical value for the chosen \(\alpha\).
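The two-sample statistic above can be reproduced in a few lines. The two groups below are hypothetical data, and the sketch assumes the unequal-variance (Welch) form of the test, which is what `t.test` uses by default and which matches the standard error in the formula.

```r
# Hypothetical measurements from two groups (assumed for illustration)
g1 <- c(12.1, 11.8, 13.0, 12.6, 12.9, 11.5, 12.3)
g2 <- c(10.9, 11.2, 10.4, 11.9, 10.7, 11.1)
alpha <- 0.05

n1 <- length(g1)
n2 <- length(g2)

# Standard error and t statistic from the formula
se       <- sqrt(var(g1) / n1 + var(g2) / n2)
t_manual <- (mean(g1) - mean(g2)) / se

# Built-in Welch two-sample t-test (var.equal = FALSE is the default)
result <- t.test(g1, g2)

# Two-sided critical value, using the degrees of freedom reported by t.test
t_crit    <- qt(1 - alpha / 2, df = result$parameter)
reject_h0 <- abs(t_manual) > t_crit

print(t_manual)
print(unname(result$statistic))  # matches t_manual
print(reject_h0)
```

Comparing \(|t|\) with the critical value and comparing the p-value with \(\alpha\) lead to the same decision.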
ggplot #1 R code
library(ggplot2)
# Standard normal density evaluated on a grid of x values
xvals <- seq(-10, 10, by = 0.01)
df <- data.frame(x = xvals, y = dnorm(xvals, mean = 0, sd = 1))
ggplot(df, aes(x = x, y = y)) +
  geom_line(color = "lightpink") +
  labs(x = "x", y = "y")
ggplot #2 R code
library(ggplot2)
# Student's t density with 5 degrees of freedom
xvals <- seq(-4, 4, by = 0.01)
deg_free <- 5
t_df <- data.frame(x = xvals, y = dt(xvals, df = deg_free))
ggplot(t_df, aes(x = x, y = y)) +
  geom_line(color = "skyblue") +
  labs(x = "x", y = "y")