Hypothesis Testing

2023-03-14

Introduction: What Is Hypothesis Testing?

In statistics, hypothesis testing is a process of making decisions based on data.
It is a way to determine if a statement about a population parameter is supported by evidence from sample data.
Hypothesis testing is a statistical method used to judge whether an observed result or effect could plausibly have occurred by chance alone or is statistically significant. It involves formulating two competing hypotheses, the null hypothesis and the alternative hypothesis, and using data to decide whether the evidence against the null hypothesis is strong enough to reject it in favor of the alternative.

Why Is Hypothesis Testing Important?

The null hypothesis represents the assumption that there is no significant difference between two populations or variables, while the alternative hypothesis represents the possibility that there is a significant difference. Hypothesis testing is important because it allows us to make inferences about population parameters based on sample data and to draw conclusions about the effectiveness of interventions or treatments.

How does it work?

Hypothesis testing works by calculating a test statistic, which measures the difference between the observed data and what would be expected under the null hypothesis. This test statistic is then compared to a critical value determined by the significance level and degrees of freedom of the data. If the test statistic falls within the rejection region, which is determined by the critical value, the null hypothesis is rejected in favor of the alternative hypothesis. If the test statistic falls outside of the rejection region, the null hypothesis is not rejected.
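The comparison described above can be sketched in R. This is a minimal illustration, not part of the original slides: it uses the built-in iris data, and the hypothesized mean `mu0 = 5` and all variable names are assumptions chosen for the example.

```r
# One-sample, two-sided t-test carried out by hand (illustrative values)
x <- iris$Sepal.Length[iris$Species == "setosa"]  # sample of setosa sepal lengths
mu0 <- 5       # hypothesized population mean (assumed for illustration)
alpha <- 0.05  # significance level

n <- length(x)
df <- n - 1                                    # degrees of freedom
t_stat <- (mean(x) - mu0) / (sd(x) / sqrt(n))  # test statistic
t_crit <- qt(1 - alpha / 2, df)                # two-sided critical value

# Reject H0 only if the statistic lies in the rejection region |t| > t_crit
reject <- abs(t_stat) > t_crit
c(t_stat = t_stat, t_crit = t_crit)
reject
```

For a two-sided test the rejection region has two tails, which is why the critical value is taken at `1 - alpha / 2` and compared against the absolute value of the statistic.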

Null and Alternative Hypotheses

- The first step in hypothesis testing is to state the null and alternative hypotheses.

- The null hypothesis (\(H_0\)) is a statement that a specified population parameter is equal to a hypothesized value, that is, that there is no difference between them.

- The alternative hypothesis (\(H_1\)) is a statement that the population parameter differs from the hypothesized value.

\(H_0: \mu = \mu_0\)

\(H_1: \mu \neq \mu_0\)
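In R, these two hypotheses map onto the `mu` and `alternative` arguments of `t.test()`. A minimal sketch, assuming a hypothesized value of 6 cm for the mean sepal length of versicolor flowers in the built-in iris data (an illustrative choice, not taken from the slides):

```r
# H0: mu = mu0   versus   H1: mu != mu0
x <- iris$Sepal.Length[iris$Species == "versicolor"]
mu0 <- 6  # hypothesized mean (assumed for illustration)

res <- t.test(x, mu = mu0, alternative = "two.sided")

res$null.value   # the value of mu stated in H0
res$alternative  # the form of H1
res$p.value      # evidence against H0
```

Setting `alternative = "greater"` or `alternative = "less"` would instead express a one-sided alternative hypothesis.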

Example Null and Alternative Hypothesis

Before analyzing any data, we first need to define what we want to test. Let’s say we want to test whether there is a significant difference in the sepal length and width between the setosa and versicolor species of iris flowers. We can formulate the null and alternative hypotheses as follows:

Null hypothesis (\(H_0\)): There is no significant difference in the sepal length and width between the setosa and versicolor species of iris flowers.

Alternative hypothesis (\(H_1\)): There is a significant difference in the sepal length and/or width between the setosa and versicolor species of iris flowers.

To test these hypotheses, we can perform a two-sample t-test for each measurement, comparing the setosa and versicolor species; the test shown here is for sepal length. We can also use a scatter plot to visually inspect the data and see whether the two species appear to differ in sepal length and/or width.

First, let’s look at the graph, then the t-test…

Null and Alternative Hypotheses Plotted
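The original figure is not reproduced here. One way a comparable scatter plot could be drawn with base R graphics is sketched below; the colours, point style, labels, and legend position are all assumptions.

```r
# Keep only the two species being compared and drop the unused factor level
iris_subset <- droplevels(subset(iris, Species %in% c("setosa", "versicolor")))

# Hypothetical colour choice, one colour per species
species_cols <- c(setosa = "steelblue", versicolor = "darkorange")

plot(Sepal.Width ~ Sepal.Length, data = iris_subset,
     col = species_cols[as.character(iris_subset$Species)], pch = 19,
     xlab = "Sepal length (cm)", ylab = "Sepal width (cm)",
     main = "Setosa vs. versicolor sepal measurements")
legend("topright", legend = names(species_cols), col = species_cols, pch = 19)
```

If the two clouds of points barely overlap along an axis, that is an informal hint that the corresponding means differ, which the t-test then assesses formally.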

Two-Sample t-Test

data(iris)
# Subset the iris data set to include only setosa and versicolor species
iris_subset <- subset(iris, Species %in% c("setosa", "versicolor"))

# Perform a Welch two-sample t-test comparing sepal length between the setosa and versicolor species
t_test_result <- t.test(Sepal.Length ~ Species, data = iris_subset)
t_test_result

    Welch Two Sample t-test

data:  Sepal.Length by Species
t = -10.521, df = 86.538, p-value < 2.2e-16
alternative hypothesis: true difference in means between group setosa and group versicolor is not equal to 0
95 percent confidence interval:
 -1.1057074 -0.7542926
sample estimates:
    mean in group setosa mean in group versicolor 
                   5.006                    5.936

Example cont…

To determine whether we reject or fail to reject the null hypothesis, we need to compare the p-value from the two-sample t-test to our chosen level of significance (\(\alpha\)). If the p-value is less than or equal to \(\alpha\), we reject the null hypothesis and conclude that there is evidence of a significant difference in sepal length between the setosa and versicolor species. If the p-value is greater than \(\alpha\), we fail to reject the null hypothesis and conclude that there is not enough evidence to suggest a significant difference in sepal length between the two species. The same procedure applies to sepal width, which requires its own t-test.

Let’s say we choose a significance level of \(\alpha = 0.05\). When we perform the two-sample t-test in R using the code shown earlier, the output reports a p-value below 2.2e-16, which is far less than our chosen level of significance of 0.05. Therefore, we reject the null hypothesis and conclude that there is evidence of a significant difference in the sepal length between the setosa and versicolor species.
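The decision rule can also be applied programmatically by reading the p-value from the object that `t.test()` returns. The sketch below repeats the sepal length test and, as an addition not in the original slides, runs a second test for sepal width; the names `length_test` and `width_test` are illustrative.

```r
iris_subset <- subset(iris, Species %in% c("setosa", "versicolor"))
alpha <- 0.05  # chosen significance level

# One Welch two-sample t-test per measurement
length_test <- t.test(Sepal.Length ~ Species, data = iris_subset)
width_test  <- t.test(Sepal.Width ~ Species, data = iris_subset)

# TRUE means "reject H0 at level alpha"
reject_length <- length_test$p.value <= alpha
reject_width  <- width_test$p.value <= alpha
c(length = reject_length, width = reject_width)
```

Because two tests are run, the chance of at least one false rejection is somewhat higher than \(\alpha\); a correction such as `p.adjust()` could be applied if that matters for the analysis.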

Type I and Type II Errors

A Type I error occurs when the null hypothesis (\(H_0\)) is true, but we reject it based on our sample data. The probability of making a Type I error is denoted by \(\alpha\) and is typically set at 0.05 or 0.01. The formula for a Type I error is:

\[\text{Type I error} = P(\text{reject } H_0 \mid H_0 \text{ is true}) = \alpha\]
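A small simulation can make this probability concrete. In the sketch below, both samples are drawn from the same normal distribution, so \(H_0\) is true by construction and every rejection is a Type I error. The sample size, number of repetitions, and seed are arbitrary assumptions.

```r
set.seed(1)    # for reproducibility
alpha <- 0.05
n_sims <- 2000 # number of simulated experiments (arbitrary)

# Each experiment compares two samples from the same N(0, 1) population
p_values <- replicate(n_sims, t.test(rnorm(30), rnorm(30))$p.value)

# Proportion of experiments in which a true H0 was rejected
type1_rate <- mean(p_values <= alpha)
type1_rate
```

The observed rejection rate should land close to \(\alpha\), with some simulation noise.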

A Type II error occurs when the null hypothesis (\(H_0\)) is false, but we fail to reject it based on our sample data. The probability of making a Type II error is denoted by \(\beta\). The power of a test is \(1 - \beta\), which represents the probability of correctly rejecting a false null hypothesis. The formula for a Type II error is:

\[\text{Type II error} = P(\text{fail to reject } H_0 \mid H_0 \text{ is false}) = \beta\]

It is worth noting that, for a fixed sample size and effect size, the probabilities of Type I and Type II errors trade off against each other. That is, if we decrease the probability of making a Type I error by lowering the significance level (\(\alpha\)), we increase the probability of making a Type II error (\(\beta\)), and vice versa. Therefore, it is important to choose an appropriate significance level based on the consequences of making each type of error.

The next few slides show what typical Type I and Type II errors look like.
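This trade-off can be checked numerically with `power.t.test()` from the stats package. The design below (20 observations per group, a true difference of 0.5, and a standard deviation of 1) is a hypothetical example, and `beta_at()` is an illustrative helper rather than a library function.

```r
# Type II error probability of a two-sample t-test at a given alpha,
# for a hypothetical design: n = 20 per group, true difference 0.5, sd 1
beta_at <- function(alpha) {
  power <- power.t.test(n = 20, delta = 0.5, sd = 1, sig.level = alpha)$power
  1 - power  # beta = 1 - power
}

beta_05 <- beta_at(0.05)
beta_01 <- beta_at(0.01)
c(alpha_0.05 = beta_05, alpha_0.01 = beta_01)
```

Lowering \(\alpha\) from 0.05 to 0.01 raises \(\beta\) for this design; increasing the sample size is the usual way to reduce both error probabilities at once.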

Type I Error

Type II Error

Conclusion

In conclusion, hypothesis testing is a crucial tool in statistics that helps us make inferences about population parameters based on sample data. It involves formulating two competing hypotheses, the null and alternative hypotheses, and using data to decide whether there is enough evidence to reject the null hypothesis. Hypothesis testing is essential in judging whether an observed result or effect could plausibly have occurred by chance or is statistically significant. It is used in a wide range of fields, including scientific research, healthcare, engineering, business, and social sciences, to draw conclusions about the effectiveness of interventions, treatments, or programs. By using hypothesis testing, we can make data-driven decisions that are based on evidence rather than intuition or personal biases.