class: title-slide .row[ .col-7[ .title[ # Hypothesis Testing ] .subtitle[ ## Hypothesis Testing ] .author[ ### Laxmikant Soni <br> [blog](https://laxmikants.github.io) <br> [<i class="fab fa-github"></i>](https://github.com/laxmiaknts) [<i class="fab fa-twitter"></i>](https://twitter.com/laxmikantsoni09) ] .affiliation[ ] ] .col-5[ .logo[ <img src="figures/rmarkdown.png" width="480" /> ] ] ] --- # Statistical Hypothesis Testing .pull-left[ ## Hypothesis Testing * **Definition**: Hypothesis testing is a statistical method that uses sample data to evaluate a hypothesis about a population parameter. * **Key Terms**: - **Null Hypothesis `\(H_0\)`**: Assumes no effect or no difference in the population. - **Alternative Hypothesis `\(H_1\)`**: Assumes an effect or difference exists. - **p-value**: Probability of obtaining a test result at least as extreme as the one observed, assuming `\(H_0\)` is true. * **Example**: Testing if a new drug is more effective than a placebo. `\(H_0\)`: The drug has no effect. `\(H_1\)`: The drug has a positive effect. ] -- .pull-right[ * **Hypothesis Testing Steps**: 1. **State the Hypotheses**: Define `\(H_0\)` and `\(H_1\)` 2. **Select a Significance Level `\(alpha\)`**: Common choices are 0.05, 0.01. 3. **Compute the Test Statistic**: Based on the sample data (e.g., `\(t-statistic\)`, `\(z-score\)`. 4. **Calculate the p-value**: Determines the strength of the evidence against `\(H_0\)`. 5. **Make a Decision**: Reject `\(H_0\)` if `\(p \leq alpha\)`; otherwise, do not reject `\(H_0\)`. * **Formula for a Test Statistic (e.g., t-test)**: `\(t = \frac{\bar{x} - \mu}{\frac{s}{\sqrt{n}}}\)` ] --- # Statistical Hypothesis Testing .pull-top[ ## Z-Test | **Test Name** | **Use** | **Example** | |------------------|-----------------------------------------------------------------------------------------------------------------|---------------------------------------------------------------------------------------------------------------------| | **Z-Test** | For testing means with a known population variance and a large sample size (typically \( n > 30 \)). | Testing if the mean height of a group of students differs from the known population mean height of 160 cm. | ] -- .pull-bottom[ Example Statement: "A researcher wants to determine if the average weight of adult males in a specific city differs from the known national average weight of 180 pounds. A random sample of 50 adult males from the city is taken, and their average weight is found to be 185 pounds with a known population standard deviation of 15 pounds. The researcher will use a Z-Test to assess whether this difference is statistically significant." `$$Z = \frac{\bar{X} - \mu}{\frac{\sigma}{\sqrt{n}}}$$` ] --- # Statistical Hypothesis Testing .pull-top[ ## t-Test | **Test Name** | **Use** | **Example** | |-------------------------------------------|---------------------------------------------------------------------------------------------------------------------------------------|---------------------------------------------------------------------------------------------------------------------| | **t-Test** | When testing means with an unknown population variance and a small sample size (typically \( n < 30 \)). | Testing if a new teaching method affects test scores by comparing scores before and after implementing the method. | | - *One-sample t-test* | Compares sample mean to a known population mean. | Testing if the average score of a class of 20 students on a math exam is different from the known average score of 75. | | - *Independent (two-sample) t-test* | Compares means of two independent groups. | Comparing the test scores of students from two different schools to see if one school performs better than the other. | | - *Paired t-test* | Compares means of two related groups (e.g., before-and-after measurements). | Measuring the weight of participants before and after a diet program to see if there is a significant weight loss. | ] -- .pull-bottom[ `\(t = \frac{\bar{X} - \mu}{\frac{s}{\sqrt{n}}}\)` ] --- # Statistical Hypothesis Testing .pull-top[ ## F-Test | **Test Name** | **Use** | **Example** | |-------------------------|---------------------------------------------------------------------------------------------------------------------------------------|---------------------------------------------------------------------------------------------------------------------| | **F-Test** | To compare variances of two populations, often as a preliminary test before conducting an ANOVA. | Testing if two manufacturing processes have different levels of variability in production. | ] -- .pull-bottom[ A company wants to compare the variability in assembly times between two different manufacturing processes to ensure consistent production. They collect sample assembly times from each process and use an F-Test to determine if there is a significant difference in the variances of the two processes `$$F = \frac{\text{Variance}_1}{\text{Variance}_2}$$` ] --- # Statistical Hypothesis Testing .pull-top[ ## ANOVA **ANOVA (Analysis of Variance)** - **Types**: - **One-way ANOVA**: Tests for differences in means among three or more independent groups. - **Two-way ANOVA**: Tests for the influence of two categorical variables on a continuous outcome. - **Use**: When comparing means across multiple groups (more than two). - **Example**: Testing if three different fertilizers result in different crop yields. ] -- .pull-bottom[ The formula for one-way ANOVA is given by: `$$F = \frac{\text{MS}_{\text{between}}}{\text{MS}_{\text{within}}}$$` Where: - `\(F\)` = F statistic - `\(\text{MS}_{\text{between}}\)` = Mean Square Between Groups - `\(\text{MS}_{\text{within}}\)` = Mean Square Within Groups ] --- # Statistical Hypothesis Testing .pull-top[ ## ANOVA Sum of Squares Calculations The sum of squares for between and within groups are calculated as follows: 1. **Sum of Squares Between Groups** `\(SS_{\text{between}}\)`: `$$SS_{\text{between}} = \sum_{i=1}^{k} n_i (\bar{X}_i - \bar{X})^2$$` Where: - `\(k\)` = number of groups - `\(n_i\)` = number of observations in group `\(i\)` - `\(\bar{X}_i\)` = mean of group `\(i\)` - `\(\bar{X}\)` = overall mean of all groups combined ] --- # Statistical Hypothesis Testing .pull-top[ ## ANOVA Sum of Squares Calculations 2. **Sum of Squares Within Groups** `\(SS_{\text{within}}\)`: `$$SS_{\text{within}} = \sum_{i=1}^{k} \sum_{j=1}^{n_i} (X_{ij} - \bar{X}_i)^2$$` Where: - `\(X_{ij}\)` = individual observation `\(j\)` in group `\(i\)` - `\(\bar{X}_i\)` = mean of group `\(i\)` ] --- # Statistical Hypothesis Testing .pull-top[ ## Chi-Square Test * **Definition**: The Chi-Square Test is a statistical test used to determine if there is a significant association between categorical variables. It is particularly useful for examining the independence of variables in a contingency table. * **Types of Chi-Square Tests**: - **Chi-Square Test for Independence**: Tests if there is a relationship between two categorical variables. - **Chi-Square Goodness-of-Fit Test**: Tests if the observed distribution of a single categorical variable fits an expected distribution. ] --- # Statistical Hypothesis Testing .pull-top[ ## Chi-Square Test * **Formula**: The Chi-Square statistic ($\chi^2$) is calculated as follows: $$ \chi^2 = \sum \frac{(O - E)^2}{E} $$ Where: - `\(\chi^2\)` = Chi-Square statistic - `\(O\)` = observed frequency - `\(E\)` = expected frequency * **Example**: A researcher wants to test if there is an association between smoking habits (smoker, non-smoker) and exercise frequency (low, medium, high) among a group of adults. They collect a sample and record the observed frequencies in each category combination. They then use the Chi-Square Test for Independence to determine if there is a statistically significant association between smoking status and exercise frequency. ] --- class: inverse, center, middle # Thanks