Data Dive - Hypothesis Testing

ASSIGNMENT 7:

library(dplyr)

## 
## Attaching package: 'dplyr'

## The following objects are masked from 'package:stats':
## 
##     filter, lag

## The following objects are masked from 'package:base':
## 
##     intersect, setdiff, setequal, union

Obesity <- read.csv('/Users/ankit/Downloads/Obesity.csv')

Devise at least two different null hypotheses based on two different aspects (e.g., columns) of your data. For each hypothesis:

Come up with an alpha level, power level, and minimum effect size, and explain why you chose each value.
Determine if you have enough data to perform a Neyman-Pearson hypothesis test. If you do, perform one and interpret results. If not, explain why.
Perform a Fisher’s style test for significance, and interpret the p-value.

Alpha level: This is the threshold for statistical significance. It is the probability of making a Type I error, that is, rejecting the null hypothesis even though it is true.

Because I am working with health-related data, I want to keep the risk of a Type I error low. That’s why I am setting the alpha (significance) level to 0.01.

alpha <- 0.01
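
To see what this alpha means in practice, one can simulate many tests in which the null hypothesis is true and count how often it gets rejected. The sketch below is illustrative only: it uses simulated normal data rather than the Obesity dataset, and the group size of 50 and the 5000 repetitions are arbitrary assumptions.

```r
# Simulate two-sample t-tests where both groups come from the same
# distribution, so the null hypothesis is true by construction.
set.seed(1)
alpha <- 0.01
n_sims <- 5000        # assumed number of repetitions
n_per_group <- 50     # assumed group size, not taken from the data

p_values <- replicate(
  n_sims,
  t.test(rnorm(n_per_group), rnorm(n_per_group))$p.value
)

# Share of simulations that (wrongly) reject the null at this alpha
type1_rate <- mean(p_values < alpha)
type1_rate
```

The observed rejection rate should land close to 0.01, which is the long-run Type I error rate that alpha is meant to control.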

Power level: It is the probability of correctly rejecting a false null hypothesis, i.e. of avoiding a Type II error. I’ll choose 0.80, which means I want an 80% chance of detecting a true effect.

power <- 0.8
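
Power depends on the sample size, so it can help to look at how it changes as the groups grow. The sketch below uses `power.t.test()` from base R's `stats` package for a two-sample, two-sided t-test. The standardized difference of 0.2, the alpha of 0.01, and the candidate group sizes are assumptions made for illustration; they are not derived from the Obesity data.

```r
# Achieved power of a two-sample t-test for several candidate group sizes.
candidate_n <- c(100, 300, 600)   # assumed per-group sample sizes

achieved_power <- sapply(candidate_n, function(n) {
  power.t.test(
    n = n,             # observations per group
    delta = 0.2,       # assumed true difference in means
    sd = 1,            # with sd = 1, delta is a standardized difference
    sig.level = 0.01,
    type = "two.sample",
    alternative = "two.sided"
  )$power
})

data.frame(n_per_group = candidate_n, power = round(achieved_power, 3))
```

Under these assumptions, power rises with the group size and only passes the 0.80 target for the largest of the three candidates.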

Minimum effect size: It is the smallest difference or relationship that we consider practically meaningful.

minimum_effect_size <- 0.2
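
With alpha, power, and a minimum effect size fixed, the sample size needed for a Neyman-Pearson test can be estimated. The sketch below again uses `power.t.test()` and assumes the effect size is a standardized mean difference between two groups (so `sd = 1`); that interpretation is an assumption, as is the helper `has_enough_data()` and the group sizes passed to it, which are hypothetical placeholders rather than counts from the Obesity data.

```r
alpha <- 0.01
power <- 0.8
minimum_effect_size <- 0.2

# Solve for n (per group) by leaving it NULL.
result <- power.t.test(
  n = NULL,
  delta = minimum_effect_size,  # assumed to be a standardized difference
  sd = 1,
  sig.level = alpha,
  power = power,
  type = "two.sample",
  alternative = "two.sided"
)
required_n <- ceiling(result$n)
required_n

# Hypothetical helper: TRUE only if every group has at least the required n.
has_enough_data <- function(group_sizes, required_n) {
  all(group_sizes >= required_n)
}

# Placeholder group sizes for illustration only; in practice these would
# come from something like table() on the grouping column.
has_enough_data(c(700, 650), required_n)
has_enough_data(c(700, 300), required_n)
```

If any group falls short of `required_n`, the test would be underpowered for the chosen effect size, and that is worth stating when interpreting the result.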

Jagriti Mahajan

2023-10-08