2024-06-04
Statistics is concerned with gathering, evaluating, presenting, and interpreting data. A statistical technique called hypothesis testing is used to assess statements regarding a population parameter based on sample data.
There are two main types of hypothesis tests: one-tailed tests, where the alternative hypothesis specifies a direction, and two-tailed tests, where it does not. The example below uses a two-tailed test.
Scenario: We want to test if a coin is fair (H0: p = 0.5) or biased (Ha: p ≠ 0.5), where p is the probability of heads. We flip the coin 100 times and get 60 heads.
flips <- rbinom(100, 1, 0.5) # simulate 100 flips under H0 (probability of heads = 0.5)
binom.test(sum(flips), 100, 0.5, alternative = "two.sided")
## 
##  Exact binomial test
## 
## data:  sum(flips) and 100
## number of successes = 46, number of trials = 100, p-value = 0.4841
## alternative hypothesis: true probability of success is not equal to 0.5
## 95 percent confidence interval:
##  0.3598434 0.5625884
## sample estimates:
## probability of success 
##                   0.46
Because the flips were simulated from a fair coin, the observed number of heads (46) is close to the expected 50, so the p-value is high (0.4841 > 0.05). We therefore fail to reject the null hypothesis and conclude there is not enough evidence to say the coin is biased.
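Note that the simulation above draws fresh data under H0 rather than using the 60 heads from the scenario. To test the observed count directly, we can pass it straight to binom.test (a minimal sketch; the exact two-sided p-value comes out near 0.057, just above the usual 0.05 cutoff, so we would still fail to reject H0):

binom.test(60, 100, 0.5, alternative = "two.sided") # test the observed 60 heads in 100 flips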
The probability of getting k heads in n flips from a fair coin (p = 0.5) follows the binomial distribution:
\[ P(X=k)=\binom{n}{k}p^k(1-p)^{n-k} \]
where:
- P(X = k) is the probability of getting exactly k heads.
- n is the number of flips.
- k is the number of heads.
- p is the probability of heads (0.5 for a fair coin).
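To connect the formula to the test, we can evaluate the binomial probability mass function with dbinom and rebuild the exact two-sided p-value by summing the probabilities of all outcomes no more likely than the observed count. This is a sketch using the 60-heads scenario above; binom.test applies essentially the same rule.

n <- 100; p0 <- 0.5; k <- 60
dbinom(k, size = n, prob = p0) # P(X = 60) under H0, i.e. choose(n, k) * p0^k * (1 - p0)^(n - k)
probs <- dbinom(0:n, size = n, prob = p0) # P(X = x) for every possible number of heads x
sum(probs[probs <= dbinom(k, n, p0)]) # exact two-sided p-value: outcomes no more likely than the observed one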
The decision rule at significance level \(\alpha\) is:

\[
\begin{aligned}
&\text{Reject } H_0 \text{ if } p\text{-value} < \alpha \\
&\text{Fail to reject } H_0 \text{ if } p\text{-value} \geq \alpha
\end{aligned}
\]
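As a sketch of the decision rule in code, assuming the 60-heads scenario and \(\alpha = 0.05\):

alpha <- 0.05 # chosen significance level
test <- binom.test(60, 100, 0.5, alternative = "two.sided")
if (test$p.value < alpha) "Reject H0" else "Fail to reject H0" # compare the p-value to alpha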