Statistical hypothesis testing is the formal inferential framework for choosing between hypotheses. The null hypothesis, \(H_0\), is assumed true, and statistical evidence is required to reject it in favor of a research or alternative hypothesis, \(H_a\).
| Truth | Decide | Result |
|---|---|---|
| \(H_0\) | \(H_0\) | Correctly accept null |
| \(H_0\) | \(H_a\) | Type I error |
| \(H_a\) | \(H_a\) | Correctly reject null |
| \(H_a\) | \(H_0\) | Type II error |
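Choose the constant \(C\) so that the probability of a Type I error is 5% when \(\mu = 30\), with a standard error of \(10 / \sqrt{100} = 1\):

\[ \begin{align} 0.05 & = P\left(\bar X \geq C ~|~ \mu = 30 \right) \\ & = P\left(\frac{\bar X - 30}{10 / \sqrt{100}} \geq \frac{C - 30}{10/\sqrt{100}} ~|~ \mu = 30\right) \\ & = P\left(Z \geq \frac{C - 30}{1}\right) \end{align} \]

* Hence \((C - 30) / 1 = 1.645\), implying \(C = 31.645\)
* Since our observed sample mean, \(32\), exceeds \(31.645\), we reject the null hypothesis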
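As a quick numerical check of the calculation above, the critical value can be computed with `qnorm`. This is a minimal sketch; the variable names (`mu0`, `sigma`, `n`, `xbar`, `crit`) are our own labels for the numbers in the example, not part of the original notes.

```r
# Numbers taken from the example above; the names are illustrative.
mu0   <- 30    # mean under the null hypothesis
sigma <- 10    # standard deviation
n     <- 100   # sample size
alpha <- 0.05  # desired Type I error rate
xbar  <- 32    # observed sample mean

se   <- sigma / sqrt(n)               # standard error of the mean
crit <- mu0 + qnorm(1 - alpha) * se   # reject when the sample mean is >= crit
reject <- xbar >= crit

print(crit)
print(reject)
```

Writing the cutoff as `mu0 + qnorm(1 - alpha) * se` makes it easy to see how the rejection threshold moves when the sample size or the error rate changes.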
Using the father.son dataset from the UsingR package, we can test whether the population mean of sons' heights equals the population mean of fathers' heights. We take the difference in heights for each father-son pair and test whether the mean difference is 0 or nonzero, which we do with t.test.
##
## One Sample t-test
##
## data: father.son$sheight - father.son$fheight
## t = 11.789, df = 1077, p-value < 2.2e-16
## alternative hypothesis: true mean is not equal to 0
## 95 percent confidence interval:
## 0.8310296 1.1629160
## sample estimates:
## mean of x
## 0.9969728
Here the t statistic is 11.79 on 1077 degrees of freedom, with a p-value below 2.2e-16, so we reject the null hypothesis. You can also judge whether the values in the confidence interval (about 0.83 to 1.16) are of practical significance, since the interval is expressed in the units of the data that you're interested in.
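As a sanity check, the confidence interval can be rebuilt from the summary values printed in the output. The sketch below uses only the reported mean, t statistic, and degrees of freedom, so it agrees with the output only up to rounding. The guarded `t.test` call at the end is an assumption inferred from the `data:` line of the output, and it runs only if UsingR is installed.

```r
# Summary values copied from the t.test output above.
xbar  <- 0.9969728  # mean of the son-minus-father height differences
tstat <- 11.789     # reported t statistic
df    <- 1077       # reported degrees of freedom

se <- xbar / tstat  # standard error implied by t = xbar / se
ci <- xbar + c(-1, 1) * qt(0.975, df) * se  # two-sided 95% interval
print(ci)

# Two-sided p-value implied by the t statistic.
p_value <- 2 * pt(tstat, df, lower.tail = FALSE)
print(p_value)

# Assumed form of the original call, inferred from the output's "data:" line.
if (requireNamespace("UsingR", quietly = TRUE)) {
  data(father.son, package = "UsingR")
  print(t.test(father.son$sheight - father.son$fheight))
}
```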
\[ 0.05 = P\left(\frac{\bar X - 30}{s / \sqrt{16}} \geq t_{1-\alpha, 15} ~|~ \mu = 30 \right) \]

- So our test statistic is now \(\sqrt{16}(32 - 30) / 10 = 0.8\), while the critical value is \(t_{1-\alpha, 15} = 1.75\).
- Since \(0.8 < 1.75\), we now fail to reject.
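The same comparison can be sketched in R with `qt`. The variable names are illustrative labels for the numbers in the example (sample size 16, sample mean 32, sample standard deviation 10).

```r
# Numbers taken from the example above; the names are illustrative.
n     <- 16
xbar  <- 32
mu0   <- 30
s     <- 10
alpha <- 0.05

tstat <- (xbar - mu0) / (s / sqrt(n))  # test statistic
crit  <- qt(1 - alpha, df = n - 1)     # one-sided critical value, 15 df
reject <- tstat >= crit

print(tstat)
print(crit)
print(reject)
```

With the smaller sample the standard error grows from 1 to 2.5, so the same observed mean of 32 no longer clears the critical value.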
Suppose a friend has \(8\) children, \(7\) of whom are girls, and none are twins.

- Perform the relevant hypothesis test: \(H_0 : p = 0.5\) versus \(H_a : p > 0.5\).
- What is the relevant rejection region so that the probability of rejecting is (less than) 5%?
# Each call gives P(X > k) under H0, i.e. the probability of the rejection region [k + 1, 8]
print(pbinom(-1, size = 8, p = .5, lower.tail = FALSE))
## [1] 1
print(pbinom( 0, size = 8, p = .5, lower.tail = FALSE))
## [1] 0.9960938
print(pbinom( 1, size = 8, p = .5, lower.tail = FALSE))
## [1] 0.9648438
print(pbinom( 2, size = 8, p = .5, lower.tail = FALSE))
## [1] 0.8554688
print(pbinom( 3, size = 8, p = .5, lower.tail = FALSE))
## [1] 0.6367187
print(pbinom( 4, size = 8, p = .5, lower.tail = FALSE))
## [1] 0.3632813
print(pbinom( 5, size = 8, p = .5, lower.tail = FALSE))
## [1] 0.1445313
print(pbinom( 6, size = 8, p = .5, lower.tail = FALSE))
## [1] 0.03515625
print(pbinom( 7, size = 8, p = .5, lower.tail = FALSE))
## [1] 0.00390625
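The tail probabilities above can also be tabulated in one vectorized call, which makes it straightforward to pick out the rejection region. This is a minimal sketch; the names `cutoffs`, `type1`, and `smallest_cutoff` are our own.

```r
n        <- 8     # number of children
p0       <- 0.5   # probability of a girl under H0
alpha    <- 0.05  # target Type I error rate
observed <- 7     # number of girls observed

# For each cutoff c in 0..8, P(X >= c) under H0 is the Type I error
# rate of the rejection region [c, 8].
cutoffs <- 0:n
type1   <- pbinom(cutoffs - 1, size = n, prob = p0, lower.tail = FALSE)
print(data.frame(cutoff = cutoffs, type1 = type1))

# Smallest cutoff whose rejection region has probability at most alpha.
smallest_cutoff <- min(cutoffs[type1 <= alpha])

# P(X >= observed) under H0, the one-sided p-value.
p_value <- pbinom(observed - 1, size = n, prob = p0, lower.tail = FALSE)

print(smallest_cutoff)
print(p_value)
```

The region \([7, 8]\) is the largest one whose probability under \(H_0\) stays below 5%, and the observed count of 7 girls falls inside it, so this test rejects \(H_0\).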