- Hypothesis testing
- Type I and II error and power
- Confidence intervals
- Reading
- Chapter 1 Inference
- OpenIntro Statistics chapter 4
February 24, 2016
Confidence intervals
One Sample - observations come from one of two population distributions:
Two Sample - two samples are drawn, either:
True state of nature | Test result: reject \(H_0\) | Test result: fail to reject \(H_0\)
---|---|---
\(H_0\) TRUE | Type I error, probability = \(\alpha\) | No error, probability = \(1-\alpha\)
\(H_0\) FALSE | No error, probability is called power = \(1-\beta\) | Type II error, probability = \(\beta\) (false negative)
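The two rows of the table can be checked by simulation. The sketch below is only an illustration under assumed settings (a one-sample t-test of \(H_0: \mu = 0\), normal data with standard deviation 1, \(n = 30\), and a true mean of 0.5 when \(H_0\) is false); the helper `reject_rate()` is a hypothetical name, not something from the slides. It estimates how often the test rejects when \(H_0\) is true (the Type I error rate) and when it is false (the power), and compares the simulated power with base R's `power.t.test()`.

```r
# Monte Carlo estimate of the rejection rate of a one-sample t-test.
# All settings here (n, effect size, number of simulations) are assumptions
# chosen for illustration only.
set.seed(1)
alpha <- 0.05
n     <- 30
n_sim <- 5000

# Proportion of simulated samples in which H0: mu = 0 is rejected
reject_rate <- function(true_mean) {
  pvals <- replicate(n_sim, t.test(rnorm(n, mean = true_mean, sd = 1))$p.value)
  mean(pvals < alpha)
}

type1_rate <- reject_rate(0)    # H0 true: every rejection is a Type I error
power_sim  <- reject_rate(0.5)  # H0 false: every rejection is correct (power)

# Analytical power for the same setting, from base R
power_exact <- power.t.test(n = n, delta = 0.5, sd = 1, sig.level = alpha,
                            type = "one.sample")$power

c(type1 = type1_rate, power_sim = power_sim, power_exact = power_exact)
```

The first number should land close to \(\alpha\), and the two power values should roughly agree; \(\beta\) is one minus the power.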
Although \(\alpha = 0.05\) is a common cut-off for the p-value, there is no set border between “significant” and “insignificant,” only increasingly strong evidence against \(H_0\) (in favor of \(H_A\)) as the p-value gets smaller.
Example: a study with two primary endpoints gets a p-value of 0.055 for one endpoint, and 0.04 for the other endpoint. Should this be interpreted as strong evidence for one endpoint and no evidence for the other endpoint?
Demos: TeachingDemos::power.examp(), https://mramos.shinyapps.io/PowerCalc/
The functions in library(pwr) are useful for a lot of tests.
Which of the following are true?
library(downloader)
library(dplyr)
url <- "https://raw.githubusercontent.com/genomicsclass/dagdata/master/inst/extdata/mice_pheno.csv"
filename <- "mice_pheno.csv"
if (!file.exists(filename)) {
  download(url, destfile = filename)
}
dat <- read.csv("mice_pheno.csv")
chowPopulation <- dat[dat$Sex=="F" & dat$Diet=="chow",3]
(mu_chow <- mean(chowPopulation))
## [1] 23.89338
In practice, we don't know the true population parameter. We rely on random samples to estimate the population mean. Let's take a sample size of 30 (\(n = 30\)).
set.seed(1) # set a seed for random number generation
n <- 30
chow <- sample(chowPopulation, n)
chow %>% mean
## [1] 23.351
By the CLT, mean(chow) approximately follows a normal distribution with mean \(\mu_X\), or 23.8933778, and a standard error that we estimate from the sample:
(se <- sd(chow)/sqrt(n))
## [1] 0.4781652
For a normal sampling distribution and 95% CI:
(Zcrit <- qnorm(1 - 0.05/2))
## [1] 1.959964
pnorm(Zcrit) - pnorm(-Zcrit)
## [1] 0.95
95% CI shown on the \(N(0, 1)\) (\(Z\)) sampling distribution:
(interval <- c(mean(chow) - Zcrit*se, mean(chow) + Zcrit*se ))
## [1] 22.41381 24.28819
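Written out, the interval just computed has the usual large-sample form. Here \(\bar{X}\) stands for mean(chow), \(s\) for sd(chow), and \(z_{1-\alpha/2}\) for Zcrit with \(\alpha = 0.05\); the symbols are notation introduced for this note, and the numbers are the ones printed above.

\[
\bar{X} \pm z_{1-\alpha/2}\,\frac{s}{\sqrt{n}}
= 23.351 \pm 1.959964 \times 0.4781652
\approx (22.414,\ 24.288)
\]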
t.test(chow)
##
##  One Sample t-test
##
## data:  chow
## t = 48.835, df = 29, p-value < 2.2e-16
## alternative hypothesis: true mean is not equal to 0
## 95 percent confidence interval:
##  22.37304 24.32896
## sample estimates:
## mean of x
##    23.351
We show 250 random realizations of 95% confidence intervals. The color denotes whether or not the interval covers the parameter.
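A coverage experiment of this kind could be sketched as follows. To keep the example self-contained it does not use the mouse data: `population` is a simulated stand-in (an assumption, with arbitrarily chosen mean and standard deviation), and the variable names are illustrative. It draws 250 samples of size 30, builds a CLT-based 95% interval from each, and counts how many cover the population mean.

```r
set.seed(1)
# Stand-in population (assumption): the slides use chowPopulation instead
population <- rnorm(10000, mean = 24, sd = 3.5)
mu <- mean(population)

n     <- 30
B     <- 250
Zcrit <- qnorm(1 - 0.05/2)

# One row per simulated sample: lower and upper bound of the interval
intervals <- t(replicate(B, {
  x  <- sample(population, n)
  se <- sd(x) / sqrt(n)
  c(lower = mean(x) - Zcrit * se, upper = mean(x) + Zcrit * se)
}))

# TRUE when the interval contains the population mean
covered <- intervals[, "lower"] <= mu & mu <= intervals[, "upper"]
mean(covered)  # proportion of intervals covering mu
```

With \(n = 30\) the proportion should be close to, though not exactly, 0.95.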
For \(n=30\), the CLT works very well. However, what if \(n=5\)? Still trying to use the CLT:
CIs based on the CLT were too narrow. We need to use the t-distribution with \(df=4\):
Using the t-distribution, the intervals become wider and cover \(\mu_X\) more frequently, about 95% of the time.
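The difference between the two critical values at \(n = 5\) can be seen in a small simulation. This is a sketch under assumptions: `population` is again a simulated normal stand-in rather than the mouse data, and `coverage()` is a hypothetical helper. The same samples are used for both interval types so that only the critical value differs.

```r
set.seed(1)
# Stand-in population (assumption): not the mouse data from the slides
population <- rnorm(10000, mean = 24, sd = 3.5)
mu <- mean(population)

n <- 5
B <- 10000

# Each column is one sample of size n
samples <- replicate(B, sample(population, n))
xbar    <- colMeans(samples)
se      <- apply(samples, 2, sd) / sqrt(n)

# Proportion of intervals xbar +/- crit * se that contain mu
coverage <- function(crit) mean(abs(xbar - mu) <= crit * se)

cov_z <- coverage(qnorm(1 - 0.05/2))           # CLT-based interval
cov_t <- coverage(qt(1 - 0.05/2, df = n - 1))  # t-based interval, df = 4

c(z = cov_z, t = cov_t)
```

The z-based intervals should cover noticeably less often than 95%, while the t-based intervals should be close to the nominal level.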
qt(1- 0.05/2, df=4)
## [1] 2.776445
is bigger than…
qnorm(1- 0.05/2)
## [1] 1.959964
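To see how quickly the gap closes, the 97.5% quantile of the t-distribution can be tabulated for a few degrees of freedom and compared with the normal quantile. The particular values in `dfs` are arbitrary choices for illustration.

```r
# 97.5% critical values: t-distribution for several df vs. the normal
dfs    <- c(4, 9, 29, 99, 999)
t_crit <- qt(1 - 0.05/2, df = dfs)
z_crit <- qnorm(1 - 0.05/2)

data.frame(df         = dfs,
           t_crit     = round(t_crit, 3),
           ratio_to_z = round(t_crit / z_crit, 3))
```

The t critical value is always larger than the normal one and shrinks toward it as the degrees of freedom grow, which is why the correction matters at \(n = 5\) but much less at \(n = 30\).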
Let's do a two-sample test for mouse weights on chow and high-fat diets:
dat2 <- read.csv("femaleMiceWeights.csv")
controlIndex <- which(dat2$Diet=="chow")
treatmentIndex <- which(dat2$Diet=="hf")
control <- dat2[controlIndex, 2]
treatment <- dat2[treatmentIndex, 2]
t.test(treatment, control)
##
##  Welch Two Sample t-test
##
## data:  treatment and control
## t = 2.0552, df = 20.236, p-value = 0.053
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
##  -0.04296563  6.08463229
## sample estimates:
## mean of x mean of y
##  26.83417  23.81333
t.test(treatment, control, conf.level = 0.9)
##
##  Welch Two Sample t-test
##
## data:  treatment and control
## t = 2.0552, df = 20.236, p-value = 0.053
## alternative hypothesis: true difference in means is not equal to 0
## 90 percent confidence interval:
##  0.4871597 5.5545070
## sample estimates:
## mean of x mean of y
##  26.83417  23.81333
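The two outputs agree with each other: the 95% interval contains 0 because the p-value (0.053) is above 0.05, and the 90% interval excludes 0 because the same p-value is below 0.10. The sketch below recomputes the Welch statistic, degrees of freedom, p-value, and interval by hand to show where those numbers come from. It uses simulated stand-in data (an assumption, since femaleMiceWeights.csv is not bundled here), and `welch_ci()` is a hypothetical helper name.

```r
set.seed(1)
# Simulated stand-ins for the two diet groups (assumption; not the mouse data)
control   <- rnorm(12, mean = 24, sd = 3)
treatment <- rnorm(12, mean = 27, sd = 4)

# Variance of each sample mean
v_t <- var(treatment) / length(treatment)
v_c <- var(control)   / length(control)

# Welch t statistic and Welch-Satterthwaite degrees of freedom
diff_means <- mean(treatment) - mean(control)
se_diff    <- sqrt(v_t + v_c)
t_stat     <- diff_means / se_diff
df_welch   <- (v_t + v_c)^2 /
  (v_t^2 / (length(treatment) - 1) + v_c^2 / (length(control) - 1))

# Two-sided p-value and confidence interval for the difference in means
p_value  <- 2 * pt(-abs(t_stat), df = df_welch)
welch_ci <- function(conf.level) {
  crit <- qt(1 - (1 - conf.level) / 2, df = df_welch)
  diff_means + c(-1, 1) * crit * se_diff
}

# Compare the hand computation with t.test()
fit <- t.test(treatment, control)
rbind(by_hand = c(t = t_stat, df = df_welch, p = p_value),
      t.test  = c(fit$statistic, fit$parameter, fit$p.value))
welch_ci(0.95)
welch_ci(0.90)
```

For any confidence level, the interval excludes 0 exactly when the p-value is below one minus that level, which is the pattern seen in the 95% and 90% outputs above.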