An Illustration of Type I Error

A Type I error (a false alarm) occurs when a statistical test signals a significant difference where none exists.

If we take two samples from the same population, we'd expect the difference between the means of samples 1 and 2 to be very small (and hence not statistically significant). However, once in a while, our statistical tests will signal a significant difference anyway. How often? That is exactly what the significance threshold \( \alpha \) (alpha) controls: when the null hypothesis is true, a test rejects with probability (approximately) \( \alpha \).

The following code takes two random samples from the same population and uses a t-test to check whether their means are equal. It keeps drawing new samples until a Type I error is committed; the counter tracks how many tests run before the loop stops on that error.

# An illustration of Type I error.
# Create a population with mean = 100 and sd = 20
pop <- rnorm(1e+06, mean = 100, sd = 20)

# Take two random samples from the population
samp1 <- sample(pop, 100, replace = FALSE)
samp2 <- sample(pop, 100, replace = FALSE)

# the samples should be similar but not the same
summary(samp1)
##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
##    57.6    87.7    98.7    99.0   107.0   144.0
summary(samp2)

# test the sample means
test.result <- t.test(samp1, samp2)

The code above illustrates the use of the rnorm(), sample(), summary(), and t.test() functions. These should all be in your r-journals. See the lecture notes for help interpreting the t.test() results.
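
As an aside, the object returned by t.test() is a list, so you can extract individual values from it rather than reading them off the printed output. A minimal sketch, using the test.result object created above:

# extract individual pieces of the t-test result
test.result$p.value  # the p-value
test.result$statistic  # the t statistic
test.result$conf.int  # 95% confidence interval for the difference in means
test.result$estimate  # the two sample means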

The code below illustrates the prevalence of Type I errors. It allows you to choose a significance threshold (alpha), then keeps taking pairs of random samples from the same population until it commits a Type I error. Note: the code below is more involved; if you are new to programming, feel free to ignore it.

alpha <- 0.05  # significance threshold
counter <- 1  # count repetitions
repeat {
    samp1 <- sample(pop, 100, replace = FALSE)
    samp2 <- sample(pop, 100, replace = FALSE)
    test.result <- t.test(samp1, samp2)
    print(counter)
    if (test.result$p.value < alpha) 
        break
    counter <- counter + 1
}
## [1] 1
## [1] 2
## [1] 3
test.result  # print the result of the test that committed the Type I error
## 
##  Welch Two Sample t-test
## 
## data:  samp1 and samp2 
## t = 2.379, df = 197.2, p-value = 0.01833
## alternative hypothesis: true difference in means is not equal to 0 
## 95 percent confidence interval:
##   1.13 12.10 
## sample estimates:
## mean of x mean of y 
##    105.66     99.04
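
Incidentally, you can predict how long the loop above should run. Each repetition commits a Type I error with probability alpha, so the number of tests until the first error follows a geometric distribution with mean 1/alpha. A quick check, assuming alpha = 0.05 as above:

# expected number of tests until the first Type I error
1/alpha
## [1] 20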

Advanced Programmers Only!

An illustration of how alpha affects the prevalence of Type I errors.

numTests <- 1000  # number of t-tests to run
alphaSet <- c(0.001, 0.01, 0.05, 0.1, 0.2)  # set of alpha values to test
sigTests <- matrix(nrow = length(alphaSet) * numTests, ncol = 3)

counter <- 1
for (i in 1:numTests) {
    for (alpha in alphaSet) {
        # take two samples from the same population
        samp1 <- sample(pop, 100, repl = F)
        samp2 <- sample(pop, 100, repl = F)
        # test sample means
        test.result <- t.test(samp1, samp2)
        # record the results of the test
        if (test.result$p.value < alpha) {
            sigTests[counter, 1] <- 1
            sigTests[counter, 2] <- test.result$p.value
            sigTests[counter, 3] <- alpha
        } else {
            sigTests[counter, 1] <- 0
            sigTests[counter, 2] <- test.result$p.value
            sigTests[counter, 3] <- alpha
        }
        counter <- counter + 1
    }
}

sigTests <- as.data.frame(sigTests)  # convert to a data.frame (easier to manipulate)
names(sigTests) <- c("type_1_errors", "p-value", "alpha")  # assign column names

The results below show how alpha relates to Type I errors: out of the 1000 tests run at each alpha level, you should see roughly alpha × 1000 false alarms. For example, with an alpha of 0.2, roughly 20% of the tests (about 200 of 1000) should produce a false alarm.

aggregate(sigTests$type_1_errors ~ sigTests$alpha, FUN = sum)  # count Type I errors at each alpha
##   sigTests$alpha sigTests$type_1_errors
## 1          0.001                      1
## 2          0.010                     10
## 3          0.050                     43
## 4          0.100                    110
## 5          0.200                    222
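
Because type_1_errors is coded 0/1, replacing sum with mean in the same aggregate() call gives the false-alarm rate directly; each rate should land near its alpha value (a sketch; results will vary from run to run):

aggregate(sigTests$type_1_errors ~ sigTests$alpha, FUN = mean)  # false-alarm rate at each alpha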

An illustration of how alpha affects the prevalence of Type II errors. A Type II error (a miss) occurs when a test fails to detect a difference that actually exists.

numTests <- 1000  # number of t-tests to run
difference <- 5  # true difference in the population means
alphaSet <- c(0.001, 0.01, 0.05, 0.1, 0.2)  # set of alpha values to test
sigTests <- matrix(nrow = length(alphaSet) * numTests, ncol = 3)

counter <- 1
for (i in 1:numTests) {
    for (alpha in alphaSet) {
        # take two samples from DIFFERENT populations
        samp1 <- rnorm(100, mean = 100, sd = 10)
        samp2 <- rnorm(100, mean = 100 + difference, sd = 10)
        test.result <- t.test(samp1, samp2)
        # failing to reject here is a Type II error: the populations really differ
        if (test.result$p.value > alpha) {
            sigTests[counter, 1] <- 1
            sigTests[counter, 2] <- test.result$p.value
            sigTests[counter, 3] <- alpha
        } else {
            sigTests[counter, 1] <- 0
            sigTests[counter, 2] <- test.result$p.value
            sigTests[counter, 3] <- alpha
        }
        counter <- counter + 1
    }
}

sigTests <- as.data.frame(sigTests)  # convert to a data.frame (easier to manipulate)
names(sigTests) <- c("type_2_errors", "p-value", "alpha")  # assign column names

The results below show that when alpha is very low, Type II errors become more common: lowering alpha makes the test more conservative, so it more often misses differences that really exist. Choosing alpha is thus a trade-off between false alarms and misses.

aggregate(sigTests$type_2_errors ~ sigTests$alpha, FUN = sum)  # count Type II errors at each alpha
##   sigTests$alpha sigTests$type_2_errors
## 1          0.001                    374
## 2          0.010                    162
## 3          0.050                     56
## 4          0.100                     42
## 5          0.200                      9
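
You do not have to simulate to estimate these miss rates: base R's power.t.test() computes the power of a two-sample t-test analytically, and the Type II error rate is 1 minus the power. A sketch for the scenario simulated above (100 observations per group, a true difference of 5, sd of 10):

# analytical power for the scenario simulated above
pwr <- power.t.test(n = 100, delta = 5, sd = 10, sig.level = 0.05)
1 - pwr$power  # expected Type II error rate at alpha = 0.05

The result should roughly agree with the 0.05 row of the table above.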