10.1 - Error types and probabilities

Introduction

  • Recall a central goal of statistics is to learn about a parameter \(\theta\) that governs the population or underlying data generating mechanism from which the data came.

  • The objective of a statistical test is to assess a hypothesis about the values of one or more population parameters. We generally have a theory, or research hypothesis, about the parameter(s), and we want to see whether the data support it.

  • The research hypothesis is often denoted by \(H_a\), also known as the alternative hypothesis.

  • The default state, or the hypothesis that we “fall back on” if the data do not provide sufficient evidence for \(H_a\), is the null hypothesis, or \(H_0\).

Decision rules for testing hypotheses

Hypothesis tests are typically carried out in the following manner:

  1. Specify \(H_0\) and \(H_a\), two mutually exclusive states of reality.
  2. Obtain a test statistic \(T\), which contains information about \(\theta\).
  3. Specify a critical value \(c\) and corresponding rejection region (RR).
  4. If \(T\) falls in RR, reject \(H_0\). Otherwise fail to reject.

Decision errors

  • It’s of course possible that we make the incorrect decision, given the true state of reality.
  • The types of errors we can make under the two possible states of reality (the “null reality” and the “alternative reality”) can be summarized in 2x2 table format:
Decision               | \(H_0\) true     | \(H_0\) false
-----------------------|------------------|------------------
Reject \(H_0\)         | Type-I error     | Correct decision
Fail to reject \(H_0\) | Correct decision | Type-II error

Decision probabilities

  • We denote the probabilities of the two error types by \(\alpha\) and \(\beta\).
  • \(\alpha\):
    • The probability of committing a Type-I error, i.e. \(P(Reject\ H_0| H_0\ true)\).
    • Equivalently: \(P(T \in RR | H_0\ true)\).
    • Often referred to as the significance level or the size of the test.
  • \(\beta\):
    • The probability of committing a Type-II error, i.e. \(P(Fail\ to\ reject\ H_0|H_0\ false)\).
    • Equivalently: \(P(T \not \in RR | H_0\ false)\).
  • The power of a test is \(1-\beta = P(Reject\ H_0|H_0\ false)\).

The \(\alpha\)-\(\beta\) relationship

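  • Implication: \(\alpha\) and power are both probabilities that \(T\) falls in RR (under different states of reality), so enlarging RR can only raise them. Power can therefore always be trivially increased by accepting a larger \(\alpha\).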
    • Consider: what sort of “brain dead” test would have \(Power=1\)? What is \(\alpha\) for this test?
  • In practice: set \(\alpha\) to something small, try to maximize power.

Finding \(\alpha\) and \(\beta\)

  • To find \(\alpha\):
    • Specify sampling distribution of \(T\) under \(H_0\) state of reality
    • Find \(P(T\in RR | H_0\ true)\)
  • To find \(\beta\)/power:
    • Specify sampling distribution of \(T\) under an \(H_a\) state of reality
    • Find \(\beta = P(T\not\in RR | H_a\ state\ of\ reality)\)
    • \(Power = 1-\beta\), equivalently \(Power= P(T\in RR | H_a\ state\ of\ reality)\)

Example: balls in bucket

  • “Population”: a bucket with 10 balls.
  • \(H_0\) bucket: 5 red and 5 blue balls.
  • \(H_a\) bucket: 10 balls, more than half of which are red.
  • Sample: 10 balls drawn with replacement from the bucket.
  • Test statistic \(T\): number of red balls in the draw.
  • Decision rule: \(c=8 \Rightarrow\) reject \(H_0\) if \(T \ge 8\).
  1. What is \(\alpha\)?
  2. What would we need to specify to find \(\beta\) and power?

Question 1:

  • Under \(H_0, T\sim BIN(10, 0.5)\).
  • \(P(T\ge 8) = \sum_{t=8}^{10}\binom{10}{t}0.5^t0.5^{10-t}\).
1-pbinom(7,10,0.5)
[1] 0.0546875
  • \(\therefore \alpha = 0.0547\)
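As a sanity check on the exact calculation, the same probability can be approximated by simulation. This is only a minimal sketch: the seed, the number of replications, and the names `n_sims`, `t_sim`, `alpha_hat` and `alpha_exact` are illustrative choices, not part of the original example.

```r
set.seed(1)          # arbitrary seed, for reproducibility only
n_sims <- 1e5        # number of simulated samples (assumed; any large number works)

# Each simulated T is the number of red balls in 10 draws with replacement
# from the H0 bucket (5 red, 5 blue), so T ~ BIN(10, 0.5)
t_sim <- rbinom(n_sims, size = 10, prob = 0.5)

# Proportion of simulated samples that land in the rejection region T >= 8
alpha_hat <- mean(t_sim >= 8)

# Exact value for comparison
alpha_exact <- 1 - pbinom(7, 10, 0.5)

c(simulated = alpha_hat, exact = alpha_exact)
```

The simulated proportion should be close to the exact value, with the gap shrinking as `n_sims` grows.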

Question 2:

  • To find \(\beta\) or power, we need to specify an \(H_a\) state of reality, of which there are multiple possibilities!
  • One such state: \(H_a\) bucket contains 6 red and 4 blue
  • Under this state: \(T\sim BIN(10, 0.6)\)
  • Then \(\beta = P(T\le 7) = \sum_{t=0}^{7}\binom{10}{t}0.6^t0.4^{10-t}\)
pbinom(7,10,0.6)
[1] 0.8327102
  • \(\therefore \beta = 0.8327\), Power = \(0.1673\)
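Because \(\beta\) and power depend on which \(H_a\) bucket is assumed and on the cutoff \(c\), it can help to tabulate them. The sketch below repeats the `pbinom` calculation for every possible \(H_a\) bucket (6 to 10 red balls) and then, for the 6-red bucket, for every cutoff. Names such as `red_a`, `power_c8` and `cutoff` are illustrative only.

```r
# Possible H_a buckets: 6, 7, ..., 10 red balls out of 10
red_a <- 6:10
p_a   <- red_a / 10

# Power of the "reject if T >= 8" rule under each H_a bucket
power_c8 <- 1 - pbinom(7, size = 10, prob = p_a)
data.frame(red = red_a, power = round(power_c8, 4))

# alpha and power (against the 6-red bucket) for every cutoff c = 0, ..., 11
cutoff <- 0:11
alpha  <- 1 - pbinom(cutoff - 1, size = 10, prob = 0.5)
power  <- 1 - pbinom(cutoff - 1, size = 10, prob = 0.6)
data.frame(cutoff, alpha = round(alpha, 4), power = round(power, 4))
```

In the second table, `alpha` and `power` fall together as the cutoff grows, which is the \(\alpha\)-\(\beta\) trade-off described earlier.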

Example: exponential rate

  • Suppose \(Y_1,...,Y_{15} \stackrel{i.i.d.}{\sim}EXP(\lambda)\)
  • Will use sample to test:

\[H_0: \lambda = 1/5\] \[H_a: \lambda < 1/5\]

  • Test statistic: \(T=Y_{(1)}\)
  • Decision rule: reject \(H_0\) if \(Y_{(1)} \geq c\) (why does this make sense?)

Tasks:

  1. Find \(\alpha\) for \(c \in \{0.2, 0.8\}\).
  2. How should we set \(c\) to make \(\alpha = 0.05\)?
  3. If in fact \(\lambda = 1/25\), find \(\beta\) and the power if we use a size-\(\alpha=0.05\) test.
  4. Find an expression for the power as a function of any \(\lambda_a\).

Task 1: finding \(\alpha\) for set \(c\)

  • To find \(\alpha\), need to specify distribution of test statistic \(Y_{(1)}\) under \(H_0\).
  • We have previously shown that, for an i.i.d. sample of size \(n\) from \(EXP(\lambda)\) (rate parameterization):

\[Y_{(1)}\sim EXP(n\lambda)\]

  • With \(n=15\), under \(H_0: \lambda = 1/5,\) we have:

\[Y_{(1)}\sim EXP(15\cdot 1/5)=EXP(3)\]
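The distribution of the minimum can be checked empirically. The following is a rough simulation sketch, with an arbitrary seed and an assumed number of replications (`n_sims`), that compares simulated minima against \(EXP(3)\).

```r
set.seed(1)       # arbitrary seed, for reproducibility only
n_sims <- 1e4     # number of simulated samples (assumed)
n      <- 15
lambda <- 1/5     # rate under H0

# Each column is one sample of size 15; take the minimum of each sample
y     <- matrix(rexp(n_sims * n, rate = lambda), nrow = n)
y_min <- apply(y, 2, min)

# If Y_(1) ~ EXP(n * lambda) = EXP(3), its mean should be close to 1/3 ...
mean(y_min)

# ... and its upper-tail probabilities should be close to those of EXP(3)
c(simulated = mean(y_min > 0.2), exact = 1 - pexp(0.2, rate = n * lambda))
```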

When \(c = 0.2\):

\[\alpha = P(Reject|H_0\ true)\] \[=P(Y_{(1)} > 0.2) , Y_{(1)}\sim EXP(15\cdot 1/5)\]

\[ = e^{-0.2\cdot 3} = 0.5488\]

1-pexp(0.2, rate = 15*1/5)
[1] 0.5488116

When \(c = 0.8\):

\[\alpha = P(Reject|H_0\ true)\] \[=P(Y_{(1)} > 0.8), Y_{(1)}\sim EXP(15\cdot 1/5)\]

\[ = e^{-0.8\cdot 3} = 0.0907\]

1-pexp(0.8, rate = 15*1/5)
[1] 0.09071795

Task 2: Find \(c\) such that \(\alpha = 0.05\)

  • Under \(H_0: \lambda = 1/5\Rightarrow Y_{(1)} \sim EXP(15\cdot 1/5)\)

\[\alpha = P(Y_{(1)}> c) = e^{-c\cdot 3}\stackrel{set}{=}0.05\]

\[\Rightarrow c = -\frac{\ln(0.05)}{3} = 0.9985774\]
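The critical value can also be obtained in R, either from the closed form or as an upper-tail quantile of the null distribution of \(Y_{(1)}\). A small sketch, with illustrative variable names (`lambda0`, `c_closed`, `c_quant`):

```r
n       <- 15
lambda0 <- 1/5    # rate under H0
alpha   <- 0.05

# Closed form from solving exp(-c * n * lambda0) = alpha for c
c_closed <- -log(alpha) / (n * lambda0)

# Same value as the upper-alpha quantile of EXP(n * lambda0)
c_quant <- qexp(alpha, rate = n * lambda0, lower.tail = FALSE)

c(c_closed, c_quant)

# Check: the size of the resulting test should be 0.05
1 - pexp(c_closed, rate = n * lambda0)
```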

Task 3: Find \(\beta\) and power if \(\lambda = 1/25\)

  • Finding \(\beta\) and power for the size-0.05 test:

\[\beta = P(Fail\ to\ reject| H_0\ false) = P(Y_{(1)} < 0.9985774 | \lambda < 1/5)\]

  • There are infinitely many \(\lambda\) for which \(H_0\) is false, so we evaluate for one “\(H_a\)” reality; \(\lambda = 1/25\):

\[\beta = P(Y_{(1)} < 0.9985774), Y_{(1)} \sim EXP(15\cdot 1/25) \]

\[ = 1-e^{-0.9985774\cdot 0.6} = 0.4507\]

pexp(0.9985774, rate = 15/25)
[1] 0.4507197
  • Power = \(1-\beta\) = 0.5493

Task 4: Find power as function of \(\lambda_a\)

  • For our size-0.05 test:

\[Power = P(Reject| H_0\ false) = P(Y_{(1)} > 0.9985774 | \lambda =\lambda_a < 1/5)\]

  • For any \(\lambda_a <1/5\):

\[P(Y_{(1)} > 0.9985774) = e^{- 0.9985774 \cdot 15\cdot \lambda_a}\]

  • Interactive Desmos version: https://www.desmos.com/calculator/dbqkfi2ru7
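The power expression can be wrapped in a small R function so it is easy to evaluate at any \(\lambda_a\). This is a sketch: the function name `power_min_exp` and its arguments are hypothetical, with defaults set to the values used in this example.

```r
# Power of the size-alpha test based on Y_(1), as a function of the true rate.
# n, lambda0 and alpha default to the values used in this example.
power_min_exp <- function(lambda_a, n = 15, lambda0 = 1/5, alpha = 0.05) {
  c_crit <- -log(alpha) / (n * lambda0)   # critical value, as in Task 2
  exp(-c_crit * n * lambda_a)             # P(Y_(1) > c_crit) when the rate is lambda_a
}

power_min_exp(1/25)   # the Task 3 power, about 0.5493
power_min_exp(1/5)    # at the H0 boundary the power equals alpha = 0.05
power_min_exp(c(0.01, 0.05, 0.1, 0.15))
```

Substituting the critical value into the power expression gives \(\alpha^{\lambda_a/\lambda_0}\), so for this particular test the power does not depend on \(n\).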

Plotting power as function of \(\lambda_a\)

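library(tidyverse)  # provides ggplot2
ggplot() + 
  # power of the size-0.05 test: P(Y_(1) > 0.9985774) when Y_(1) ~ EXP(15 * lam)
  geom_function(fun=\(lam) 1-pexp(0.9985774, rate = 15*lam)) + 
  # alternative rates lambda_a below the null value 1/5
  xlim(c(0.001, 1/5)) + 
  # dashed reference line at alpha = 0.05
  geom_hline(aes(yintercept = 0.05), linetype=2)+ 
  scale_y_continuous(breaks = c(0.05, seq(0.2,1,by=.2)))+
  labs(y='Power', x = expression(lambda[a]))+
  theme_classic(base_size=14)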

Example: Beta shape

  • A sample of size \(n=1\) is drawn from a population with the following pdf:

\[f_Y(y) = (1+\theta) y^\theta; 0\leq y \leq 1.\]

  • Test:

\[H_0: \theta = 2\] \[H_a: \theta > 2\]

Tasks:

  1. Should we reject \(H_0\) for large or small \(Y\)?
  2. What is the size of the test if the decision rule is to reject for \(Y>3/4\)?
  3. What is the RR for a size-5% test?
  4. Find the power function of the size-5% test as a function of \(\theta_a\).

Task 1: Reject for large or small \(Y\)?

  • Since \(Y\sim BETA(\theta+1,1)\):

\[E(Y) = \frac{\theta+1}{\theta+2}\]

  • Larger values of \(\theta\) \(\Rightarrow\) larger expected \(Y\), so it makes sense to reject \(H_0\) in favor of \(H_a: \theta > 2\) for large values of \(Y\).

Task 2: Finding size when \(c=3/4\)

  • In general, for \(y\in (0,1)\):

\[P(Y > y) = \int_y^1 (1+\theta) t^\theta dt = 1-y^{\theta+1}\]

  • Under \(H_0: \theta=2\),

\[\alpha = P(Reject|H_0\ true) = P(Y>3/4| \theta=2) = 1-(3/4)^{2+1} = 0.5781\]
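The size can be double-checked in R in three ways: the closed form, the \(BETA(\theta+1, 1)\) cdf via `pbeta`, and numerical integration of the pdf. The variable names and the helper `f_y` below are illustrative only.

```r
theta0 <- 2
cutoff <- 3/4

# Closed form: P(Y > cutoff) = 1 - cutoff^(theta0 + 1)
size_closed <- 1 - cutoff^(theta0 + 1)

# Same probability from the BETA(theta0 + 1, 1) cdf
size_pbeta <- 1 - pbeta(cutoff, shape1 = theta0 + 1, shape2 = 1)

# And by numerically integrating the pdf over the rejection region
f_y <- function(y, theta) (1 + theta) * y^theta
size_integral <- integrate(f_y, lower = cutoff, upper = 1, theta = theta0)$value

c(size_closed, size_pbeta, size_integral)
```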

Task 3: Finding \(c\) so that \(\alpha = 0.05\)

\[0.05 \stackrel{set}{=} P(Y>c| \theta=2) = 1-c^{2+1}\]

\[\Rightarrow c^3 = 0.95 \Rightarrow c = \sqrt[3]{0.95} = 0.9830476\]
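Equivalently, \(c\) is the 0.95 quantile of the null distribution \(BETA(3, 1)\), which gives a quick check in R. A minimal sketch with illustrative names (`c_closed`, `c_quant`):

```r
theta0 <- 2
alpha  <- 0.05

# Closed form: cube root of 0.95
c_closed <- (1 - alpha)^(1 / (theta0 + 1))

# Same value as the (1 - alpha) quantile of BETA(theta0 + 1, 1)
c_quant <- qbeta(1 - alpha, shape1 = theta0 + 1, shape2 = 1)

c(c_closed, c_quant)

# Check: size of the test that rejects for Y > c_closed
1 - c_closed^(theta0 + 1)
```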

Task 4: Finding power function

\[Power = P(Reject|H_0\ false) = P(Y>0.9830476| \theta=\theta_a > 2)\] \[= 1-0.9830476^{\theta_a+1}\]

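library(tidyverse)  # provides ggplot2
ggplot() + 
  # power of the size-5% test: P(Y > 0.9830476) when theta = theta_a
  geom_function(fun=\(theta) 1-0.9830476^(theta+1)) + 
  # dashed reference line at alpha = 0.05
  geom_hline(aes(yintercept = 0.05), linetype=2)+ 
  scale_y_continuous(breaks = c(0.05, seq(0.2,1,by=.2)))+
  # x-axis starts at the null value theta = 2
  scale_x_continuous(breaks = c(2, seq(50,200,by=50)),
                       limits = c(2,200))+
  labs(y='Power', x = expression(theta[a]))+
  theme_classic(base_size=14)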
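With a single observation the power rises slowly in \(\theta_a\). One way to quantify this is to invert the power function and ask how large \(\theta_a\) must be to reach a target power. The sketch below does so; the helper names `power_beta` and `theta_for_power` and the 80% target are illustrative assumptions, not part of the original example.

```r
c_crit <- 0.95^(1/3)   # critical value of the size-5% test (Task 3)

# Power function from Task 4
power_beta <- function(theta_a) 1 - c_crit^(theta_a + 1)

power_beta(c(2, 10, 50, 100, 200))

# Value of theta_a at which the power reaches a target,
# from solving 1 - c_crit^(theta_a + 1) = target for theta_a
theta_for_power <- function(target) log(1 - target) / log(c_crit) - 1

theta_80 <- theta_for_power(0.80)   # 0.80 is an illustrative target
theta_80                            # roughly 93
power_beta(theta_80)                # recovers the target power
```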