10.1 - Error types and probabilities

Introduction

Recall a central goal of statistics is to learn about a parameter \(\theta\) that governs the population or underlying data generating mechanism from which the data came.
The objective of a statistical test is to test a hypothesis concerning the values of one or more of the population parameters. We generally have a theory, or research hypothesis, about the parameter(s), and we want to see whether the data support it.
The research hypothesis is often denoted by \(H_a\), also known as the alternative hypothesis.
The default state, or the hypothesis that we “fall back on” if the data do not provide sufficient evidence for \(H_a\), is the null hypothesis, or \(H_0\).

Decision rules for testing hypotheses

Hypothesis tests are typically carried out in the following manner:

1. Specify \(H_0\) and \(H_a\), two mutually exclusive states of reality.
2. Obtain a test statistic \(T\), which contains information about \(\theta\).
3. Specify a critical value \(c\) and corresponding rejection region (RR).
4. If \(T\) falls in RR, reject \(H_0\). Otherwise, fail to reject \(H_0\).

Decision errors

It’s of course possible that we make the incorrect decision, given the true state of reality.
The types of errors we can make under the two possible states of reality (the “null reality” and the “alternative reality”) can be summarized in 2x2 table format:

Decision	\(H_0\) true	\(H_0\) false
Reject \(H_0\)	Type-I error	Correct decision
Fail to reject \(H_0\)	Correct decision	Type-II error

Decision probabilities

We denote the probabilities of the two error types by \(\alpha\) and \(\beta\).
\(\alpha\):
- The probability of committing a Type-I error, i.e. \(P(Reject\ H_0| H_0\ true)\).
- Equivalently: \(P(T \in RR | H_0\ true)\).
- Often referred to as the significance level or the size of the test.
\(\beta\):
- The probability of committing a Type-II error, i.e. \(P(Fail\ to\ reject\ H_0|H_0\ false)\).
- Equivalently: \(P(T \not \in RR | H_0\ false)\).
The power of a test is \(1-\beta = P(Reject\ H_0|H_0\ false)\).

The \(\alpha\)-\(\beta\) relationship

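Enlarging RR can only increase \(P(T \in RR)\), under \(H_0\) and under \(H_a\) alike, so \(\alpha\) and power move in the same direction (and \(\beta\) moves in the opposite one).
Implication: power can always be trivially increased by increasing \(\alpha\).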
- Consider: what sort of “brain dead” test would have \(Power=1\)? What is \(\alpha\) for this test?
In practice: set \(\alpha\) to something small, try to maximize power.

Finding \(\alpha\) and \(\beta\)

To find \(\alpha\):
- Specify sampling distribution of \(T\) under \(H_0\) state of reality
- Find \(P(T\in RR | H_0\ true)\)
To find \(\beta\)/power:
- Specify sampling distribution of \(T\) under an \(H_a\) state of reality
- Find \(\beta = P(T\not\in RR | H_a\ state\ of\ reality)\)
- \(Power = 1-\beta\), equivalently \(Power= P(T\in RR | H_a\ state\ of\ reality)\)

Example: balls in bucket

“Population”: a bucket with 10 balls.
\(H_0\) bucket: 5 red and 5 blue balls.
\(H_a\) bucket: 10 balls, more than half of which are red.

Sample: 10 balls drawn with replacement from the bucket.
Test statistic \(T\): number of red balls in the draw.
Decision rule: \(c=8 \Rightarrow\) reject \(H_0\) if \(T \ge 8\).

1. What is \(\alpha\)?
2. What would we need to specify to find \(\beta\) and power?

Question 1:

Under \(H_0, T\sim BIN(10, 0.5)\).
\(P(T\ge 8) = \sum_{t=8}^{10}\binom{10}{t}0.5^t0.5^{10-t}\).

1-pbinom(7,10,0.5)

[1] 0.0546875

\(\therefore \alpha = 0.0547\)
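
As a sanity check on this value, the sampling scheme can be simulated directly. The sketch below assumes nothing beyond the bucket example itself; the seed, the number of simulated samples, and the variable names are arbitrary choices made for illustration.

```r
set.seed(1)   # arbitrary seed, only so the run is reproducible

n_sims <- 50000                               # number of simulated samples (arbitrary)
bucket <- c(rep("red", 5), rep("blue", 5))    # the H_0 bucket: 5 red, 5 blue

# For each simulated sample: draw 10 balls with replacement and count the reds
t_sim <- replicate(n_sims, sum(sample(bucket, size = 10, replace = TRUE) == "red"))

alpha_hat   <- mean(t_sim >= 8)         # share of simulated samples that land in RR
alpha_exact <- 1 - pbinom(7, 10, 0.5)   # exact value from the binomial cdf

c(simulated = alpha_hat, exact = alpha_exact)
```

The simulated proportion should land close to the exact binomial probability, up to Monte Carlo error.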

Question 2:

To find \(\beta\) or power, we need to specify an \(H_a\) state of reality, of which there are multiple possibilities!
One such state: \(H_a\) bucket contains 6 red and 4 blue
Under this state: \(T\sim BIN(10, 0.6)\)
Then \(\beta = P(T\le 7) = \sum_{t=0}^{7}\binom{10}{t}0.6^t0.4^{10-t}\)

pbinom(7,10,0.6)

[1] 0.8327102

\(\therefore \beta = 0.8327\), Power = \(0.1673\)
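
Since \(\beta\) depends on which \(H_a\) bucket is assumed, it can help to tabulate it over several of them, and to see how \(\alpha\) and power change together as \(c\) varies. The following is a minimal sketch; the choice of alternatives (6 through 10 red balls) and all object names are illustrative assumptions, not part of the original example.

```r
n_draws <- 10
c_crit  <- 8     # reject H_0 if T >= 8

# beta and power under H_a buckets containing 6, 7, ..., 10 red balls
red_balls       <- 6:10
p_red           <- red_balls / 10
power_by_bucket <- 1 - pbinom(c_crit - 1, n_draws, p_red)
data.frame(red_balls, p_red, beta = 1 - power_by_bucket, power = power_by_bucket)

# alpha, and power at the 6-red bucket, for every possible critical value c
c_grid   <- 0:10
tradeoff <- data.frame(
  c     = c_grid,
  alpha = 1 - pbinom(c_grid - 1, n_draws, 0.5),   # P(T >= c) under H_0
  power = 1 - pbinom(c_grid - 1, n_draws, 0.6)    # P(T >= c) under the 6-red bucket
)
tradeoff
```

Reading down the second table, lowering \(c\) raises \(\alpha\) and power at the same time, which is the \(\alpha\)-\(\beta\) relationship described earlier.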

Example: exponential rate

Suppose \(Y_1,...,Y_{15} \stackrel{i.i.d.}{\sim}EXP(\lambda)\)
Will use sample to test:

\[H_0: \lambda = 1/5\] \[H_a: \lambda < 1/5\]

Test statistic: \(T=Y_{(1)}\)
Decision rule: reject \(H_0\) if \(Y_{(1)} \geq c\) (why does this make sense?)

Tasks:

1. Find \(\alpha\) for \(c \in \{0.2, 0.8\}\).
2. How should we set \(c\) to make \(\alpha = 0.05\)?
3. If in fact \(\lambda = 1/25\), find \(\beta\) and the power if we use a size-\(\alpha=0.05\) test.
4. Find an expression for the power as a function of any \(\lambda_a\).

Task 1: finding \(\alpha\) for set \(c\)

To find \(\alpha\), need to specify distribution of test statistic \(Y_{(1)}\) under \(H_0\).
We have previously shown that, for an i.i.d. sample of size \(n\) from \(EXP(\lambda)\):

\[Y_{(1)}\sim EXP(n\lambda)\]

With \(n=15\), under \(H_0: \lambda = 1/5,\) we have:

\[Y_{(1)}\sim EXP(15\cdot 1/5)=EXP(3)\]
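
The distribution of the minimum follows from \(P(Y_{(1)} > y) = P(\text{all } Y_i > y) = (e^{-\lambda y})^n = e^{-n\lambda y}\). A quick simulation can confirm it for this setting. The sketch below is only a check under the \(H_0\) rate; the seed and simulation size are arbitrary.

```r
set.seed(1)   # arbitrary seed, only so the run is reproducible

n      <- 15
lambda <- 1/5      # rate under H_0
n_sims <- 20000    # number of simulated samples (arbitrary)

# Each column is one simulated sample of size n; take the minimum of each column
y_min <- apply(matrix(rexp(n * n_sims, rate = lambda), nrow = n), 2, min)

# Compare simulated summaries with those of EXP(n * lambda) = EXP(3)
c(sim_mean = mean(y_min),       theory_mean = 1 / (n * lambda))
c(sim_tail = mean(y_min > 0.2), theory_tail = 1 - pexp(0.2, rate = n * lambda))
```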

When \(c = 0.2\):

\[\alpha = P(Reject|H_0\ true)\] \[=P(Y_{(1)} > 0.2) , Y_{(1)}\sim EXP(15\cdot 1/5)\]

\[ = e^{-0.2\cdot 3} = 0.5488\]

1-pexp(0.2, rate = 15*1/5)

[1] 0.5488116

When \(c = 0.8\):

\[\alpha = P(Reject|H_0\ true)\] \[=P(Y_{(1)} > 0.8), Y_{(1)}\sim EXP(15\cdot 1/5)\]

\[ = e^{-0.8\cdot 3} = 0.0907\]

1-pexp(0.8, rate = 15*1/5)

[1] 0.09071795

Task 2: Find \(c\) such that \(\alpha = 0.05\)

Under \(H_0: \lambda = 1/5\Rightarrow Y_{(1)} \sim EXP(15\cdot 1/5)\)

\[\alpha = P(Y_{(1)}> c) = e^{-c\cdot 3}\stackrel{set}{=}0.05\]

\[\Rightarrow c = -\frac{\ln(0.05)}{3} = 0.9985774\]
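
The same critical value can be obtained in R either from the closed form or as the upper-\(\alpha\) quantile of the null distribution of \(Y_{(1)}\). This is a small sketch with illustrative variable names:

```r
n       <- 15
lambda0 <- 1/5     # rate under H_0
alpha   <- 0.05

c_closed <- -log(alpha) / (n * lambda0)                           # closed form derived above
c_quant  <- qexp(alpha, rate = n * lambda0, lower.tail = FALSE)   # upper-alpha quantile of EXP(3)

c(closed_form = c_closed, quantile = c_quant)

# Size of the resulting test: P(Y_(1) > c) under H_0
1 - pexp(c_closed, rate = n * lambda0)
```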

Task 3: Find \(\beta\) and power if \(\lambda = 1/25\)

Finding \(\beta\) and power for the size-0.05 test:

\[\beta = P(Fail\ to\ reject| H_0\ false) = P(Y_{(1)} < 0.9985774 | \lambda < 1/5)\]

There are infinitely many \(\lambda\) for which \(H_0\) is false, so we evaluate \(\beta\) at one particular “\(H_a\)” reality, \(\lambda = 1/25\):

\[\beta = P(Y_{(1)} < 0.9985774), Y_{(1)} \sim EXP(15\cdot 1/25) \]

\[ = 1-e^{-0.9985774\cdot 0.6} = 0.4507\]

pexp(0.9985774, rate = 15/25)

[1] 0.4507197

Power = \(1-\beta\) = 0.5493

Task 4: Find power as function of \(\lambda_a\)

For our size-0.05 test:

\[Power = P(Reject| H_0\ false) = P(Y_{(1)} > 0.9985774 | \lambda =\lambda_a < 1/5)\]

For any \(\lambda_a <1/5\):

\[P(Y_{(1)} > 0.9985774) = e^{- 0.9985774 \cdot 15\cdot \lambda_a}\]
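
The power expression can be wrapped in a small function and evaluated at a few alternatives. In this sketch the function name `power_exp` and the grid of \(\lambda_a\) values are illustrative choices, not part of the original example.

```r
n      <- 15
c_crit <- -log(0.05) / (n * 1/5)   # critical value of the size-0.05 test

# Power at an alternative rate lambda_a: P(Y_(1) > c_crit) when Y_(1) ~ EXP(n * lambda_a)
power_exp <- function(lambda_a) exp(-c_crit * n * lambda_a)

lambda_a <- c(1/5, 1/10, 1/25, 1/100)
data.frame(lambda_a, power = power_exp(lambda_a))
```

At \(\lambda_a = 1/5\) the function returns the size of the test, and at \(\lambda_a = 1/25\) it reproduces the power found in Task 3.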

Interactive Desmos version: https://www.desmos.com/calculator/dbqkfi2ru7

Plotting power as function of \(\lambda_a\)

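library(tidyverse)

# Power of the size-0.05 test, P(Y_(1) > c) with Y_(1) ~ EXP(15 * lambda_a)
ggplot() + 
  geom_function(fun=\(lam) 1-pexp(0.9985774, rate = 15*lam)) + 
  # H_a values only: lambda_a between (almost) 0 and the null value 1/5
  xlim(c(0.001, 1/5)) + 
  # dashed reference line at alpha = 0.05
  geom_hline(aes(yintercept = 0.05), linetype=2)+ 
  scale_y_continuous(breaks = c(0.05, seq(0.2,1,by=.2)))+
  labs(y='Power', x = expression(lambda[a]))+
  theme_classic(base_size=14)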

Example: Beta shape

A sample of size \(n=1\) is drawn from a population with the following pdf:

\[f_Y(y) = (1+\theta) y^\theta; 0\leq y \leq 1.\]

Test:

\[H_0: \theta = 2\] \[H_a: \theta > 2\]

Tasks:

1. Should we reject \(H_0\) for large or small \(Y\)?
2. What is the size of the test if the decision rule is to reject for \(Y>3/4\)?
3. What is the RR for a size-5% test?
4. Find the power function of the size-5% test as a function of \(\theta_a\).

Task 1: Reject for large or small \(Y\)?

Since \(Y\sim BETA(\theta+1,1)\):

\[E(Y) = \frac{\theta+1}{\theta+2}\]

Larger values of \(\theta\Rightarrow\) larger expected \(Y\), so it makes sense to reject \(H_0\) in favor of \(H_a: \theta > 2\) for large values of \(Y\).

Task 2: Finding size when \(c=3/4\)

In general, for \(y\in (0,1)\):

\[P(Y > y) = \int_y^1 (1+\theta) t^\theta dt = 1-y^{\theta+1}\]

Under \(H_0: \theta=2\),

\[\alpha = P(Reject|H_0\ true) = P(Y>3/4| \theta=2) = 1-(3/4)^{2+1} = 0.5781\]
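
This size can be checked numerically in three ways: the closed form, the \(BETA(\theta+1, 1)\) cdf, and direct integration of the pdf. The sketch below uses illustrative names (`pdf_y`, `alpha_closed`, and so on) that are not part of the original example.

```r
theta0 <- 2   # value of theta under H_0

# pdf from the example, defined on [0, 1]
pdf_y <- function(y, theta) (1 + theta) * y^theta

alpha_closed <- 1 - (3/4)^(theta0 + 1)                                           # closed form
alpha_beta   <- pbeta(3/4, shape1 = theta0 + 1, shape2 = 1, lower.tail = FALSE)  # Beta upper tail
alpha_num    <- integrate(pdf_y, lower = 3/4, upper = 1, theta = theta0)$value   # numerical integral

c(closed_form = alpha_closed, beta_cdf = alpha_beta, integral = alpha_num)
```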

Task 3: Finding \(c\) so that \(\alpha = 0.05\)

\[0.05 \stackrel{set}{=} P(Y>c| \theta=2) = 1-c^{2+1}\]

\[\Rightarrow c^3 = 0.95 \Rightarrow c = \sqrt[3]{0.95} = 0.9830476\]
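
Equivalently, \(c\) is the 0.95 quantile of the null distribution \(BETA(3, 1)\), so it can be computed with `qbeta`. A short sketch, with illustrative variable names:

```r
theta0 <- 2      # value of theta under H_0
alpha  <- 0.05

c_closed <- (1 - alpha)^(1 / (theta0 + 1))                      # cube root of 0.95
c_quant  <- qbeta(1 - alpha, shape1 = theta0 + 1, shape2 = 1)   # 0.95 quantile of BETA(3, 1)

c(closed_form = c_closed, quantile = c_quant)

# Size of the resulting test: P(Y > c) under H_0
1 - c_closed^(theta0 + 1)
```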

Task 4: Finding power function

\[Power = P(Reject|H_0\ false) = P(Y>0.9830476| \theta=\theta_a > 2)\] \[= 1-0.9830476^{\theta_a+1}\]

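library(tidyverse)

# Power of the size-5% test as a function of theta_a: 1 - c^(theta_a + 1)
ggplot() + 
  geom_function(fun=\(theta) 1-0.9830476^(theta+1)) + 
  # dashed reference line at alpha = 0.05
  geom_hline(aes(yintercept = 0.05), linetype=2)+ 
  scale_y_continuous(breaks = c(0.05, seq(0.2,1,by=.2)))+
  # H_a values only: theta_a from the null value 2 up to 200
  scale_x_continuous(breaks = c(2, seq(50,200,by=50)),
                       limits = c(2,200))+
  labs(y='Power', x = expression(theta[a]))+
  theme_classic(base_size=14)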
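
To read specific values off the power function, it can be evaluated on a small grid and checked by simulation at one alternative. This is only a sketch: the function name `power_beta`, the grid of \(\theta_a\) values, the alternative \(\theta_a = 50\) used for the check, and the seed are all arbitrary choices.

```r
c_crit <- 0.95^(1/3)   # critical value of the size-5% test

# Power at an alternative theta_a: P(Y > c_crit) when Y ~ BETA(theta_a + 1, 1)
power_beta <- function(theta_a) 1 - c_crit^(theta_a + 1)

theta_a <- c(2, 10, 50, 100, 200)
data.frame(theta_a, power = power_beta(theta_a))

# Monte Carlo check at theta_a = 50: simulate single observations and apply the rule
set.seed(1)
y_sim <- rbeta(50000, shape1 = 50 + 1, shape2 = 1)
c(simulated = mean(y_sim > c_crit), formula = power_beta(50))
```

With a single observation the power climbs slowly in \(\theta_a\), which matches the shape of the plotted curve.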