Lecture 4: Hypothesis Test

Joel Correa da Rosa
January 11th 2017

Mechanism

Formulate the hypotheses: \( H_0 \) and \( H_1 \)
Determine the test statistic
Define the significance level \( \alpha \) (probability of type-I error)
Evaluate the critical region for the test statistic
Compare the observed test statistic with the critical region
Evaluate the p-value
Reject or fail to reject \( H_0 \)

Decision Errors

Type-I Error: Reject \( H_0 \) when \( H_0 \) is true
Type-II Error: Fail to reject \( H_0 \) when \( H_0 \) is false

By analogy with a court trial, a type-I error corresponds to convicting an innocent person.

Decision Errors (Probability)

\( \alpha \) : probability of type-I error
\( \beta \) : probability of type-II error
\( 1-\beta \): power of the test

The hypothesis test controls the type-I error rate (false positives) while trying to minimize the type-II error rate (false negatives). Minimizing \( \beta \) is equivalent to maximizing the power \( 1-\beta \).

The usual thresholds are: \( \alpha=0.05 \) and \( \beta=0.20 \).
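The two error rates can be illustrated by simulation. The sketch below is a minimal example and not part of the original slides: the function name `rejection_rate`, the standard deviation of 1.0, the sample size of 5 per group and the true difference of 2.0 are all assumptions chosen for illustration. It repeats a two-sample z-test many times and counts how often \( H_0 \) is rejected.

```python
import random
from math import sqrt
from statistics import NormalDist, fmean

def rejection_rate(true_diff, sigma=1.0, n=5, alpha=0.05, trials=20000, seed=0):
    """Fraction of simulated experiments in which H0: mu1 - mu2 = 0 is rejected.

    sigma, n, trials and seed are illustrative assumptions, not lecture values.
    """
    rng = random.Random(seed)
    sigma_t = sigma * sqrt(2 / n)                 # std. dev. of the difference of means
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)  # two-sided critical value
    rejections = 0
    for _ in range(trials):
        x1 = [rng.gauss(true_diff, sigma) for _ in range(n)]
        x2 = [rng.gauss(0.0, sigma) for _ in range(n)]
        z = (fmean(x1) - fmean(x2)) / sigma_t
        if abs(z) > z_crit:
            rejections += 1
    return rejections / trials

type_1_rate = rejection_rate(true_diff=0.0)  # H0 true: estimates alpha
power = rejection_rate(true_diff=2.0)        # H0 false: estimates 1 - beta
print(type_1_rate, power)
```

When \( H_0 \) is true, the rejection rate should settle near \( \alpha \); when it is false, the rejection rate estimates the power \( 1-\beta \).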

Formulate Hypotheses

A hypothesis is a statement of belief about one or more population parameters (e.g. EAAT2 expression levels at the hippocampal synapse will be decreased in old mice compared to young mice).

\( H_0: \mu_1 - \mu_2 = 0 \)

\( H_1: \mu_1 - \mu_2 \neq 0 \)

\( \mu_1 \): EAAT2 average expression in the old mice population

\( \mu_2 \): EAAT2 average expression in the young mice population

EAAT2 (Excitatory Amino Acid Transporter 2)

Test statistic (Compare Two Independent Populations)

The test statistic is a random variable that will be observed when sampling from two independent populations: old mice and young mice.

\( t(x) = \frac{\bar{x}_1-\bar{x}_2-(\mu_1-\mu_2)}{\sigma_t} \)

if \( \sigma_t \) is known and \( x_1 \) and \( x_2 \) are normally distributed, \( t(x) \) follows the standard normal distribution (\( t(x)\sim \text{normal}(0,1) \)).
if \( \sigma_t \) is unknown and \( x_1 \) and \( x_2 \) are normally distributed, \( \sigma_t \) has to be estimated from the data and \( t(x) \) follows Student's t distribution.

Test statistic

\( t(x) = \frac{\bar{x}_1-\bar{x}_2-(\mu_1-\mu_2)}{\sigma_t} \) is called a pivotal quantity (i.e. the probability distribution of \( t(x) \) does not depend on unknown parameters).
Alternatively, the test statistic could be defined as \( t(x)=\bar{x}_1-\bar{x}_2 \), so that \( t(x) \sim \text{normal}(\mu_1-\mu_2,\sigma_t) \). This distribution, however, depends on unknown parameters.

Test statistic

\( \sigma_t^2 = \frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2} \)

If the two populations have the same standard deviation, i.e. \( \sigma = \sigma_1 = \sigma_2 \) (e.g. the same variability in the old and young mice populations), then

\( \sigma_t = \sigma \sqrt{(\frac{1}{n_1} + \frac{1}{n_2})} \)

Test statistic

Data collected from past experiments indicated that the standard deviation of EAAT2 expression levels in mice was 1.41. Let's assume the same variability for old and young mice.

The experiment will observe two independent samples, \( n_1=5 \) old mice and \( n_2=5 \) young mice, such that:

\( \sigma_t = 1.41 \times \sqrt{(\frac{1}{5} + \frac{1}{5})}= 0.892 \)
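The arithmetic above can be checked with a few lines of Python. This is a minimal sketch; the function names are hypothetical and chosen only for this example. It implements both the general formula and the equal-variance shortcut, which should agree when \( \sigma_1 = \sigma_2 \).

```python
from math import sqrt

def sigma_t_general(sigma1, n1, sigma2, n2):
    """Standard deviation of x1_bar - x2_bar for two independent samples."""
    return sqrt(sigma1**2 / n1 + sigma2**2 / n2)

def sigma_t_equal_variance(sigma, n1, n2):
    """Shortcut when both populations share the same standard deviation."""
    return sigma * sqrt(1 / n1 + 1 / n2)

sigma_t = sigma_t_equal_variance(1.41, 5, 5)
print(round(sigma_t, 3))  # 0.892
```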

Critical Region

This is the region of unlikely outcomes for \( t(x) \) if \( H_0 \) is true. We define the unlikely outcomes as the extreme ones, which together occur with a low probability \( \alpha \), the level of significance. If the alternative hypothesis \( H_1 \) is bilateral (two-sided), the extreme values lie in both tails of the test statistic distribution.

The critical region will depend on:

null hypothesis
alternative hypothesis
significance level
variability of the random variables
sample size
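To see how these ingredients move the critical region, the sketch below computes the cutoff for \( \bar{x}_1-\bar{x}_2 \) under several settings. It assumes a known common standard deviation and \( H_0: \mu_1-\mu_2=0 \). The helper `critical_cutoff` is a hypothetical name, and the alternative settings (\( \alpha=0.01 \), 20 mice per group, \( \sigma=2.0 \)) are illustrative values, not taken from the lecture.

```python
from math import sqrt
from statistics import NormalDist

def critical_cutoff(alpha, sigma, n1, n2, two_sided=True):
    """Cutoff for x1_bar - x2_bar under H0: mu1 - mu2 = 0, known common sigma."""
    sigma_t = sigma * sqrt(1 / n1 + 1 / n2)
    tail = alpha / 2 if two_sided else alpha  # probability placed in each tail
    return NormalDist().inv_cdf(1 - tail) * sigma_t

base = critical_cutoff(0.05, 1.41, 5, 5)        # setting used in this lecture
stricter = critical_cutoff(0.01, 1.41, 5, 5)    # smaller alpha: cutoff moves outward
larger_n = critical_cutoff(0.05, 1.41, 20, 20)  # more mice: cutoff moves inward
noisier = critical_cutoff(0.05, 2.0, 5, 5)      # more variability: cutoff moves outward
one_sided = critical_cutoff(0.05, 1.41, 5, 5, two_sided=False)
print(base, stricter, larger_n, noisier, one_sided)
```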

Critical Region

Assume that \( H_0 \) is true, i.e. \( \mu_1-\mu_2=0 \), and that the variance is known (\( \sigma_t = 0.892 \)). Then:

\( \bar{x_1}-\bar{x_2} \sim \text{normal}(0,\sigma_t) \)

\( \bar{x_1}-\bar{x_2} \sim \text{normal}(0,0.892) \)

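The critical region is built such that \( Pr_{H_0}(t(x) \in CR)=\alpha \), where \( t(x)=\bar{x}_1-\bar{x}_2 \). For \( \alpha=0.05 \) and a bilateral alternative, each tail holds probability 0.025, which gives the normal quantile 1.96: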

\( CR = (-\infty,-1.96 \times \sigma_t) \cup (+1.96 \times \sigma_t,+\infty) \)

\( CR = (-\infty,-1.96 \times 0.892) \cup (+1.96 \times 0.892,+\infty) \)

\( CR = (-\infty,-1.748) \cup (+1.748,+\infty) \)
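The bounds can be reproduced numerically, and the defining property \( Pr_{H_0}(t(x) \in CR)=\alpha \) checked directly. This is a minimal sketch using Python's standard library; the variable names are illustrative.

```python
from math import sqrt
from statistics import NormalDist

alpha = 0.05
sigma_t = 1.41 * sqrt(1 / 5 + 1 / 5)

z_crit = NormalDist().inv_cdf(1 - alpha / 2)  # about 1.96
cutoff = z_crit * sigma_t                     # about 1.748

# Distribution of x1_bar - x2_bar when H0 is true
h0 = NormalDist(mu=0.0, sigma=sigma_t)
prob_in_cr = h0.cdf(-cutoff) + (1 - h0.cdf(cutoff))
print(round(cutoff, 3), round(prob_in_cr, 3))
```

The probability mass in the two tails sums to \( \alpha \) by construction.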

Critical Region

[Figure: critical region]

The observed test statistic

The two independent samples were collected and returned \( \bar{x}_1 = 9.55 \) for the \( n_1=5 \) old mice and \( \bar{x}_2=6.95 \) for the \( n_2=5 \) young mice. Let's calculate the observed test statistic for \( H_0:\mu_1-\mu_2=0 \).

\( t(x)=\bar{x_1}-\bar{x_2}=2.6 \)
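The observed difference can now be compared with the critical region. The sketch below recomputes the cutoff and checks whether the observed value falls beyond it. It also reports the standardized value \( t_{obs}/\sigma_t \), which is compared with 1.96 instead of 1.748 and leads to the same decision. Variable names are illustrative.

```python
from math import sqrt
from statistics import NormalDist

alpha = 0.05
sigma_t = 1.41 * sqrt(1 / 5 + 1 / 5)
cutoff = NormalDist().inv_cdf(1 - alpha / 2) * sigma_t  # about 1.748

x1_bar, x2_bar = 9.55, 6.95      # sample means for old and young mice
t_obs = x1_bar - x2_bar          # observed difference, 2.6

in_critical_region = abs(t_obs) > cutoff
z_obs = t_obs / sigma_t          # standardized version of the same statistic

print(in_critical_region, round(z_obs, 2))
```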

Observed Statistic vs. Critical Region

[Figure: observed statistic vs. critical region]

P-value

The p-value is the probability, assuming that \( H_0 \) is true, of the test statistic being at least as extreme as its observed value.

bilateral (two-sided) alternative hypothesis:

\( Pr_{H_0}(|t(x)|>|t_{obs}|) \)

one-sided alternative hypothesis:

\( Pr_{H_0}(t(x) > t_{obs}) \) or \( Pr_{H_0}(t(x) < t_{obs}) \)
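These definitions translate directly into code. The sketch below is a minimal version that assumes, as in this lecture, that the statistic is normally distributed with mean 0 and known standard deviation under \( H_0 \). The function name `p_value` and the labels `"two-sided"`, `"greater"` and `"less"` are hypothetical choices for this example.

```python
from statistics import NormalDist

def p_value(t_obs, sigma_t, alternative="two-sided"):
    """P-value for a statistic that is normal(0, sigma_t) under H0."""
    h0 = NormalDist(mu=0.0, sigma=sigma_t)
    if alternative == "two-sided":
        return 2 * (1 - h0.cdf(abs(t_obs)))  # both tails
    if alternative == "greater":
        return 1 - h0.cdf(t_obs)             # upper tail only
    if alternative == "less":
        return h0.cdf(t_obs)                 # lower tail only
    raise ValueError(f"unknown alternative: {alternative!r}")

# Example with a standardized statistic (sigma_t = 1)
p_two = p_value(1.96, 1.0)
p_up = p_value(1.96, 1.0, "greater")
p_low = p_value(1.96, 1.0, "less")
print(round(p_two, 3), round(p_up, 3), round(p_low, 3))
```

For a positive observed value, the two-sided p-value is twice the upper-tail one, because the normal distribution is symmetric.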

P-value

In our example, \( Pr_{H_0}(|t(x)|>t_{obs}) = Pr_{H_0}(t(x)>t_{obs})+Pr_{H_0}(t(x)<-t_{obs}) \)

that means: \( Pr_{H_0}(|t(x)|>t_{obs}) = Pr_{H_0}(t(x)>2.6)+Pr_{H_0}(t(x)<-2.6) \)

Remember that: \( t(x)=\bar{x}_1-\bar{x}_2 \sim \text{ Normal}(0,0.892) \)

P-value

The probability of obtaining results at least as extreme as those found in the sample is:

\( Pr_{H_0}(|t(x)|>t_{obs}) = Pr_{H_0}(|t(x)|>2.6) = 0.0036 \)

If our assumptions are true, the probability of observing a difference larger than 2.6 in either direction is 0.36%.
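The value 0.0036 can be recovered by standardizing the observed difference. In the worked step below, \( \Phi \) denotes the standard normal cumulative distribution function and \( z_{obs} \) the standardized statistic; both symbols are introduced here for illustration and do not appear in the original slides.

\( z_{obs} = \frac{t_{obs}}{\sigma_t} = \frac{2.6}{0.892} \approx 2.915 \)

\( Pr_{H_0}(|t(x)|>2.6) = 2 \times (1-\Phi(2.915)) \approx 2 \times 0.0018 = 0.0036 \)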

Decision

Reject \( H_0 \)

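The observed statistic (2.6) falls in the critical region and the p-value (0.0036) is below \( \alpha = 0.05 \). There is therefore significant evidence for \( H_1 \): the average expression levels of EAAT2 at the hippocampal synapse differ between old and young mice.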

Power

There is an infinite number of distributions associated with \( H_1 \), one for each value of \( \mu_1-\mu_2 \neq 0 \), so the power has to be evaluated at a specific alternative.

[Figure: power]

Power

\( H_1:\mu_1-\mu_2=1 \), power \( = Pr_{H_1}(t(x)\in CR) \)

[Figure: power when \( \mu_1-\mu_2=1 \)]

If \( H_1:\mu_1-\mu_2=1 \), the power to reject \( H_0 \) is 0.20125.

Power

\( H_1:\mu_1-\mu_2=2 \), power \( = Pr_{H_1}(t(x)\in CR) \)

[Figure: power when \( \mu_1-\mu_2=2 \)]

Power

If \( H_1:\mu_1-\mu_2=2 \), the power to reject \( H_0 \) is 0.61038.
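Power is the probability that \( \bar{x}_1-\bar{x}_2 \) lands in the critical region when its distribution is centred on the alternative value instead of on 0. The sketch below is a minimal version that assumes the same known \( \sigma_t \) and \( \alpha = 0.05 \) used throughout. The function name `power_at` is hypothetical. The results should come out close to the values quoted in the slides (roughly 0.20 and 0.61).

```python
from math import sqrt
from statistics import NormalDist

def power_at(true_diff, sigma_t, alpha=0.05):
    """Pr(x1_bar - x2_bar in CR) when mu1 - mu2 = true_diff (two-sided test)."""
    cutoff = NormalDist().inv_cdf(1 - alpha / 2) * sigma_t
    h1 = NormalDist(mu=true_diff, sigma=sigma_t)     # distribution under H1
    return h1.cdf(-cutoff) + (1 - h1.cdf(cutoff))    # lower tail + upper tail

sigma_t = 1.41 * sqrt(1 / 5 + 1 / 5)
power_0 = power_at(0.0, sigma_t)  # equals alpha when H0 is true
power_1 = power_at(1.0, sigma_t)
power_2 = power_at(2.0, sigma_t)
print(round(power_0, 2), round(power_1, 2), round(power_2, 2))
```

Setting the true difference to 0 returns \( \alpha \), and the power grows as the true difference moves away from 0.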