Definition

Power is the probability of rejecting the null hypothesis when it is false.

Variable definitions

\(\beta\): Probability of a Type II error, i.e. failing to reject a false null hypothesis.

\(1-\beta\): Probability of rejecting a false null hypothesis, i.e. power.

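\(\alpha\): Significance level (size) of the test, i.e. the probability of a Type I error (rejecting a true null hypothesis).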

\(\mu_a\): Mean (center) of the distribution under \(H_a\).

\(Z\): Test statistic, with 30 as the mean hypothesized by \(H_0\)

\[Z\ =\ \frac{\bar{X}\ -\ 30}{\frac{\sigma}{\sqrt{n}}}\]

\(\frac{\mu_a - \mu_0}{\sigma}\): Effect size, i.e. the difference in the means in SD units.

\(\delta\): \(\mu_a - \mu_0\), the difference in the means.
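As a small illustration of the notation, the sketch below computes the test statistic, delta, and the effect size in R. All numeric values except \(\mu_0 = 30\) are assumptions chosen for illustration (sigma = 4 and n = 16 match the power.t.test calls later in these notes; the observed sample mean is made up).

```r
# Illustrative values; only mu0 = 30 comes from the hypotheses in these notes
mu0   <- 30    # mean hypothesized by H_0
mu_a  <- 32    # assumed mean under H_a
sigma <- 4     # assumed population SD
n     <- 16    # assumed sample size
xbar  <- 31.5  # hypothetical observed sample mean

z_stat      <- (xbar - mu0) / (sigma / sqrt(n))  # test statistic Z
delta       <- mu_a - mu0                         # difference in the means
effect_size <- delta / sigma                      # difference in SD units

c(z_stat = z_stat, delta = delta, effect_size = effect_size)
```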

Introduction to Power

\[H_0:\ \mu=30\] \[H_a:\ \mu>30\]

Here we reject \(H_0\) when the test statistic \(Z\) is greater than the \(1-\alpha\) quantile, i.e. qnorm(0.95) for \(\alpha = 0.05\). The explanation is as follows: if the sample mean is too far above the mean (center) of the distribution hypothesized by \(H_0\), we favor \(H_a\). When \(H_0\) is true, the probability of this happening is 0.05 (a low probability, but it still happens). Power is the probability of the same event when \(H_a\) is true.

The sampling distributions of \(\bar{X}\) under \(H_0\) and \(H_a\): \[\bar{X} \sim N\left(\mu_0,\ \frac{\sigma^2}{n}\right)\] \[\bar{X} \sim N\left(\mu_a,\ \frac{\sigma^2}{n}\right)\]
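The two distributions can also be explored by simulation. The sketch below draws sample means from each distribution and counts how often each lands above the rejection cutoff. The values mu_a = 32, sigma = 4, n = 16 and the number of draws are assumptions for illustration only.

```r
set.seed(1)                      # for reproducibility

mu0   <- 30; mu_a <- 32          # means under H_0 and (assumed) H_a
sigma <- 4;  n    <- 16          # assumed SD and sample size
alpha <- 0.05
se    <- sigma / sqrt(n)         # standard error of the sample mean

# Cutoff: the 1 - alpha quantile of the sampling distribution under H_0
crit <- mu0 + qnorm(1 - alpha) * se

# Simulated sample means under each hypothesis
xbar_h0 <- rnorm(1e5, mean = mu0,  sd = se)
xbar_ha <- rnorm(1e5, mean = mu_a, sd = se)

type1_rate <- mean(xbar_h0 > crit)  # should be close to alpha
power_est  <- mean(xbar_ha > crit)  # Monte Carlo estimate of the power
c(type1_rate = type1_rate, power_est = power_est)
```

The first rate estimates \(\alpha\) and the second estimates the power \(1-\beta\) for this particular \(\mu_a\).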

Further Discussion with Figures

\[Figure\ 1.\ \mu_0 = 30,\mu_a = 32\]

   Power is the area under the blue curve (the distribution under \(H_a\)) to the right of the vertical line.

\[Figure\ 2.\ \mu_0 = 30,\mu_a = 34\]

   Nearly all of the blue curve is to the right of the vertical line, indicating that the test is more powerful, i.e. there is a higher probability of correctly rejecting the null hypothesis.

\[Figure\ 3.\ \mu_0 = 30,\mu_a = 33\]

   Not as powerful as in Figure 2, because \(\mu_a\) is closer to \(\mu_0\).

\[Figure\ 4.\ \mu_0 = 30,\mu_a = 30\]

   The red and blue curves coincide, so the area under the blue curve to the right of the vertical line, the power, is exactly 5%, i.e. \(\alpha\).
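Figures like the ones above could be redrawn with base R graphics along the following lines. This is only a sketch: plot_power is a hypothetical helper name, and the standard error of 1 is an assumption (it matches the default sd = 1 used in the pnorm examples in these notes), not necessarily the setting of the original figures.

```r
# Hypothetical helper: draws the H_0 curve (red), the H_a curve (blue) and the
# rejection cutoff (vertical line), then returns the power invisibly.
plot_power <- function(mu_a, mu0 = 30, se = 1, alpha = 0.05) {
  crit <- mu0 + qnorm(1 - alpha) * se               # 1 - alpha quantile under H_0
  x <- seq(mu0 - 4 * se, max(mu0, mu_a) + 4 * se, length.out = 400)
  plot(x, dnorm(x, mean = mu0, sd = se), type = "l", col = "red",
       xlab = "sample mean", ylab = "density")
  lines(x, dnorm(x, mean = mu_a, sd = se), col = "blue")
  abline(v = crit)                                  # reject H_0 to the right of this line
  # Power: area under the blue curve to the right of the cutoff
  invisible(pnorm(crit, mean = mu_a, sd = se, lower.tail = FALSE))
}
```

Calling plot_power(32), plot_power(34), plot_power(33) and plot_power(30) would correspond to the \(\mu_a\) values of Figures 1 to 4.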

Another Perspective on Power

  1. Perspective 1

We favor \(H_a: \mu > 30\) when \(Z > z_{0.95}\).

Recall that \(Z\) is the test statistic defined above.

This is equivalent to \(\bar{X} > z_{0.95} \cdot \frac{\sigma}{\sqrt{n}} + 30 = quantile\ 1\). It is the horizontal coordinate where the vertical line falls, i.e. the \(1 - \alpha\) quantile of the red curve. Thus pnorm(quantile 1, mean = \(\mu_a\), sd = \(\frac{\sigma}{\sqrt{n}}\), lower.tail = FALSE), the area under the blue curve to the right of that line, is the power.
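A minimal sketch of this calculation in R, assuming mu_a = 32, sigma = 4 and n = 16 (illustrative values; quantile_1 is just a variable name standing in for "quantile 1"):

```r
mu0   <- 30; mu_a <- 32          # H_0 mean and assumed H_a mean
sigma <- 4;  n    <- 16          # assumed SD and sample size
alpha <- 0.05
se    <- sigma / sqrt(n)         # standard error of the sample mean

# Horizontal coordinate of the vertical line: 1 - alpha quantile under H_0
quantile_1 <- qnorm(1 - alpha) * se + mu0

# Power: upper-tail area beyond quantile_1 under the H_a distribution
power <- pnorm(quantile_1, mean = mu_a, sd = se, lower.tail = FALSE)
power
```

Note that lower.tail = FALSE is what selects the area to the right of the line; without it, pnorm returns \(\beta\) instead.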

  1. Perspective 2

\(H_a\) says that \(\mu > \mu_0\). Then \(power = 1 - \beta = Prob\left(\bar{X} > \mu_0 + z_{1-\alpha} \cdot \frac{\sigma}{\sqrt{n}}\right)\), assuming that \(\bar{X} \sim N\left(\mu_a,\ \frac{\sigma^2}{n}\right)\), where \(\bar{X}\) is determined by the sample data collected.

For a given \(\alpha\), power doesn’t need \(\mu_a\), \(\sigma\) and n individually. Instead only \(\frac{\sqrt{n}(\mu_a - \mu_0)}{\sigma}\) is needed.
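One way to see this, sketched under the normal model above with \(\Phi\) denoting the standard normal CDF (a symbol introduced here, not used elsewhere in these notes), is to standardize \(\bar{X}\) using its distribution under \(H_a\):

\[power = P\left(\bar{X} > \mu_0 + z_{1-\alpha}\frac{\sigma}{\sqrt{n}}\right) = P\left(\frac{\bar{X} - \mu_a}{\sigma/\sqrt{n}} > z_{1-\alpha} - \frac{\sqrt{n}(\mu_a - \mu_0)}{\sigma}\right) = 1 - \Phi\left(z_{1-\alpha} - \frac{\sqrt{n}(\mu_a - \mu_0)}{\sigma}\right)\]

The three quantities enter only through the combination \(\frac{\sqrt{n}(\mu_a - \mu_0)}{\sigma}\), so any change in \(\mu_a\), \(\sigma\) or n that leaves this ratio unchanged leaves the power unchanged.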

Summary on Power

Power is a function of the specific value of the alternative mean \(\mu_a\) (any value greater than \(\mu_0\)). If \(\mu_a\) is much bigger than \(\mu_0 = 30\), the power (a probability) is larger than if \(\mu_a\) is close to 30. As \(\mu_a\) approaches 30, the power approaches \(\alpha\).
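To make the dependence on \(\mu_a\) concrete, the sketch below evaluates the power over a grid of alternative means. The grid and the standard error of 1 are assumptions for illustration.

```r
mu0   <- 30
se    <- 1                        # assumed standard error, sigma / sqrt(n)
alpha <- 0.05
mu_a  <- seq(30, 34, by = 0.5)    # grid of alternative means (illustrative)

crit  <- mu0 + qnorm(1 - alpha) * se            # rejection cutoff under H_0
power <- pnorm(crit, mean = mu_a, sd = se,      # pnorm is vectorized over mean
               lower.tail = FALSE)

data.frame(mu_a = mu_a, power = round(power, 3))
```

The first row (mu_a = 30) gives a power equal to alpha, and the power increases steadily as mu_a moves away from 30.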

  1. Alpha Effect

\[Figure\ 5.\ Different\ Alpha\ level\]

  1. Sample size effect

\[Figure\ 6.\ Different\ Sample\ size\]
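The two effects can be checked numerically. In the sketch below, z_power is a hypothetical helper (not part of base R), and the defaults mu0 = 30, sigma = 4, n = 16 as well as the particular alpha levels and sample sizes are assumptions for illustration.

```r
# Hypothetical helper: power of the one-sided z test of H_0: mu = mu0
z_power <- function(mu_a, mu0 = 30, sigma = 4, n = 16, alpha = 0.05) {
  ncp <- sqrt(n) * (mu_a - mu0) / sigma          # shift of Z under H_a
  pnorm(qnorm(1 - alpha) - ncp, lower.tail = FALSE)
}

alphas <- c(0.01, 0.05, 0.10)
sizes  <- c(8, 16, 32)

# Alpha effect: a larger alpha moves the cutoff left, so power goes up
power_by_alpha <- sapply(alphas, function(a) z_power(32, alpha = a))

# Sample size effect: a larger n narrows both curves, so power goes up
power_by_n <- sapply(sizes, function(k) z_power(32, n = k))

rbind(alphas, power_by_alpha)
rbind(sizes, power_by_n)
```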

Some R function examples

  1. Returns the probability given by the area under the blue curve to the right of the vertical line
      # General form: pnorm(quantile_1, mean = 32, lower.tail = FALSE)

      z <- qnorm(.95)
      pnorm(30 + z, mean = 30, lower.tail = FALSE)
      #> 0.05
  1. Example 2
    pnorm(30 + z, mean = 32, lower.tail = FALSE)
    #> 0.63876

This means the test is much more powerful when the mean under \(H_a\) is far from the mean hypothesized by \(H_0\): the probability of rejecting \(H_0\) is much larger.

  1. A greater SD means more variability in the data, which makes the test less powerful
    pnorm(30 + z, mean = 32, sd = 1, lower.tail = FALSE)
    #> 0.63876

    pnorm(30 + z * 2, mean = 32, sd = 2, lower.tail = FALSE)
    #> 0.259511

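Besides, at the same \(\alpha\), the power of a one-sided test is greater than that of a two-sided test when the true mean lies in the direction of \(H_a\), because the one-sided test puts all of \(\alpha\) in that tail while the two-sided test puts only \(\frac{\alpha}{2}\) there.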
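A rough numerical check of this comparison, using the normal (z test) approximation rather than power.t.test. The value ncp = 2 is an assumption; it corresponds, for example, to n = 16, \(\mu_a - \mu_0 = 2\) and \(\sigma = 4\).

```r
alpha <- 0.05
ncp   <- 2    # assumed value of sqrt(n) * (mu_a - mu0) / sigma

# One-sided test: reject when Z > z_{1 - alpha}
power_one <- pnorm(qnorm(1 - alpha) - ncp, lower.tail = FALSE)

# Two-sided test: reject when |Z| > z_{1 - alpha/2}
z2 <- qnorm(1 - alpha / 2)
power_two <- pnorm(z2 - ncp, lower.tail = FALSE) +  # upper rejection region
             pnorm(-z2 - ncp)                       # lower rejection region

c(one_sided = power_one, two_sided = power_two)
```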

  1. Solve power
    power.t.test(n = 16, delta = 2 / 4, sd = 1, type = "one.sample", alternative = "one.sided")$power

    power.t.test(n = 16, delta = 2, sd = 4, type = "one.sample", alternative = "one.sided")$power

    power.t.test(n = 16, delta = 100, sd = 200, type = "one.sample", alternative = "one.sided")$power
    #> 0.6040329

All three calls keep the effect size the same (delta / sd = 0.5), and thus have the same power. Here delta is the distance between \(\mu_a\) and \(\mu_0\).

  1. Solve sample size for specified power
    power.t.test(power = .8, delta = 2 / 4, sd = 1, type = "one.sample", alternative = "one.sided")$n
    power.t.test(power = .8, delta = 2, sd = 4, type = "one.sample", alternative = "one.sided")$n
    power.t.test(power = .8, delta = 100, sd = 200, type = "one.sample", alternative = "one.sided")$n
    #> 26.13751
  1. Solve delta
    power.t.test(power = .8, n = 26, sd = 1, type = "one.sample", alternative = "one.sided")$delta
    #> 0.5013986

    power.t.test(power = .8, n = 27, sd = 1, type = "one.sample", alternative = "one.sided")$delta
    #> 0.4914855
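The sample size returned by power.t.test is generally not a whole number, so in practice it is rounded up. The sketch below (variable names are illustrative) rounds the 26.13751 from the sample size example and compares the power at n = 26 and at the rounded-up n, which ties in with the two delta values just above.

```r
# Solve for n at power 0.8 and effect size 0.5, then round up
res <- power.t.test(power = 0.8, delta = 0.5, sd = 1,
                    type = "one.sample", alternative = "one.sided")
n_needed <- ceiling(res$n)       # smallest whole n reaching the target power

# Power actually achieved just below and at the rounded-up sample size
power_26 <- power.t.test(n = 26, delta = 0.5, sd = 1,
                         type = "one.sample", alternative = "one.sided")$power
power_up <- power.t.test(n = n_needed, delta = 0.5, sd = 1,
                         type = "one.sample", alternative = "one.sided")$power

c(n_needed = n_needed, power_26 = power_26, power_up = power_up)
```

With n = 26 the power falls slightly short of 0.8, which is why the detectable delta at n = 26 is a little above 0.5 and the one at n = 27 is a little below it.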

If the SD doubles, then delta (the numerator of the effect size) must also double to keep the effect size, and hence the power, constant. Also, if \(\mu_a = \mu_0\), then power = \(\alpha\).