The Basic Framework of Hypothesis Testing

The Null Hypothesis

We focus on the null hypothesis and decide whether or not we are able to reject it and accept an alternative hypothesis instead.

The concept of the null hypothesis is not very intuitive. A better term might be “the claim of nothingness.” That isn’t a standard statistical term, but it may be clearer than “null hypothesis.”

What are some typical null hypotheses?

  • The value of this parameter is zero.

  • The mean value of this population is no different from what we have always assumed.

  • There is no difference between the mean values of these two populations.

  • There is no difference between the probability of success and of failure. In other words, the probability of success is .5.

  • There is no difference between the effect of this drug and that of a placebo.

The Alternative Hypothesis

The alternative is what we will accept if we decide to reject the null hypothesis. There are two possibilities:

  • The parameter is simply not the value assumed in the null hypothesis, but we make no assumption about the direction of the difference. This is called a two-sided alternative.

  • The parameter differs from the value assumed in the null hypothesis in a specific direction. This is called a one-sided alternative.

The choice of the alternative must be made before the data are analyzed. This is hard to do in a classroom environment, where the results of the data analysis are presented in the problem statement.

The Process

This is taken from slide 151 of the CMU Statistical Reasoning course.

  1. State the null and alternative hypotheses.

  2. Collect relevant data from a random sample and summarize them (using a test statistic).

  3. Find the p-value, the probability of observing data like those observed assuming that \(H_{0}\) is true.

  4. Based on the p-value, decide whether we have enough evidence to reject \(H_{0}\) (and accept \(H_{a}\)), and draw our conclusions in context.

Examples

For each of these state the null hypothesis and the alternative hypothesis. Note that to mimic the real-world process we must ignore the results of any data analysis at this stage.

  1. A professional gambler is concerned about the possibility that a coin used in flipping rituals might not be fair. She really doesn’t have any particular suspicion. She’s just uncomfortable that it’s never been formally examined. She asks that the coin be tested by flipping it many times to see if it really shows heads 50% of the time.

  2. Another professional gambler is concerned that a coin has been showing heads more frequently than it should. He asks to have it examined for fairness by flipping it many times to see if it is really fair.

  3. New York is known as “the city that never sleeps”. The normal amount of sleep for the US population is eight hours per night. A researcher plans to survey New York residents to see whether their average nightly sleep matches this norm.

  4. A university president is concerned that students are extending break periods by staying home extra days after holidays. On a normal Monday, 10% of students will be absent from class. He asks that professors take roll in class the Monday after Thanksgiving and report the results to him.

Mechanics

Let’s start with a population with known characteristics. The function rnorm() will return a sample drawn from a normal distribution with any given mean and standard deviation. The default values are 0 for the mean and 1 for the standard deviation. So rnorm(1) will produce a single number drawn from the standard normal distribution, and rnorm(10) will produce a sample of 10 numbers drawn from this distribution. We would like to test the hypothesis that the mean (\(\mu\)) of the population we’re drawing from is really 0.

In thinking this far, we have completed step 1 of the standard process and decided on the following: Our null hypothesis, \(H_{0}\) is \(\mu = 0\). Our alternative hypothesis, \(H_{a}\) is \(\mu \neq 0\).

Step 2 is to collect some relevant data and summarize them. The simplest thing to do is to take a sample of size 1. This value will be our test statistic. Since we’re drawing from a standard normal distribution, we’ll call the result z.

z <- rnorm(1)

Since we only took one number, the summarization, such as it is, is complete. We can proceed to step 3, which is to compute the p-value. This is the probability that we would see a test statistic value like this if the null hypothesis were true. In our case, with a two-sided alternative, this is the probability that we would see a value this far or farther away from 0. We have to think carefully about how to deal with this, because the sign of the z-score determines which of two simple formulas involving the pnorm() function should be used.

As an example, suppose that our hypothesis-testing procedure has produced a z of 1.7 or -1.7. These two results are equally distant from a hypothesized value of 0.

pval.pos <- 2*(1-pnorm(1.7))
pval.neg <- 2*pnorm(-1.7)
pval.pos
## [1] 0.08913093
pval.neg
## [1] 0.08913093

The two values are identical. To see why this works, think about the graph below, which shows the probability that a standard normal random variable falls between -1.7 and +1.7.
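Here is a minimal base-R sketch that draws such a graph; the colors and axis range are arbitrary choices and not part of the testing procedure itself.

# Draw the standard normal density and shade the region between -1.7 and +1.7.
# The red/white coloring matches the description that follows.
x <- seq(-4, 4, length.out = 400)
plot(x, dnorm(x), type = "l", xlab = "z", ylab = "Density")
region <- seq(-1.7, 1.7, length.out = 200)
polygon(c(-1.7, region, 1.7), c(0, dnorm(region), 0), col = "red", border = NA)
lines(x, dnorm(x))
abline(v = c(-1.7, 1.7), lty = 2)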

Now, think about what the pnorm() function does. pnorm(-1.7) gives you the probability that a standard normal RV takes on a value to the left of -1.7. Geometrically, this is the small white area to the left. pnorm(1.7) gives you the probability that a standard normal RV takes on a value to the left of +1.7. Geometrically, this is the entire red area plus the left-hand white area. To get the area of the small white region on the right, we need 1 - pnorm(1.7). Note that pnorm(-1.7) is the same value as 1 - pnorm(1.7). The area of the white region on the left is the same as the area of the white region on the right.

What we really want is the combined area of the two white regions. This is our p-value, the probability that a standard normal RV takes on a value more than 1.7 units from the mean.

Could we eliminate the need to stop and make a choice based on the sign of the z-score? Yes: we can take the absolute value of the z-score and use either of two expressions. Let’s try this out.

First give z a positive value.

z <- 1.7
2*(1-pnorm(abs(z))) # The first expression
## [1] 0.08913093
2*pnorm(-abs(z))    # The second expression
## [1] 0.08913093

Now give z a negative value.

z <- -1.7
2*(1-pnorm(abs(z)))
## [1] 0.08913093
2*pnorm(-abs(z))
## [1] 0.08913093

Note that either expression yields the same value in both cases.

Step 4 is to reflect on the p-value and decide whether it is so low that we should reject the null hypothesis in favor of the alternative. A common criterion is to reject the null hypothesis whenever the p-value is less than 5%. In this case, we say that we are using the 5% level of significance. Let’s try this a few times using a single draw from the standard normal distribution. We are testing the hypothesis that the population mean is 0, and we know that the population standard deviation is 1.

z <- rnorm(1)
p.value <- 2*pnorm(-abs(z))
p.value 
## [1] 0.54796

We know that the null hypothesis is true in this case because we trust the authors of the rnorm() function. However, if we repeat this experiment a large number of times, we will reject the null hypothesis about 5% of the time. This is called a Type 1 error.
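As a quick check of that 5% figure, we can repeat the experiment many times and compute the rejection rate. This small simulation is only an illustration; the number of repetitions is an arbitrary choice.

# Repeat the single-draw experiment many times with the null hypothesis true
# and see how often we (wrongly) reject at the 5% level.
n.reps <- 10000
z <- rnorm(n.reps)               # H0 is true: the mean really is 0
p.values <- 2*pnorm(-abs(z))     # two-sided p-value for each draw
mean(p.values < .05)             # proportion of Type 1 errors, close to .05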

What would happen if the null hypothesis is in fact not true? Suppose that the value of z is not coming from a standard normal distribution, but from a distribution with a different mean and the same standard deviation of 1. We can try this by adding a constant to z or by calling rnorm() with a different mean. Let’s try this with a value of the mean close to 0.

z <- rnorm(1,mean=.01,sd=1)
p.value <- 2*pnorm(-abs(z))
p.value 
## [1] 0.1109055

Most of the time we fail to reject the null hypothesis even though it is false. This is called a Type 2 error. What we’ve seen is that if the true value of the mean is close to the hypothesized value, the probability of a Type 2 error is large. This is why we shouldn’t say that we “accept the null hypothesis.” Instead we should say that we “don’t have enough evidence to reject the null hypothesis” or “fail to reject the null hypothesis.”

However, if the true value is farther away from the hypothesized value, the probability of a Type 2 error is reduced. Let’s try an example.

z <- rnorm(1,mean=3,sd=1)
p.value <- 2*pnorm(-abs(z))
p.value 
## [1] 0.0009083926
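To put numbers on this, here is a small simulation sketch that estimates the probability of a Type 2 error at the 5% level for a true mean near 0 and a true mean of 3. The particular means and the number of repetitions are arbitrary choices.

# Estimate the probability of a Type 2 error (failing to reject a false H0)
# at the 5% level for a given true mean.
type2.rate <- function(true.mean, n.reps = 10000) {
  z <- rnorm(n.reps, mean = true.mean, sd = 1)
  p.values <- 2*pnorm(-abs(z))
  mean(p.values >= .05)          # proportion of times we fail to reject
}
type2.rate(.01)    # close to .95: Type 2 errors are very likely
type2.rate(3)      # much smaller: the test usually detects a mean of 3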

The General Case With a Known Standard Deviation

In this situation we claim to know the standard deviation, \(\sigma\), of the population. We have a sample of size \(n\) and a sample mean, \(\bar{x}\). We want to test the hypothesis that the true mean is a hypothesized value \(\mu\) against the two-sided alternative that the true mean is not \(\mu\). Under most conditions (discussed later), we can compute a z-score as a test statistic, which has a standard normal distribution.

\[z=\frac{\bar{x}-\mu}{\sigma_{\bar{x}}}\]

We can then obtain the p-value as we did above.

The value of \(\sigma_{\bar{x}}\) is computed as \(\frac{\sigma}{\sqrt{n}}\).

The following code snippet does the work.

# Replace the example values as necessary

xbar <- 135    # Sample mean
mu <- 134      # Hypothesized value of the mean
sigma <- 15    # Known population standard deviation
n <- 100       # Sample size
sided <- 2     # Specification of the alternative type

# Now do the work
sd.xbar <- sigma/sqrt(n)
z <- (xbar - mu)/sd.xbar
p.value <- sided * pnorm(-abs(z))

# Display the p-value.

p.value
## [1] 0.5049851

Use this code to solve the following problems.

  1. A sample of size 200 yields a mean of 11.2. This is taken from a population with a known standard deviation of .4. Test the null hypothesis that the true mean value is 11 against the alternative that it is not 11.

  2. A sample of size 1000 yields a mean of 25.6. The known population standard deviation is 1.2. Test the null hypothesis that the true mean is 25.2 against the alternative that the true mean is greater than 25.2.

The General Case with an Estimated Standard Deviation

In the case where we don’t know the population standard deviation, we will have to estimate it from the sample we have. The computation is almost identical to the earlier case, but instead of a z-score, we call what we get a t-statistic. Then instead of a standard normal distribution, we have something with a t distribution. The t distribution is very similar to the standard normal, but it requires that we specify the “degrees of freedom.” In this case, we use the formula \(df = n - 1\). When the sample size \(n\) is large, the difference between the t distribution and the standard normal distribution disappears. Following the convention from earlier, when we have an estimated standard deviation, we refer to it as \(S\) rather than \(\sigma\).

The following code incorporates these changes.

# Replace the example values as necessary

xbar <- 135    # Sample mean
mu <- 134      # Hypothesized value of the mean
s <- 15        # Sample standard deviation (estimated from the sample)
n <- 100       # Sample size
sided <- 2     # Specification of the alternative type

# Now do the work
sd.xbar <- s/sqrt(n)
t <- (xbar - mu)/sd.xbar
p.value <- sided * pt(-abs(t),df=n-1)

# Display the p-value
p.value
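To see the large-sample claim in action, we can compare the t-based and normal-based p-values for the same test statistic at a small and a large sample size. The statistic value and sample sizes below are just illustrative.

# Compare t-based and normal-based p-values for the same statistic.
stat <- 1.7
2*pt(-abs(stat), df = 10 - 1)      # small sample: noticeably larger p-value
2*pt(-abs(stat), df = 1000 - 1)    # large sample: very close to the normal result
2*pnorm(-abs(stat))                # normal-based p-value for comparison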

Hypothesis Testing for a Single Proportion

Here is the scenario. We have a sample of size \(n\) from a large population, and we have estimated the proportion of cases in the sample that meet some criterion. This estimated proportion is denoted \(\hat{p}\). We wish to test the null hypothesis that the true population proportion is a specific value denoted \(p_{0}\). Under certain conditions, the quantity \(z\) has a standard normal distribution. \(z\) is computed as:

\[z=\frac{\hat{p}-p_{0}}{\sqrt{\frac{p_{0}(1-p_{0})}{n}}}\]

What are the “certain conditions” which allow us to assume that \(z\) will have a standard normal distribution? There are two conditions.

\[n \, p_{0}\geq 10 \quad \text{and} \quad n \, (1-p_{0})\geq 10\]
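These conditions are easy to check in R before running the test; for instance, with the example values used in the snippet below:

# Check the two sample-size conditions for the example values used below.
n <- 100
p0 <- .5
n*p0 >= 10        # TRUE, so the first condition holds
n*(1-p0) >= 10    # TRUE, so the second condition holds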

The following code snippet constructs z and obtains the p-value.

# Here are the inputs which can be changed to reuse the snippet.
n <- 100       # Number of trials (sample size)
phat <- .6    # Proportion of sample cases meeting the definition
p0 <-  .5       # The value of p under the null hypothesis
sided <- 2    # Specification of the alternative  
 
# Construct z  
z <- (phat - p0)/sqrt( (p0*(1-p0) )/n )

# Compute and display the p-value
pvalue <- sided * pnorm(-abs(z))
pvalue
## [1] 0.04550026