Lecture 7: Hypothesis Testing and Confidence Intervals

2026-03-01

Agenda and Announcements

  • Today

      - Hypothesis Testing and Confidence Intervals
      - Top Hat Quiz
  • Next class

      - Lab 3 completion

  • Announcements

      - CASA Registration
      - No class next Monday or Tuesday (March 10) 
      - I will record a midterm review and post an online pre-midterm quiz **for points**

Hypothesis testing

What is a hypothesis

A falsifiable statement about what we believe will happen based on the theory we are trying to test.

Falsifiability

  • We start from the assumption that our theory is wrong
  • Our hypothesis is called the

alternative hypothesis or H1

  • because it is the alternative to our assumption of no relationship

null hypothesis or H0

Testing the null hypothesis

  • Statistical tests show the degree of certainty that we can reject the null hypothesis
  • They show the probability that the alternative hypothesis is due to random chance
  • When the probability, p, is below our pre-determined threshold, we reject the null hypothesis
  • If we reject the null hypothesis, the alternative hypothesis is true, right?

NO!

  • If we reject the null hypothesis, the alternative hypothesis is true, right?

NO!!!!

  • If we reject the null, we infer that the alternative hypothesis (our hypothesis) is approximately true within the probability we have chosen.
  • If we reject the null, we can say there is strong evidence that the alternative hypothesis (our hypothesis) is true
  • We often test multiple hypotheses

How do we get there?

How do we get from a sample with a correlation to talking about testing a hypothesis for a population?

  • Tying sample statistics to population parameters (LLN)
  • Probability distributions
  • Tying our data to the probability distributions (CLT)

Statistics to parameters

  • We rely on the Law of Large Numbers and use sample statistics
  • With sufficient sample size and proper math, our sample mean is an unbiased estimator of the population mean
  • With CLT, \(s^2\) is an unbiased and consistent estimator of \(\sigma^2\)
  • For the normal distribution \(s^2\) is also the Minimum Variance Unbiased Estimator for \(\sigma^2\)

Probability distributions

The 68-95-99.7 Rule

            + Allows us to estimate probability based on distance from the mean
            + Applies to normal distribution
            + Basis for the actual decision rules
            

The 68-95-99.7 Rule

68-95-99.7 rule

From Distribution to Confidence

  • We can now bridge the gap between probability distributions and our sample data.

  • CLT ties our sample means to the normal distribution

  • Sample stats are unbiased estimators of population parameters

  • They are only point estimates, so we construct a confidence interval

  • confidence interval a range of values around the point estimate that we are X% confident contains the true population parameter

Math and philosophy of the confidence interval

  • Math and philosophy detail: The true population parameter is a fixed but unknown number, while our calculated interval is random but known with certainty. Our X percent confidence reflects the long-run reliability of our method. If we repeated this exact sampling process 100 times, X of the resulting intervals would successfully capture the true population parameter.

Calculating the Interval

  • we need three pieces of information:

      - point estimate = sample statistics
      - standard error = the standard deviation of the sampling distribution
      - critical value derived from our desired confidence level - for normal distribution this is a z-score
      - For 95% confidence, the critical z is 1.96
  • The general formula for a confidence interval is:

\[CI = \bar{x} \pm z \left( \frac{s}{\sqrt{n}} \right)\]
## CI of the variance

\[\frac{(n-1)s^2}{\chi^2_{\alpha/2}} \le \sigma^2 \le \frac{(n-1)s^2}{\chi^2_{1-\alpha/2}}\] - This use the Chi Square distribution which we will discuss after midterm

CI of the standard deviation

\[\sqrt{\frac{(n-1)s^2}{\chi^2_{\alpha/2}}} \le \sigma \le \sqrt{\frac{(n-1)s^2}{\chi^2_{1-\alpha/2}}}\] - Also uses the Chi-square distribution

Why the Chi-Square not the normal?

  • When we square distances from the mean to get the variance, the distribution changes
  • By eliminating all negative numbers, the distribution becomes skewed and looks like this:

Distribution Comparison

Authorship, License, Credits

Creative Commons License