Foundations of Statistical Inference

Harriet Goers

Why We Need Statistical Inference

  • Suppose women in our sample give Democrats an average rating of 54 and men give 49.
  • Is the 5-point gap real or just sampling noise?
  • Inference tells us: how confident can we be that our result reflects the population?

Key Concepts

  • Population: All cases (e.g., all U.S. voters)
  • Parameter: True population value (unknown)
  • Sample: Subset we actually study
  • Statistic: Estimate of the parameter based on the sample

Why Random Sampling Matters

Random sampling gives every member of the population an equal chance of selection, so the sample does not systematically over- or under-represent any group.

Random Sampling Error

  • Even with good samples, estimates vary.
  • This variation is random sampling error, not a mistake.
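
To make the idea concrete, here is a minimal simulation sketch. The population of ratings, its mean of about 52, and the sample size of 500 are all invented for illustration. It draws several random samples from the same population and shows that each sample mean lands near, but not exactly on, the population mean.

```python
import random
import statistics

random.seed(42)  # fixed seed so the sketch is reproducible

# Hypothetical population: 100,000 ratings on a 0-100 scale (assumed values).
population = [min(100, max(0, random.gauss(52, 20))) for _ in range(100_000)]
parameter = statistics.mean(population)  # the true population mean

# Draw five independent random samples of 500 and record each sample mean.
sample_means = [
    statistics.mean(random.sample(population, 500)) for _ in range(5)
]

print(f"Population mean (parameter): {parameter:.2f}")
for i, m in enumerate(sample_means, start=1):
    # The gap between statistic and parameter is random sampling error.
    print(f"Sample {i} mean (statistic): {m:.2f}  error: {m - parameter:+.2f}")
```

No sample was drawn badly here; the differences come purely from which 500 cases happened to be selected.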

The Standard Error

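  • Measures how much a statistic is expected to vary from sample to sample due to random sampling error.
  • For a sample mean: SE = s / √n (sample standard deviation divided by the square root of the sample size).
  • Decreases as sample size increases.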
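
As a rough sketch, the standard error of a sample mean can be estimated as the sample standard deviation divided by the square root of n. The helper name `standard_error` and the ten ratings below are hypothetical, chosen only to show the arithmetic.

```python
import math
import statistics

def standard_error(values):
    """Estimated standard error of the sample mean: s / sqrt(n)."""
    n = len(values)
    s = statistics.stdev(values)  # sample standard deviation (n - 1 denominator)
    return s / math.sqrt(n)

ratings = [40, 55, 60, 45, 70, 50, 65, 35, 58, 52]  # hypothetical ratings
se = standard_error(ratings)
print(f"mean = {statistics.mean(ratings):.1f}, SE = {se:.2f}")
```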

Central Limit Theorem

  • The sampling distribution of the sample mean is approximately normal (a bell curve).
  • Holds regardless of the population's shape, provided n is large enough.
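
One way to see this is by simulation. The sketch below assumes a strongly right-skewed population (exponential with mean 1) and a sample size of 50, both picked arbitrarily. It then checks whether the resulting sample means behave roughly like a bell curve.

```python
import random
import statistics

random.seed(1)  # fixed seed so the sketch is reproducible

def sample_mean(n):
    # Exponential population with mean 1: far from bell-shaped.
    return statistics.mean([random.expovariate(1.0) for _ in range(n)])

# Simulate the sampling distribution: 5,000 sample means, each from n = 50.
means = [sample_mean(50) for _ in range(5000)]
center = statistics.mean(means)
spread = statistics.stdev(means)

# For a normal curve, about 68% and 95% of values fall within 1 and 2 SDs.
within_1 = sum(abs(m - center) <= spread for m in means) / len(means)
within_2 = sum(abs(m - center) <= 2 * spread for m in means) / len(means)

print(f"center = {center:.3f}, spread = {spread:.3f}")
print(f"within 1 SD: {within_1:.1%}, within 2 SDs: {within_2:.1%}")
```

Although individual draws are skewed, the shares within 1 and 2 standard deviations should come out close to what a normal curve predicts.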

Z Scores and the Normal Curve

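  • Z score = number of SEs a value lies from the mean: Z = (value - mean) / SE
  • Lets us apply the empirical rule for the normal curve:
    • About 68% of estimates fall within 1 SE
    • About 95% fall within 2 SEs (more precisely, 1.96)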
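
The percentages in the empirical rule can be checked against the standard normal curve. This is a small sketch using Python's `statistics.NormalDist`; the helper names `z_score` and `share_within`, and the example numbers (estimate 54, center 50, SE 2), are made up for illustration.

```python
from statistics import NormalDist

def z_score(value, mean, se):
    """How many standard errors `value` lies from `mean`."""
    return (value - mean) / se

std_normal = NormalDist()  # mean 0, standard deviation 1

def share_within(k):
    """Share of a normal distribution lying within k SEs of its mean."""
    return std_normal.cdf(k) - std_normal.cdf(-k)

z = z_score(54, 50, 2)  # hypothetical estimate, center, and SE
print(f"Z = {z:.1f}")
for k in (1, 1.96, 2):
    print(f"within {k} SEs: {share_within(k):.1%}")
```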

Confidence Intervals

  • 95% CI = sample statistic ± 1.96 × SE
  • Gives a range of plausible values for the population parameter: about 95% of intervals built this way contain the true value
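
The formula above can be sketched in a few lines. The function name `confidence_interval` and the simulated sample of 400 ratings are assumptions for illustration. The multiplier comes from the normal curve, so this version suits large samples only.

```python
import math
import random
import statistics
from statistics import NormalDist

def confidence_interval(values, level=0.95):
    """Normal-based CI for a mean: statistic +/- z * SE (large samples)."""
    mean = statistics.mean(values)
    se = statistics.stdev(values) / math.sqrt(len(values))
    z = NormalDist().inv_cdf(0.5 + level / 2)  # about 1.96 when level = 0.95
    return mean - z * se, mean + z * se

random.seed(7)
ratings = [random.gauss(54, 20) for _ in range(400)]  # hypothetical sample

low, high = confidence_interval(ratings)
print(f"mean = {statistics.mean(ratings):.2f}, 95% CI = ({low:.2f}, {high:.2f})")
```

Raising `level` to 0.99 widens the interval: more confidence costs precision.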

Sample Size and Diminishing Returns

  • More data = less error, but the benefits shrink: SE falls with √n, so halving the error takes four times the sample
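
A quick calculation shows how the returns diminish. The population standard deviation of 20 and the sample sizes are assumed values; the point is only that each fourfold increase in n cuts the standard error in half.

```python
import math

SIGMA = 20.0  # assumed population standard deviation (hypothetical)

def se_for(n, sigma=SIGMA):
    """Standard error of a mean for sample size n: sigma / sqrt(n)."""
    return sigma / math.sqrt(n)

sizes = [100, 400, 1600, 6400]
for n in sizes:
    print(f"n = {n:>5}: SE = {se_for(n):.2f}")
```

Going from 100 to 400 cases removes a full point of error; going from 1,600 to 6,400 removes only a quarter of a point.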

The t-Distribution for Small Samples

  • For small samples, use the t-distribution instead of the normal
  • Thicker tails than the normal, so CIs are wider; the gap closes as n grows
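
To see how much wider the intervals get, compare the 95% multiplier from the t-distribution with the normal's 1.96. The sketch below uses only the standard library: it integrates the t density numerically and finds the critical value by bisection. The function names are hypothetical, and in practice a statistics library (for example `scipy.stats.t.ppf`) would normally do this job.

```python
import math
from statistics import NormalDist

def t_pdf(x, df):
    """Density of Student's t-distribution with df degrees of freedom."""
    coef = math.gamma((df + 1) / 2) / (
        math.sqrt(df * math.pi) * math.gamma(df / 2)
    )
    return coef * (1 + x * x / df) ** (-(df + 1) / 2)

def t_cdf(x, df, steps=2000):
    """CDF for x >= 0: Simpson's rule on [0, x], plus 0.5 by symmetry."""
    h = x / steps
    total = t_pdf(0, df) + t_pdf(x, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    return 0.5 + total * h / 3

def t_critical(df, level=0.95):
    """Two-sided critical value, found by bisection on the CDF."""
    target = 0.5 + level / 2
    lo, hi = 0.0, 100.0
    for _ in range(60):
        mid = (lo + hi) / 2
        if t_cdf(mid, df) < target:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

z_crit = NormalDist().inv_cdf(0.975)
for n in (5, 10, 30):
    df = n - 1  # degrees of freedom for a single sample mean
    print(f"n = {n:>2}: t multiplier = {t_critical(df):.3f}")
print(f"normal:  z multiplier = {z_crit:.3f}")
```

With only 5 cases the multiplier is well above 1.96, so the interval is noticeably wider; by n = 30 the two are already close.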

Wrap-Up

  • Random sampling lets us infer from sample to population
  • Standard error and confidence intervals express uncertainty
  • Larger samples and less variation = more precise estimates
  • Use t-distribution when sample sizes are small