Introduction

Interval estimation is a fundamental technique in statistics for estimating population parameters. Rather than a single “best guess,” it provides a range (an interval) of plausible values for the parameter.

Confidence Levels

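A 95% confidence level describes the long-run behavior of the procedure: if we repeated the sampling process many times and built an interval from each sample, we would expect about 95% of those intervals (roughly 95 out of every 100) to contain the true population mean. Any single computed interval either contains the true mean or it does not.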
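The repeated-sampling interpretation can be checked by simulation. The sketch below is illustrative only: the population (normal with mean 50 and standard deviation 10), the sample size, the number of repetitions, and the seed are all assumptions chosen for the example.

```r
# Simulate repeated sampling to estimate the coverage of a 95% t-interval.
# Assumed for illustration: normal population, mu = 50, sigma = 10, n = 30.
set.seed(1)
true_mean <- 50
n <- 30
n_reps <- 10000

covers <- replicate(n_reps, {
  x <- rnorm(n, mean = true_mean, sd = 10)
  me <- qt(0.975, df = n - 1) * sd(x) / sqrt(n)
  # TRUE when this sample's interval contains the true mean
  (mean(x) - me <= true_mean) && (true_mean <= mean(x) + me)
})

coverage <- mean(covers)
cat("Proportion of intervals containing the true mean:", coverage, "\n")
```

The proportion should land close to 0.95, although it will not equal 0.95 exactly in any finite simulation.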

Formula (standard deviation is unknown)

One common approach is constructing a confidence interval (CI) for the population mean \(\mu\).

When the population standard deviation is unknown, the interval is built from the t-distribution:

\[ \bar{x} \pm t_{\frac{\alpha}{2}, n-1} \cdot \frac{s}{\sqrt{n}} \]

where:

  • \(\bar{x}\): sample mean
  • \(s\): sample standard deviation
  • \(n\): sample size
  • \(t_{\frac{\alpha}{2}, n-1}\): critical value from the t-distribution with \(n-1\) degrees of freedom, where \(\alpha\) is one minus the confidence level
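The formula translates almost term by term into R. The following is a minimal sketch, not a definitive implementation: `ci_mean` is a hypothetical helper name, and the eight data values are made up purely to show the call.

```r
# Hypothetical helper (not from the original text): t-based CI for a mean.
ci_mean <- function(x, level = 0.95) {
  n <- length(x)
  alpha <- 1 - level
  t_crit <- qt(1 - alpha / 2, df = n - 1)   # upper alpha/2 critical value
  me <- t_crit * sd(x) / sqrt(n)            # margin of error
  c(lower = mean(x) - me, upper = mean(x) + me)
}

# Small made-up sample, used only to show the call
x <- c(48.2, 51.7, 49.9, 53.1, 47.4, 50.6, 52.3, 49.0)
ci_90 <- ci_mean(x, level = 0.90)
ci_95 <- ci_mean(x, level = 0.95)
ci_99 <- ci_mean(x, level = 0.99)
print(rbind(ci_90, ci_95, ci_99))
```

For the same sample, a higher confidence level uses a larger critical value and therefore gives a wider interval.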

Confidence Interval Notation

A 95% confidence interval for the population mean \(\mu\) (so \(\alpha = 0.05\)) is generally written as:

\[ (\bar{x} - \mathrm{ME}, \ \bar{x} + \mathrm{ME}), \] where

\[ \mathrm{ME} = t_{\frac{\alpha}{2},\,n-1} \cdot \frac{s}{\sqrt{n}} \]

is the margin of error.
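As a worked illustration with assumed round numbers (not taken from any data in this text), suppose \(\bar{x} = 50\), \(s = 10\), and \(n = 100\). For a 95% interval, \(\alpha = 0.05\) and \(t_{0.025,\,99} \approx 1.984\), so

\[ \mathrm{ME} \approx 1.984 \cdot \frac{10}{\sqrt{100}} = 1.984, \]

which would give the interval \((50 - 1.984,\ 50 + 1.984) = (48.016,\ 51.984)\).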

Code Snippet

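# Compute a 95% confidence interval for the mean of simulated data
n <- 100
some_data <- rnorm(n, mean = 50, sd = 10)  # random draw; no seed is set, so values vary per run
sample_mean <- mean(some_data)
s <- sd(some_data)

# Margin of error: t critical value (df = n - 1) times the standard error
error_95 <- qt(0.975, df = n - 1) * (s / sqrt(n))
lower_95 <- sample_mean - error_95
upper_95 <- sample_mean + error_95

cat("Sample mean:", sample_mean, "\n")
cat("Approx 95% Margin of Error:", error_95, "\n")
cat("95% CI: (", lower_95, ",", upper_95, ")\n")
## Sample mean: 49.91173
## Approx 95% Margin of Error: 1.907439
## 95% CI: ( 48.00429 , 51.81916 )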
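One way to sanity-check a hand-computed interval is to compare it with base R's `t.test()`, which reports a t-based confidence interval in its `conf.int` component. The sketch below regenerates its own data so that it is self-contained; the seed is an assumption added here, so its numbers will not match the output shown above.

```r
# Cross-check the manual interval against base R's t.test().
# The seed and simulated data are assumptions for this illustration.
set.seed(42)
n <- 100
some_data <- rnorm(n, mean = 50, sd = 10)

error_95 <- qt(0.975, df = n - 1) * sd(some_data) / sqrt(n)
manual_ci <- c(mean(some_data) - error_95, mean(some_data) + error_95)

# t.test() returns the confidence interval in its conf.int component
builtin_ci <- as.numeric(t.test(some_data, conf.level = 0.95)$conf.int)

print(manual_ci)
print(builtin_ci)
```

The two printed intervals should agree up to floating-point rounding, since both apply the same formula.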

ggplot Visualization #1 (Histogram)

3D Plotly Visualization

ggplot Visualization #2 (Line Plot)

Conclusion

  • Interval estimation helps you quantify uncertainty around point estimates.

  • Confidence intervals are widely used in science, engineering, and business to make informed decisions.

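  • Larger sample sizes reduce the margin of error, making confidence intervals narrower: with everything else held equal, the margin of error shrinks roughly in proportion to \(1/\sqrt{n}\).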
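The effect of sample size on the margin of error can be seen directly from the formula. In the sketch below, the sample standard deviation is held at an assumed value of 10 and the sample sizes are arbitrary illustrative choices.

```r
# Margin of error for several sample sizes, holding s fixed.
# s = 10 and the sample sizes are assumed values for illustration.
s <- 10
sample_sizes <- c(25, 100, 400, 1600)
margin <- qt(0.975, df = sample_sizes - 1) * s / sqrt(sample_sizes)
print(data.frame(n = sample_sizes, margin_of_error = margin))
```

Each fourfold increase in \(n\) roughly halves the margin of error; the reduction is slightly more than half at small \(n\) because the t critical value also decreases as the degrees of freedom grow.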