Hw3

2025-11-17

Point Estimation: Mean, Variance, and MLE

Welcome to this presentation on point estimation, a core concept in statistical inference.

We will explore how to estimate population parameters using sample data.

What Is Point Estimation?

A point estimator provides a single numerical estimate of a population parameter.
Common parameters:
- Population mean (μ)
- Population variance (σ²)
Examples of point estimators:
- Sample mean (\(\bar{x}\))
- Sample variance (\(s^2\))
- Maximum likelihood estimators (MLEs)

Sample Mean

The sample mean is a point estimator of the population mean \(\mu\).

It is computed as: \[ \hat{\mu} = \bar{x} = \frac{1}{n} \sum_{i=1}^{n} x_i \]

Properties:

- Unbiased estimator of the population mean
- Minimum variance among all linear unbiased estimators when the observations are uncorrelated with a common mean and variance (a special case of the Gauss–Markov theorem)
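Both properties follow from a short calculation. As a sketch, assume the observations \(x_1, \ldots, x_n\) are independent with common mean \(\mu\) and finite variance \(\sigma^2\) (an assumption not stated on the slide):

\[ E[\bar{x}] = \frac{1}{n} \sum_{i=1}^{n} E[x_i] = \mu, \qquad \operatorname{Var}(\bar{x}) = \frac{1}{n^2} \sum_{i=1}^{n} \operatorname{Var}(x_i) = \frac{\sigma^2}{n} \]

The first identity is unbiasedness; the second shows that the estimator's spread shrinks at rate \(1/n\) as the sample grows.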

Sample Variance

The sample variance estimates the population variance \(\sigma^2\).

For a sample \(x_1, x_2, \ldots, x_n\): \[ \hat{\sigma}^2 = \frac{1}{n} \sum_{i=1}^{n} (x_i - \bar{x})^2 \]

Note:

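- The \(\frac{1}{n}\) form above is the MLE of the variance under a normal model; it is biased downward, with expectation \(\frac{n-1}{n}\sigma^2\).
- The unbiased estimator \(s^2\) uses \(\frac{1}{n-1}\) instead of \(\frac{1}{n}\); this is what R's var() computes.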
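The difference between the two divisors is easiest to see in a small simulation. The sketch below is illustrative only: the seed, the sample size of 10, the number of replications, and the helper name `var_mle` are assumptions chosen for the example, not part of the original analysis.

```r
# Simulation sketch: compare the 1/n (MLE) and 1/(n-1) (unbiased) variance estimators.
set.seed(42)           # arbitrary seed, used only for reproducibility
n_obs    <- 10         # a small n makes the bias easy to see
n_reps   <- 20000      # number of simulated samples
true_var <- 100        # sd = 10, matching the other examples

# Hypothetical helper: variance estimator that divides by n
var_mle <- function(x) mean((x - mean(x))^2)

# Each column holds both estimates for one simulated sample
sims <- replicate(n_reps, {
  x <- rnorm(n_obs, mean = 50, sd = sqrt(true_var))
  c(mle = var_mle(x), unbiased = var(x))   # var() divides by n - 1
})

avg <- rowMeans(sims)                      # average estimate over all samples
print(avg)
print(true_var * (n_obs - 1) / n_obs)      # theoretical mean of the 1/n estimator
```

Under these assumptions the average of the \(\frac{1}{n}\) estimates should sit near \(\frac{n-1}{n}\sigma^2\), while the average of the var() estimates should sit near \(\sigma^2\).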

R Code: Generating Data

set.seed(123)  # fix the RNG seed so the results are reproducible
data <- rnorm(100, mean = 50, sd = 10)  # 100 draws from N(50, 10^2)

R Code: Computing Sample Mean

mean(data)

## [1] 50.90406

R Code: Computing Sample Variance

var(data)  # unbiased estimator: divides by n - 1

## [1] 83.32328

R Code: Maximum Likelihood Estimation

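library(MASS)  # provides fitdistr()
# For "normal", the reported sd is the MLE (1/n divisor), not sd(data)
mle_fit <- fitdistr(data, "normal")
mle_fit  # estimates, with standard errors in parentheses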

##       mean          sd    
##   50.9040591    9.0824033 
##  ( 0.9082403) ( 0.6422229)
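For the normal model the MLEs have closed forms, so the fitted values can be checked by hand. The sketch below regenerates the data with the same seed, computes the estimates and their asymptotic standard errors (\(\hat{\sigma}/\sqrt{n}\) and \(\hat{\sigma}/\sqrt{2n}\)), and then maximizes the log-likelihood numerically with optim() as a cross-check. The names `neg_loglik` and `mle_numeric` and the log-sd parameterization are choices made for this illustration.

```r
# Closed-form normal MLEs, computed without MASS.
set.seed(123)
x <- rnorm(100, mean = 50, sd = 10)   # same call as the data-generation step
n <- length(x)

mu_hat    <- mean(x)                      # MLE of the mean
sigma_hat <- sqrt(mean((x - mu_hat)^2))   # MLE of the sd (divides by n)

# Asymptotic standard errors for the normal model
se_mu    <- sigma_hat / sqrt(n)
se_sigma <- sigma_hat / sqrt(2 * n)

# Numerical cross-check: minimize the negative log-likelihood directly
neg_loglik <- function(par) {
  # par[2] is log(sd), so the optimizer can never propose a negative sd
  -sum(dnorm(x, mean = par[1], sd = exp(par[2]), log = TRUE))
}
fit <- optim(c(mean(x), log(sd(x))), neg_loglik)   # default Nelder-Mead
mle_numeric <- c(mean = fit$par[1], sd = exp(fit$par[2]))

print(c(mean = mu_hat, sd = sigma_hat))
print(c(se_mean = se_mu, se_sd = se_sigma))
print(mle_numeric)
```

The closed-form values should agree with the fitdistr() output above, and the optim() result should match them up to the optimizer's tolerance.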

Histogram with Estimated Mean

library(ggplot2)
sample_mean <- mean(data)
ggplot(data.frame(x = data), aes(x = x)) +
  geom_histogram(binwidth = 5, fill = "beige", color = "black") +
  geom_vline(xintercept = sample_mean,
             color = "tan", linewidth = 1) +
  labs(title = "Histogram of Sample Data", 
       x = "Value", y = "Count")

Sampling Distribution: Code

set.seed(123)
sample_means <- replicate(1000, 
                          mean(rnorm(50, mean = 50, sd = 10)))
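A quick numerical check ties the simulation back to theory: the standard deviation of the simulated sample means should be close to \(\sigma/\sqrt{n} = 10/\sqrt{50}\). The sketch below repeats the simulation with the same settings; the variable names `n_obs`, `sigma`, `empirical_se`, and `theoretical_se` are introduced here for illustration.

```r
# Compare the spread of simulated sample means with the theoretical standard error.
set.seed(123)
n_obs <- 50    # observations per sample
sigma <- 10    # population standard deviation

sample_means <- replicate(1000, mean(rnorm(n_obs, mean = 50, sd = sigma)))

empirical_se   <- sd(sample_means)     # spread observed in the simulation
theoretical_se <- sigma / sqrt(n_obs)  # sigma / sqrt(n)

print(c(mean_of_means  = mean(sample_means),
        empirical_se   = empirical_se,
        theoretical_se = theoretical_se))
```

The mean of the simulated means should also land near the population mean of 50, which is the unbiasedness property seen empirically.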

Sampling Distribution: Plot

library(ggplot2)
ggplot(data.frame(mean = sample_means), aes(x = mean)) +
  geom_histogram(binwidth = 1, fill = "beige", color = "tan") +
  labs(title = "Sampling Distribution of the Sample Mean",
       x = "Sample Mean", y = "Frequency")

Interactive Plotly Plot

library(plotly)

plot_ly(x = data, type = "histogram") %>%
  layout(
    title = "Interactive Histogram of Sample Data",
    xaxis = list(title = "Value"),
    yaxis = list(title = "Count")
  )

Summary

- Point estimation provides single-number estimates of population parameters.
- The sample mean \(\bar{x}\) is an unbiased estimator of the population mean.
- The \(\frac{1}{n}\) variance estimator \(\hat{\sigma}^2\) is the MLE of the variance under a normal model; the unbiased sample variance \(s^2\) uses \(\frac{1}{n-1}\).
- Maximum Likelihood Estimation (MLE) chooses the parameter values that maximize the likelihood of the observed sample data.
- Visualizations (histograms and sampling distributions) help us understand estimator behavior.