What is Point Estimation?

Point estimation is the process of computing a single value from sample data to estimate an unknown population parameter.

Examples:

  • using a sample mean to estimate a population mean
  • using a sample proportion to estimate a population proportion
  • using a sample variance to estimate a population variance
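
As a rough illustration, each of the three estimators above is a one-line computation in R. The vector x and the threshold of 70 below are made-up values for demonstration only; they are not data from this presentation.

```r
# Hypothetical sample of eight measurements (illustrative values only)
x <- c(62, 71, 68, 75, 80, 66, 73, 77)

# Sample mean: estimates the population mean
x_bar <- mean(x)

# Sample proportion: share of values above a hypothetical threshold of 70,
# which estimates the population proportion above that threshold
p_hat <- mean(x > 70)

# Sample variance (n - 1 denominator): estimates the population variance
s2 <- var(x)

c(mean = x_bar, proportion = p_hat, variance = s2)
```

Each result is a single number, which is what makes it a point estimate rather than an interval.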

In practice, point estimation is often the first step before confidence intervals or hypothesis testing.

Population Parameter vs Sample Statistic

A population parameter is a fixed but usually unknown value.

Examples:

  • population mean: \(\mu\)
  • population variance: \(\sigma^2\)
  • population proportion: \(p\)

A sample statistic is computed from data and used to estimate the parameter.

For example, the sample mean \(\bar{x}\) estimates the population mean \(\mu\).

Mathematical Definition

For a sample \(x_1, x_2, \dots, x_n\), the sample mean is

\[\bar{x} = \frac{1}{n}\sum_{i=1}^{n} x_i\]

This is a point estimator for the population mean:

\[\hat{\mu} = \bar{x}\]

As a rule applied to any sample, \(\bar{x}\) is a point estimator; the number it produces for one particular sample is a point estimate, a single best guess for the parameter.

Properties of a Good Estimator

A good point estimator should have useful statistical properties.

Unbiasedness

\[E(\hat{\theta}) = \theta\]

This means that, averaged over repeated samples, the estimator equals the true parameter. Any individual estimate can still be too high or too low.
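
A small simulation can make this idea concrete. The sketch below assumes a normal population with a known mean and standard deviation (the values of mu, sigma, n, and reps are arbitrary choices for illustration), draws many samples, and averages the resulting sample means.

```r
set.seed(1)  # fixed seed so the simulation is reproducible

mu <- 50      # assumed true population mean
sigma <- 10   # assumed true population standard deviation
n <- 25       # size of each sample
reps <- 5000  # number of repeated samples

# Draw `reps` samples of size n and record the sample mean of each
sample_means <- replicate(reps, mean(rnorm(n, mean = mu, sd = sigma)))

# The average of the estimates should sit close to mu,
# even though individual estimates vary around it
avg_estimate <- mean(sample_means)
avg_estimate
range(sample_means)
```

The range shows that single estimates miss the target, while their long-run average stays near the assumed true mean.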

Consistency

\[\hat{\theta} \to \theta \text{ as } n \to \infty\]

This means the estimator converges in probability to the true parameter: as the sample size increases, estimates far from \(\theta\) become increasingly unlikely.
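
One way to see consistency is to compare how much the sample mean varies at different sample sizes. The sketch below again assumes a normal population with arbitrary illustrative values for mu and sigma, and measures the spread of the sample mean across repeated samples.

```r
set.seed(2)  # fixed seed so the simulation is reproducible

mu <- 50                   # assumed true population mean
sigma <- 10                # assumed true population standard deviation
sizes <- c(10, 100, 1000)  # sample sizes to compare
reps <- 500                # repeated samples per sample size

# For each sample size, compute the standard deviation of the
# sample mean across `reps` repeated samples
spread <- sapply(sizes, function(n) {
  sd(replicate(reps, mean(rnorm(n, mean = mu, sd = sigma))))
})
names(spread) <- sizes
spread
```

The spread shrinks as n grows, so the estimates concentrate around the assumed true mean.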

Example Dataset: airquality

For this presentation, we use the built-in airquality dataset in R.

It contains daily air quality measurements in New York from May to September 1973.

Important variables:

  • Temp: maximum daily temperature in degrees Fahrenheit
  • Ozone: mean ozone concentration in parts per billion
  • Wind: average wind speed in miles per hour
  • Month: month of observation

We will use this dataset to estimate the average daily temperature.

head(airquality)
##   Ozone Solar.R Wind Temp Month Day
## 1    41     190  7.4   67     5   1
## 2    36     118  8.0   72     5   2
## 3    12     149 12.6   74     5   3
## 4    18     313 11.5   62     5   4
## 5    NA      NA 14.3   56     5   5
## 6    28      NA 14.9   66     5   6

Point Estimate of Average Temperature

We estimate the population mean temperature using the sample mean of Temp.

mean_temp <- mean(airquality$Temp, na.rm = TRUE)
mean_temp
## [1] 77.88235

The sample mean, about 77.9 degrees Fahrenheit, is the point estimate of the mean daily temperature for the population that these 153 observed days are taken to represent.

Why Missing Values Matter

Some variables in airquality contain missing values, so we must handle them carefully.

For example, mean() returns NA if any value is missing, so to get the mean ozone level we pass na.rm = TRUE, which drops the missing values before averaging.

mean_ozone <- mean(airquality$Ozone, na.rm = TRUE)
mean_ozone
## [1] 42.12931

This is also a point estimate: it estimates the average ozone level from the observed sample data.
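
The effect of the missing values can be checked directly. The short sketch below counts the missing ozone readings and shows what mean() returns when they are not removed; the variable names are chosen here for illustration.

```r
# Number of missing ozone readings
n_missing <- sum(is.na(airquality$Ozone))

# Without na.rm = TRUE, a single NA makes the result NA
mean_with_na <- mean(airquality$Ozone)

# Number of observations the point estimate is actually based on
n_used <- sum(!is.na(airquality$Ozone))

c(missing = n_missing, used = n_used)
is.na(mean_with_na)
```

The ozone estimate therefore rests on fewer days than the temperature estimate, which is worth reporting alongside it.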

Distribution of Temperature (ggplot)

This histogram shows how daily temperatures are distributed.
The point estimate of the mean summarizes the center of this distribution.
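
A histogram of this kind could be drawn roughly as follows. This is a sketch, not necessarily the code behind the original slide: it assumes the ggplot2 package is installed, and the bin width, colors, and axis labels are choices made here.

```r
library(ggplot2)  # assumed to be installed

mean_temp <- mean(airquality$Temp)

# Histogram of daily temperature with the point estimate marked
# by a dashed vertical line
p_hist <- ggplot(airquality, aes(x = Temp)) +
  geom_histogram(binwidth = 5, fill = "steelblue", color = "white") +
  geom_vline(xintercept = mean_temp, linetype = "dashed") +
  labs(x = "Temperature (degrees Fahrenheit)", y = "Number of days")
p_hist
```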

Temperature by Month (ggplot)

This boxplot shows that temperature varies across months even though a single mean estimate summarizes the overall dataset.
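
The same idea can be expressed numerically by computing one point estimate per month. The sketch below uses base R only; the variable names are chosen here for illustration.

```r
# Point estimate of mean temperature within each month
# (5 = May, ..., 9 = September)
monthly_means <- tapply(airquality$Temp, airquality$Month, mean)
monthly_means

# Number of observed days in each month
days <- as.numeric(table(airquality$Month))

# The overall mean is the day-weighted average of the monthly means
overall <- sum(as.numeric(monthly_means) * days) / sum(days)
overall
```

Comparing monthly_means with overall shows how much month-to-month variation a single overall estimate hides.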

Interactive Plot (plotly)

This interactive plot helps visualize relationships between variables while point estimates summarize each variable individually.

R Code Example

Below is a simple piece of R code that computes a point estimate.

# Point estimate for mean temperature
mean(airquality$Temp, na.rm = TRUE)
## [1] 77.88235

This code calculates the sample mean, which serves as the point estimate for the population mean temperature.

Conclusion

Point estimation is one of the most fundamental ideas in statistics.

Main takeaways:

  • a point estimate gives one numerical estimate of an unknown parameter
  • sample statistics such as the mean are commonly used point estimators
  • good estimators should be unbiased and consistent
  • point estimation is simple and widely used in real-world data analysis

Thank you.